Object sweets is the raw dataset; objects sweets_table
and sweets_tally are processed versions which are easier to
analyze.
The father of a certain family brings home nine sweets of type
mm and nine sweets of type jb each day for 21 days to
his children, AMH, ZJH, and AGH.
The children share the sweets amongst themselves in such a way that
each child receives exactly 6 sweets.
Array sweets has dimension c(2,3,21): 2 types of
sweets, 3 children, and 21 days. Thus sweets[,,1] shows that on
the first day, AMH chose 0 sweets of type mm and 6
sweets of type jb; child ZJH chose 3 of each, and child
AGH chose 6 sweets of type mm and 0 sweets of type
jb.
Observe the constant marginal totals: the kids have the same overall
number of sweets each, and there are a fixed number of each kind of
sweet.
Array sweets_array has dimension c(2,3,37): 2
sweets, 3 children, and 37 possible ways of arranging a matrix with
the specified marginal totals. This can be produced by
allboards() of the aylmer package.
sweets_table is a dataframe with six columns, one for
each combination of child and sweet, and 37 rows, each row showing a
permissible arrangement. All possibilities are present. The six
entries of sweets[,,1] correspond to the six elements of
sweets_table[1,]; the column names are mnemonics.
sweets_tally shows how often each of the arrangements in
sweets_tally was observed (that is, it's a table of the 21
observations in sweets)
Source
The Hankin family
Examples
data(sweets)
# show correspondence between sweets_table and sweets_tally:
cbind(sweets_table, sweets_tally)
# Sum the data, by sweet and child and test:
fisher.test(apply(sweets,1:2,sum))
# Not significant!
# Now test for overdispersion.
# First set up the regressors:
jj1 <- apply(sweets_array,3,tcrossprod)
jj2 <- apply(sweets_array,3, crossprod)
dim(jj1) <- c(2,2,37)
dim(jj2) <- c(3,3,37)
theta_xy <- jj1[1,2,]
phi_ab <- jj2[1,2,]
phi_ac <- jj2[1,3,]
phi_bc <- jj2[2,3,]
# Now the offset:
Off <- apply(sweets_array,3,function(x){-sum(lfactorial(x))})
# Now the formula:
f <- formula(sweets_tally~ -1 + theta_xy + phi_ab + phi_ac + phi_bc)
# Now the Lindsey Poisson device:
out <- glm(formula=f, offset=Off, family=poisson)
summary(out)
# See how the residual deviance is comparable with the degrees of freedom
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(MM)
Loading required package: magic
Loading required package: abind
Loading required package: partitions
Loading required package: emulator
Loading required package: mvtnorm
Loading required package: Oarray
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/MM/sweets.Rd_%03d_medium.png", width=480, height=480)
> ### Name: sweets
> ### Title: Synthetic dataset due to Hankin
> ### Aliases: sweets sweets_tally sweets_table sweets_array
> ### Keywords: datasets
>
> ### ** Examples
>
> data(sweets)
>
> # show correspondence between sweets_table and sweets_tally:
> cbind(sweets_table, sweets_tally)
AMH_mm AMH_jb ZJH_mm ZJH_jb AGH_mm AGH_jb sweets_tally
1 0 6 3 3 6 0 1
2 0 6 4 2 5 1 0
3 0 6 5 1 4 2 0
4 0 6 6 0 3 3 2
5 1 5 2 4 6 0 1
6 1 5 3 3 5 1 0
7 1 5 4 2 4 2 0
8 1 5 5 1 3 3 1
9 1 5 6 0 2 4 0
10 2 4 1 5 6 0 1
11 2 4 2 4 5 1 0
12 2 4 3 3 4 2 0
13 2 4 4 2 3 3 0
14 2 4 5 1 2 4 0
15 2 4 6 0 1 5 0
16 3 3 0 6 6 0 3
17 3 3 1 5 5 1 0
18 3 3 2 4 4 2 0
19 3 3 3 3 3 3 0
20 3 3 4 2 2 4 0
21 3 3 5 1 1 5 0
22 3 3 6 0 0 6 4
23 4 2 0 6 5 1 0
24 4 2 1 5 4 2 0
25 4 2 2 4 3 3 0
26 4 2 3 3 2 4 0
27 4 2 4 2 1 5 0
28 4 2 5 1 0 6 0
29 5 1 0 6 4 2 0
30 5 1 1 5 3 3 1
31 5 1 2 4 2 4 1
32 5 1 3 3 1 5 0
33 5 1 4 2 0 6 0
34 6 0 0 6 3 3 3
35 6 0 1 5 2 4 0
36 6 0 2 4 1 5 0
37 6 0 3 3 0 6 3
>
> # Sum the data, by sweet and child and test:
> fisher.test(apply(sweets,1:2,sum))
Fisher's Exact Test for Count Data
data: apply(sweets, 1:2, sum)
p-value = 0.2408
alternative hypothesis: two.sided
> # Not significant!
>
>
>
>
> # Now test for overdispersion.
> # First set up the regressors:
>
> jj1 <- apply(sweets_array,3,tcrossprod)
> jj2 <- apply(sweets_array,3, crossprod)
> dim(jj1) <- c(2,2,37)
> dim(jj2) <- c(3,3,37)
>
> theta_xy <- jj1[1,2,]
> phi_ab <- jj2[1,2,]
> phi_ac <- jj2[1,3,]
> phi_bc <- jj2[2,3,]
>
> # Now the offset:
> Off <- apply(sweets_array,3,function(x){-sum(lfactorial(x))})
>
> # Now the formula:
> f <- formula(sweets_tally~ -1 + theta_xy + phi_ab + phi_ac + phi_bc)
>
> # Now the Lindsey Poisson device:
> out <- glm(formula=f, offset=Off, family=poisson)
>
> summary(out)
Call:
glm(formula = f, family = poisson, offset = Off)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.2322 -1.0071 -0.3001 0.1001 2.1276
Coefficients:
Estimate Std. Error z value Pr(>|z|)
theta_xy -1.54215 0.11374 -13.56 <2e-16 ***
phi_ab 0.87572 0.03835 22.83 <2e-16 ***
phi_ac 0.88047 0.03804 23.14 <2e-16 ***
phi_bc 0.85525 0.03988 21.44 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 666.028 on 37 degrees of freedom
Residual deviance: 27.152 on 33 degrees of freedom
AIC: 62.007
Number of Fisher Scoring iterations: 6
> # See how the residual deviance is comparable with the degrees of freedom
>
>
>
>
>
>
> dev.off()
null device
1
>