Last data update: 2014.03.03

gof.estimates

R Documentation

Wald, log-likelihood ratio and Pearson chi-square statistics for mipfp object

Description

This method computes three statistics to test whether the seed agrees with the target data. The statistics are Wilks' log-likelihood ratio statistic, the Wald statistic and the Pearson chi-square statistic.

The method also returns the associated degrees of freedom.

Usage

## S3 method for class 'mipfp'
gof.estimates(object, seed = NULL, target.data = NULL, 
              target.list = NULL, replace.zeros = 1e-10, ...)

Arguments

`object`	The object of class `mipfp` containing the estimates.
`seed`	The seed used to compute the estimates (optional). If not provided, the method tries to determine the `seed` automatically.
`target.data`	A list containing the data of the target margins (optional). Each component of the list is an array storing a margin. The list order must follow the one defined in `target.list`. The cells of the arrays must be non-negative (and can even be NA if `method = ipfp`). If not provided, the method tries to determine `target.data` automatically.
`target.list`	A list of the target margins provided in `target.data` (optional). Each component of the list is an array whose cells indicate which dimension the corresponding margin relates to. If not provided, the method tries to determine `target.list` automatically.
`replace.zeros`	If zero cells are found, they are replaced with this value.
`...`	Not used.

Details

The test is formally expressed as:

H0 : h(p.seed) = 0 vs H1 : h(p.seed) != 0

where p.seed is the vector of the seed probabilities and h(x) = t(A) * x - m with A and m being respectively the marginal matrix and the margins vector of the estimation problem.
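
To make the constraint function concrete, here is a minimal base-R sketch on a toy 2 x 2 table. The seed counts, the target proportions in `m` and the layout of `A` (one column per margin component, cells vectorized column-major) are assumptions chosen for illustration; they are not taken from the package internals.

```r
# Toy 2 x 2 seed, vectorized column-major: cells (1,1), (2,1), (1,2), (2,2)
seed   <- matrix(c(10, 20, 30, 40), nrow = 2)
p.seed <- as.vector(seed) / sum(seed)

# Assumed marginal matrix A: each column flags the cells that add up to
# one margin component (row 1, row 2, column 1, column 2)
A <- cbind(row1 = c(1, 0, 1, 0),
           row2 = c(0, 1, 0, 1),
           col1 = c(1, 1, 0, 0),
           col2 = c(0, 0, 1, 1))

# Hypothetical target margins, expressed as proportions
m <- c(row1 = 0.5, row2 = 0.5, col1 = 0.4, col2 = 0.6)

# h(x) = t(A) * x - m, as in the definition above
h <- function(x) as.vector(t(A) %*% x - m)

# Non-zero entries show where the seed margins depart from the targets
h.seed <- h(p.seed)
print(h.seed)
```

Under H0 every component of `h(p.seed)` would be zero, meaning the seed already reproduces the target margins.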

The three statistics are then defined as:

Wilks' log-likelihood ratio

G2 = 2 * ∑ ( x.seed * ln (p.seed / p.hat) )

Wald's statistic

W2 = t(h(x.seed)) * inv(t(H) * diag(x.seed) * H) * h(x.seed)

Pearson Chi-square

X2 = t(x.seed - n * p.hat) * inv(diag(n * p.hat)) * (x.seed - n * p.hat)

where x.seed is the vectorization of the seed, n = sum(x.seed), diag(v) is the diagonal matrix whose diagonal is the vector v, and H denotes the Jacobian of the function h(x) evaluated at p.hat (the vector of the estimated probabilities).
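
As an illustration of the G2 and X2 formulas, the following base-R sketch evaluates them by hand. The vectors `x.seed` and `p.hat` are made-up values, not output of a fitting routine, and the code is not the package's implementation. The Wald statistic is left out because it needs the Jacobian H of a specific estimation problem.

```r
# Hypothetical vectorized seed counts and fitted probabilities
x.seed <- c(10, 20, 30, 40)
p.hat  <- c(0.2, 0.2, 0.3, 0.3)

n      <- sum(x.seed)
p.seed <- x.seed / n

# Wilks' log-likelihood ratio: 2 * sum(x.seed * ln(p.seed / p.hat))
G2 <- 2 * sum(x.seed * log(p.seed / p.hat))

# Pearson chi-square in the matrix form given above
expected <- n * p.hat
X2 <- as.numeric(t(x.seed - expected) %*% solve(diag(expected)) %*%
                   (x.seed - expected))

# Equivalent element-wise form: sum((observed - expected)^2 / expected)
X2.elementwise <- sum((x.seed - expected)^2 / expected)

print(c(G2 = G2, X2 = X2))
```

A zero in `x.seed` would make `log(p.seed / p.hat)` infinite, which is presumably why `replace.zeros` substitutes a small positive value for such cells.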

The degrees of freedom for these statistics correspond to the number of components in m.

Value

A list whose elements are detailed below.

`G2`	The log-likelihood ratio statistic.
`W2`	The Wald statistic.
`X2`	The Pearson chi-squared statistic.
`stats.df`	The degrees of freedom for the `G2`, `W2` and `X2` statistics.
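
The returned elements can be turned into p-values by hand. The sketch below assumes the usual asymptotic chi-square reference distribution with `stats.df` degrees of freedom; the list `gof` is a hypothetical stand-in with made-up numbers that only mimics the shape of the returned value.

```r
# Hypothetical stand-in for the returned list (illustrative values only)
gof <- list(G2 = 9.15, W2 = 8.90, X2 = 8.33, stats.df = 4)

# Upper-tail chi-square probability for each statistic
p.values <- sapply(gof[c("G2", "W2", "X2")], pchisq,
                   df = gof$stats.df, lower.tail = FALSE)

print(p.values)
```

A small p-value would suggest that the seed does not agree with the target margins.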

Author(s)

Johan Barthelemy

Maintainer: Johan Barthelemy <johan@uow.edu.au>

References

Lang, J.B. (2004) Multinomial-Poisson homogeneous models for contingency tables. Annals of Statistics 32(1): 340-383.

Examples

# loading the data
data(spnamur, package = "mipfp")
# subsetting the data frame, keeping only the first 3 variables
spnamur.sub <- subset(spnamur, select = Household.type:Prof.status)
# true table
true.table <- table(spnamur.sub)
# extracting the margins
tgt.v1        <- apply(true.table, 1, sum)
tgt.v1.v2     <- apply(true.table, c(1,2), sum)
tgt.v2.v3     <- apply(true.table, c(2,3), sum)
tgt.list.dims <- list(1, c(1,2), c(2,3))
tgt.data      <- list(tgt.v1, tgt.v1.v2, tgt.v2.v3)
# creating the seed, a 10 pct sample of spnamur
seed.df <- spnamur.sub[sample(nrow(spnamur), round(0.10*nrow(spnamur))), ]
seed.table <- table(seed.df)
# applying one fitting method (ipfp)
r.ipfp <- Estimate(seed = seed.table, target.list = tgt.list.dims,
                   target.data = tgt.data)
# printing the G2, X2 and W2 statistics
print(gof.estimates(r.ipfp))
# alternative way (pretty printing, with p-values)
print(summary(r.ipfp)$stats.gof)