R: Estimating a contingency table using model-based approaches
ObtainModelEstimates
R Documentation
Estimating a contingency table using model-based approaches
Description
This function provides several alternatives to the IPFP for estimating
a multiway table subject to known constraints/totals:
maximum likelihood (ML), minimum chi-squared (CHI2) and weighted least
squares (WLSQ). Note that the resulting estimators are probabilities.
The covariance matrix of the estimated proportions (as defined by Little and Wu,
1991) is also provided. For the ML method, the covariance
matrix defined by Lang (2004) is returned as well.
Arguments
seed
The initial multi-dimensional array to be updated. Each cell must
be non-negative.
target.list
A list of the target margins provided in target.data. Each component
of the list is an array whose cells indicate which dimensions the
corresponding margin relates to.
target.data
A list containing the data of the target margins. Each
component of the list is an array storing a margin.
The list order must follow the one defined in target.list.
Note that the cells of the arrays must be non-negative.
method
Determines the method used for estimating the contingency
table. The default is ml (maximum likelihood); the other
available options are chi2 (minimum chi-squared) and lsq
(least squares).
tol.margins
Tolerance used when checking the consistency of the margins. Default is 1e-10.
replace.zeros
Constant that is added to each zero cell found in the seed,
as the procedures require strictly positive cells. Default value is
1e-10.
...
Additional parameters that can be passed to control the
optimisation process (see solnp from the package
Rsolnp).
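To make the pairing between target.list and target.data concrete, here is a minimal base-R sketch. It does not call ObtainModelEstimates; it only uses apply() to show which seed margin each target.list entry refers to, and it illustrates the kind of zero replacement described under replace.zeros. The seed values and the names seed.margins and seed.pos are illustrative assumptions, and the package's internal handling may differ.

```r
# Illustrative 2 x 2 x 2 seed with one empty cell (values are made up).
seed <- array(c(80, 60, 20, 0, 40, 35, 35, 30), dim = c(2, 2, 2))

# target.list names the dimensions; target.data holds the matching
# margins, in the same order.
target.list <- list(c(1, 2), 3)
target.data <- list(array(c(2000, 1000, 1500, 1800), dim = c(2, 2)),
                    array(c(4000, 2300), dim = 2))

# Each target.list entry selects the seed margin that has the same shape
# as the corresponding target.data entry.
seed.margins <- lapply(target.list, function(dims) apply(seed, dims, sum))
for (i in seq_along(target.list)) {
  stopifnot(identical(dim(as.array(seed.margins[[i]])),
                      dim(target.data[[i]])))
}

# Assumed equivalent of replace.zeros: zero cells get a small constant.
replace.zeros <- 1e-10
seed.pos <- seed
seed.pos[seed.pos == 0] <- replace.zeros
```

Here the first target constrains the joint margin of dimensions 1 and 2, and the second constrains the margin of dimension 3.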
Value
A list containing the final estimated table as well as the covariance matrix of
the estimated proportions and other convergence information.
x.hat
Array of the estimated table frequencies.
p.hat
Array of the estimated table probabilities.
error.margins
For each list element of target.data, error.margins shows the
maximum absolute deviation between the element and the corresponding
estimated margin. Note that the deviations should be close to zero,
otherwise the target margins are not met.
solnp.res
The estimation process uses the solnp optimisation function from
the R package Rsolnp and solnp.res is the corresponding object
returned by the solver.
conv
A boolean indicating whether the algorithm converged to a solution.
method
The selected method for estimation.
call
The matched call.
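The quantity reported in error.margins can be reproduced by hand. The base-R sketch below uses a made-up table x.hat and made-up targets, and computes for each target the maximum absolute deviation from the matching margin of x.hat. The helper name MaxMarginError is hypothetical and is not a function of the package.

```r
# Hypothetical helper: maximum absolute deviation per target margin.
MaxMarginError <- function(x.hat, target.list, target.data) {
  sapply(seq_along(target.list), function(i) {
    est.margin <- apply(x.hat, target.list[[i]], sum)
    max(abs(est.margin - target.data[[i]]))
  })
}

# Made-up estimated table and targets; the second target is off by 2.
x.hat <- array(c(10, 20, 30, 40, 50, 60, 70, 80), dim = c(2, 2, 2))
target.list <- list(c(1, 2), 3)
target.data <- list(array(c(60, 80, 100, 120), dim = c(2, 2)),
                    array(c(102, 258), dim = 2))

errors <- MaxMarginError(x.hat, target.list, target.data)
```

A value near zero for every element indicates that the estimated table meets the corresponding target margin.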
Note
It is important to note that if the margins given in target.data are
not consistent (i.e. the sums of their cells are not equal), the input data
is normalised by considering probabilities instead of frequencies:
the cells of the seed are divided by sum(seed);
the cells of each margin i of the list target.data are
divided by sum(target.data[[i]]).
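The normalisation described in this note can be sketched in base R as follows. The inconsistent example margins and the way the tolerance is applied are assumptions made for illustration; the package's internal check may differ.

```r
seed <- array(c(80, 60, 20, 20, 40, 35, 35, 30), dim = c(2, 2, 2))

# Two margins whose totals disagree (6300 versus 6000).
target.data <- list(array(c(2000, 1000, 1500, 1800), dim = c(2, 2)),
                    array(c(3600, 2400), dim = 2))

# Assumed consistency check: all margin totals agree within a tolerance.
tol.margins <- 1e-10
totals <- sapply(target.data, sum)
consistent <- all(abs(totals - totals[1]) <= tol.margins)

# If the totals disagree, switch from frequencies to probabilities.
if (!consistent) {
  seed <- seed / sum(seed)
  target.data <- lapply(target.data, function(m) m / sum(m))
}
```

After this step the seed and every margin sum to one, so they describe the same total regardless of the original totals.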
Author(s)
Thomas Suesse
Maintainer: Johan Barthelemy <johan@uow.edu.au>.
References
Lang, J.B. (2004)
Multinomial-Poisson homogeneous models for contingency tables.
Annals of Statistics 32(1): 340-383.
Little, R. J., Wu, M. M. (1991)
Models for contingency tables with known margins when target and sampled
populations differ.
Journal of the American Statistical Association 86 (413): 87-95.
See Also
solnp function documentation of the package
Rsolnp for the details of the solnp.res object
returned by the function.
Examples
# load the package providing ObtainModelEstimates and Vector2Array
library(mipfp)
# set up an initial 3-way table of dimension (2 x 2 x 2)
seed <- Vector2Array(c(80, 60, 20, 20, 40, 35, 35, 30), dim = c(2, 2, 2))
# building target margins
margins12 <- c(2000, 1000, 1500, 1800)
margins12.array <- Vector2Array(margins12, dim = c(2, 2))
margins3 <- c(4000, 2300)
margins3.array <- Vector2Array(margins3, dim = 2)
target.list <- list(c(1, 2), 3)
target.data <- list(margins12.array, margins3.array)
# estimating the new contingency table using the ml method
results.ml <- ObtainModelEstimates(seed, target.list, target.data,
                                   compute.cov = TRUE)
print(results.ml)
# estimating the new contingency table using the chi2 method
results.chi2 <- ObtainModelEstimates(seed, target.list, target.data,
                                     method = "chi2", compute.cov = TRUE)
print(results.chi2)
# estimating the new contingency table using the lsq method
results.lsq <- ObtainModelEstimates(seed, target.list, target.data,
                                    method = "lsq", compute.cov = TRUE)
print(results.lsq)