Last data update: 2014.03.03

dabipf.mix

R Documentation

Data Augmentation/Bayesian IPF Algorithm for Restricted General Location Models

Description

Markov chain Monte Carlo method for generating posterior draws of the parameters of the restricted general location model, given a matrix of incomplete mixed data. After a suitable number of steps have been taken, the resulting value of the parameter may be regarded as a random draw from its observed-data posterior distribution. May be used together with imp.mix to create multiple imputations of the missing data.

Usage

dabipf.mix(s, margins, design, start, steps=1, prior=0.5, 
           showits=FALSE)

Arguments

`s`	summary list of an incomplete data matrix created by the function `prelim.mix`.
`margins`	vector describing the sufficient configurations or margins in the desired loglinear model. The categorical variables are numbered in the original order of the columns of the data matrix `x` from which `s` was created, so that 1 refers to `x[,1]`, 2 refers to `x[,2]`, and so on. A margin is described by the factors not summed over, and margins are separated by zeros. Thus `c(1,2,0,2,3,0,1,3)` would indicate the (1,2), (2,3), and (1,3) margins in a three-way table, i.e., the model of no three-way association.
`design`	design matrix specifying the relationship of the continuous variables to the categorical ones. The dimension is `c(D,r)`, where D is the number of cells in the contingency table and r is the number of effects, which must be less than or equal to D. The order of the rows corresponds to the storage order of the cell probabilities in the contingency table; see `getparam.mix` for details.
`start`	starting value of the parameter. This is a parameter list such as one created by this function or by `ecm.mix`.
`steps`	number of steps of data augmentation-Bayesian IPF to be taken.
`prior`	Optional vector or array of hyperparameter(s) for a Dirichlet prior distribution. The default is the Jeffreys prior (all hyperparameters = .5). If structural zeros appear in the table, prior counts for these cells should be set to `NA`.
`showits`	if `TRUE`, reports the iterations so the user can monitor the progress of the algorithm.
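
The `margins`, `design`, and `prior` arguments can be easier to follow with a concrete construction. The following is a minimal base-R sketch, not taken from the package: `encode_margins` is a hypothetical helper, the factor names `A`, `B`, `C` and the 3 x 2 x 2 table are illustrative, and the row order of the design matrix assumes that cells are stored with the first factor varying fastest (check `getparam.mix` before relying on it).

```r
# Hypothetical helper (not part of the mix package): flatten a list of
# margins into the zero-separated vector format described above.
encode_margins <- function(margin_list) {
  pieces <- lapply(seq_along(margin_list), function(i) {
    m <- as.numeric(margin_list[[i]])
    # Every margin except the last is followed by a separating zero.
    if (i < length(margin_list)) c(m, 0) else m
  })
  unlist(pieces)
}

# The (1,2), (2,3), and (1,3) margins: no three-way association.
margins <- encode_margins(list(c(1, 2), c(2, 3), c(1, 3)))

# Main-effects design for a 3 x 2 x 2 table: D = 12 rows, r = 5 columns.
# Assumption: cells are stored with the first factor varying fastest,
# which is the order expand.grid() produces.
cells <- expand.grid(A = factor(1:3), B = factor(1:2), C = factor(1:2))
design <- model.matrix(~ A + B + C, data = cells)

# Jeffreys prior counts with one structural zero marked as NA
# (the choice of cell is purely illustrative).
prior <- array(0.5, dim = c(3, 2, 2))
prior[3, 2, 2] <- NA
```

A design with fewer columns than cells, as here, restricts the cell means to follow the chosen effects; the identity matrix of order D leaves them unrestricted.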

Details

The prior distribution used by this function is a combination of a constrained Dirichlet prior for the cell probabilities, an improper uniform prior for the regression coefficients, and the improper Jeffreys prior for the covariance matrix. The posterior distribution is not guaranteed to exist, especially in sparse-data situations. If this seems to be a problem, then better results may be obtained by imposing further restrictions on the parameters.
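
As a rough sketch of the prior described above, written in notation introduced here rather than taken from this page: let \pi_d be the cell probabilities, \alpha_d the hyperparameters supplied through `prior`, \beta the regression coefficients, \Sigma the covariance matrix, and q the number of continuous variables. Under the usual form of the Jeffreys prior for a covariance matrix, the joint prior density would be proportional to

```latex
f(\pi, \beta, \Sigma) \;\propto\;
  \Bigl( \prod_{d=1}^{D} \pi_d^{\alpha_d - 1} \Bigr)\,
  \lvert \Sigma \rvert^{-(q+1)/2},
\qquad \pi \text{ constrained to satisfy the loglinear model.}
```

The uniform prior on \beta contributes only a constant factor. With the default \alpha_d = 0.5 the Dirichlet exponent is negative, which is one reason sparse tables can leave the posterior improper.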

Value

a new parameter list. The parameter can be put into a more understandable format by the function getparam.mix.

Note

The random number generator seed must be set at least once by the function rngseed before this function can be used.

The starting value should satisfy the restrictions of the model and should lie in the interior of the parameter space. A suitable starting value can be obtained by running ecm.mix, possibly with the prior hyperparameters set to some value greater than 1, to ensure that the mode lies in the interior.
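
The two notes above can be combined into a short sketch. It assumes that `ecm.mix` accepts `prior` and `showits` arguments; the hyperparameter value 1.5 and the number of steps are illustrative choices, not recommendations from the package.

```r
library(mix)

data(stlouis)
s <- prelim.mix(stlouis, 3)    # first 3 columns are categorical
margins <- c(1, 2, 3)          # saturated contingency table model
design <- diag(rep(1, 12))     # one row per cell, D = 12

# Prior hyperparameters above 1 pull the cell probabilities away from
# zero, so the mode lies in the interior of the parameter space.
start <- ecm.mix(s, margins, design, prior = 1.5, showits = FALSE)

rngseed(1234567)               # must be called once before dabipf.mix
theta <- dabipf.mix(s, margins, design, start, steps = 100)
```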

References

Schafer, J. L. (1996) Analysis of Incomplete Multivariate Data. Chapman & Hall, Chapter 9.

Examples

library(mix)
data(stlouis)
s <- prelim.mix(stlouis, 3)              # first 3 columns are categorical
margins <- c(1, 2, 3)                    # saturated contingency table model
design <- diag(rep(1, 12))               # identity matrix, D = number of cells
thetahat <- ecm.mix(s, margins, design)  # find ML estimate
rngseed(1234567)                         # set random generator seed
newtheta <- dabipf.mix(s, margins, design, thetahat, steps = 200)
ximp <- imp.mix(s, newtheta, stlouis)    # impute under newtheta
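
To create several imputations rather than one, the chain can be continued between imputations. This is a sketch under stated assumptions: the number of imputations `m` and the 100 steps between draws are illustrative, and whether that spacing is enough for approximate independence depends on the data.

```r
library(mix)

data(stlouis)
s <- prelim.mix(stlouis, 3)
margins <- c(1, 2, 3)
design <- diag(rep(1, 12))
theta <- ecm.mix(s, margins, design, showits = FALSE)

rngseed(1234567)
m <- 3                                   # number of imputations (illustrative)
imputations <- vector("list", m)
for (i in seq_len(m)) {
  # Continue the chain from the previous draw, then impute under it.
  theta <- dabipf.mix(s, margins, design, theta, steps = 100)
  imputations[[i]] <- imp.mix(s, theta, stlouis)
}

# Put the final parameter draw into a more readable form.
readable <- getparam.mix(s, theta, corr = TRUE)
```

Each element of `imputations` is a completed copy of the data that can be analysed separately, with the results combined afterwards.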