dabipf.mix
R Documentation
Data Augmentation/Bayesian IPF Algorithm for Restricted General Location
Models
Description
Markov chain Monte Carlo method for generating posterior draws of the
parameters of the restricted general location model, given a matrix
of incomplete mixed data. After a suitable number of steps has been taken,
the resulting value of the parameter may be regarded as a random draw
from its observed-data posterior distribution. May be used together
with imp.mix to create multiple imputations of the missing data.
Arguments
s
summary list of an incomplete data matrix created by the
function prelim.mix.
margins
vector describing the sufficient configurations or margins in the
desired loglinear model. The variables are ordered in the original
order of the columns of x, so that 1 refers to x[,1], 2 refers to
x[,2], and so on. A margin is described by the factors not summed
over, and margins are separated by zeros. Thus c(1,2,0,2,3,0,1,3)
would indicate the (1,2), (2,3), and (1,3) margins in a three-way
table, i.e., the model of no three-way association.
design
design matrix specifying the relationship of the continuous
variables to the categorical ones. The dimension is c(D,r) where
D is the number of cells in the contingency table, and r is the
number of effects, which must be less than or equal to D. The
order of the rows corresponds to the storage order of the cell
probabilities in the contingency table; see getparam.mix for
details.
start
starting value of the parameter. This is a parameter list
such as one created by this function or by ecm.mix.
steps
number of steps of data augmentation-Bayesian IPF to be taken.
prior
Optional vector or array of hyperparameter(s) for a Dirichlet prior
distribution. The default is the Jeffreys prior (all hyperparameters
equal to 0.5). If structural zeros appear in the table, prior counts for these
cells should be set to NA.
showits
if TRUE, reports the iterations so the user can monitor the
progress of the algorithm.
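The margins, design, and prior arguments described above can be assembled in base R. The following is a minimal sketch, not taken from the package: the 3 x 2 x 2 table, the factor names A, B, C, the helper split_margins, and the choice of structural-zero cell are all hypothetical, and the rows of the design matrix are assumed to follow standard R array order (first factor varying fastest); check getparam.mix for the actual storage order.

```r
# Hypothetical table of three categorical variables with 3, 2 and 2 levels.
levels_per_factor <- c(3, 2, 2)
D <- prod(levels_per_factor)            # number of cells in the table

# margins: zeros separate the margins of the loglinear model.
margins <- c(1, 2, 0, 2, 3, 0, 1, 3)    # model of no three-way association

# Hypothetical helper (not part of mix): split the vector at the zeros.
split_margins <- function(margins) {
  group <- cumsum(margins == 0)         # each zero starts a new group
  keep <- margins != 0                  # drop the separators themselves
  unname(split(margins[keep], group[keep]))
}
split_margins(margins)                  # the (1,2), (2,3) and (1,3) margins

# design: one row per cell, first factor varying fastest (assumed order).
cells <- expand.grid(A = factor(1:3), B = factor(1:2), C = factor(1:2))
design_saturated <- diag(D)             # r = D, no restriction on cell means
design_main <- model.matrix(~ A + B + C, data = cells)  # main effects only

# prior: Jeffreys value 0.5 in every cell, NA for a structural zero.
prior <- array(0.5, dim = levels_per_factor)
prior[3, 2, 2] <- NA                    # hypothetical structural-zero cell
```

Here design_main has fewer columns than rows, so it satisfies the requirement that the number of effects not exceed D.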
Details
The prior distribution used by this function is a combination of a
constrained Dirichlet prior for the cell probabilities, an improper
uniform prior for the regression coefficients, and the improper Jeffreys
prior for the covariance matrix. The posterior distribution is not
guaranteed to exist, especially in sparse-data situations. If this
seems to be a problem, then better results may be obtained by imposing
further restrictions on the parameters.
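As a sketch of the combined prior, using notation assumed here rather than defined on this page (pi_d for the probability of cell d, alpha_d for the corresponding Dirichlet hyperparameter, beta for the regression coefficients, and Sigma for the covariance matrix of the q continuous variables), the density can be written up to a constant as:

```latex
p(\pi, \beta, \Sigma) \;\propto\;
  \Bigl( \prod_{d=1}^{D} \pi_d^{\alpha_d - 1} \Bigr)\,
  |\Sigma|^{-(q+1)/2}
```

The cell probabilities are constrained to satisfy the loglinear model, and the uniform prior on the regression coefficients contributes only a constant factor. Because the last two components are improper, the product need not yield a proper posterior when some cells contain few observations.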
Value
a new parameter list. The parameter can be put into a more
understandable format by the function getparam.mix.
Note
The random number generator seed must be set at least once by the
function rngseed before this function can be used.
The starting value should satisfy the restrictions of the model and
should lie in the interior of the parameter space. A suitable starting
value can be obtained by running ecm.mix, possibly with the prior
hyperparameters set to some value greater than 1, to ensure that the
mode lies in the interior.
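The steps in these notes can be combined into a short workflow. This is a hedged sketch rather than the package's official example: it assumes the mix package is installed and uses its stlouis data set, whose first three columns are taken to be categorical with 3, 2, and 2 levels; the saturated margins, identity design, seed, and number of steps are illustrative choices.

```r
library(mix)

data(stlouis)                     # example data shipped with mix
s <- prelim.mix(stlouis, 3)       # first 3 columns treated as categorical
margins <- c(1, 2, 3)             # single (1,2,3) margin: saturated model
design <- diag(12)                # identity design, D = 3 * 2 * 2 = 12 cells

# Starting value that satisfies the model restrictions, obtained by ECM.
thetahat <- ecm.mix(s, margins, design, showits = FALSE)

# The seed must be set once before any random draws are made.
rngseed(1234567)

# Run data augmentation-Bayesian IPF from the ECM estimate.
newtheta <- dabipf.mix(s, margins, design, thetahat, steps = 200)

# Inspect the draw in readable form and impute the missing data once.
getparam.mix(s, newtheta)
ximp <- imp.mix(s, newtheta, stlouis)
```

Repeating the dabipf.mix and imp.mix calls, each time starting from the previous draw, would give further imputations; how many steps to run between them depends on how quickly the chain converges for the data at hand.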
References
Schafer, J. L. (1996) Analysis of Incomplete Multivariate Data.
Chapman & Hall, Chapter 9.