MCMCordfactanal
R Documentation
Markov Chain Monte Carlo for Ordinal Data Factor Analysis Model
Description
This function generates a sample from the posterior distribution of an
ordinal data factor analysis model. Normal priors are assumed on the factor
loadings and factor scores while improper uniform priors are assumed
on the cutpoints. The user supplies data and parameters for the prior
distributions, and a sample from the posterior distribution is returned as
an mcmc object, which can be subsequently analyzed with
functions provided in the coda package.
Arguments
x
Either a formula or a numeric matrix containing the
manifest variables.
factors
The number of factors to be fitted.
lambda.constraints
List of lists specifying possible equality
or simple inequality constraints on the factor loadings. A typical
entry in the list has one of three forms: varname=list(d,c) which
will constrain the dth loading for the variable named varname to
be equal to c, varname=list(d,"+") which will constrain the dth
loading for the variable named varname to be positive, and
varname=list(d, "-") which will constrain the dth loading for the
variable named varname to be negative. If x is a matrix without
column names, default names of "V1", "V2", ..., etc. will be
used. Note that, unlike MCMCfactanal, the
Lambda matrix used here has factors+1
columns. The first column of Lambda corresponds to
negative item difficulty parameters and should generally not be
constrained.
data
A data frame.
burnin
The number of burn-in iterations for the sampler.
mcmc
The number of iterations for the sampler.
thin
The thinning interval used in the simulation. The number of
iterations must be divisible by this value.
tune
The tuning parameter for the Metropolis-Hastings
sampling. Can be either a scalar or a K-vector, where K is the number
of manifest variables. Must be
strictly positive.
verbose
A switch which determines whether or not the progress of
the sampler is printed to the screen. If verbose is greater
than 0, the iteration number and
the Metropolis-Hastings acceptance rate are printed to the screen
every verbose-th iteration.
seed
The seed for the random number generator. If NA, the Mersenne
Twister generator is used with default seed 12345; if an integer is
passed it is used to seed the Mersenne twister. The user can also
pass a list of length two to use the L'Ecuyer random number generator,
which is suitable for parallel computation. The first element of the
list is the L'Ecuyer seed, which is a vector of length six or NA (if NA
a default seed of rep(12345,6) is used). The second element of
the list is a positive substream number. See the MCMCpack
specification for more details.
lambda.start
Starting values for the factor loading matrix
Lambda. If lambda.start is set to a scalar the starting value for
all unconstrained loadings will be set to that scalar. If
lambda.start is a matrix of the same dimensions as Lambda then the
lambda.start matrix is used as the starting values (except
for equality-constrained elements). If lambda.start is set to
NA (the default) then starting values for unconstrained
elements in the first column of Lambda are based on the observed
response pattern, the remaining unconstrained elements of Lambda are
set to , and starting values for inequality constrained elements are
set to either 1.0 or -1.0 depending on the nature of the constraints.
l0
The means of the independent Normal prior on the factor
loadings. Can be either a scalar or a matrix with the same
dimensions as Lambda.
L0
The precisions (inverse variances) of the independent Normal
prior on the factor loadings. Can be either a scalar or a matrix with
the same dimensions as Lambda.
store.lambda
A switch that determines whether or not to store
the factor loadings for posterior analysis. By default, the factor
loadings are all stored.
store.scores
A switch that determines whether or not to
store the factor scores for posterior analysis.
NOTE: This takes an enormous amount of memory, so
should only be used if the chain is thinned heavily, or for
applications with a small number of observations. By default, the
factor scores are not stored.
drop.constantvars
A switch that determines whether or not
manifest variables that have no variation should be deleted
before fitting the model. Default = TRUE.
...
further arguments to be passed
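As a rough illustration of how the arguments described above fit together, the following base R sketch builds a constraint list, a prior mean and precision, sampler settings, and a L'Ecuyer seed specification. The variable names V1 to V3, the single factor, and every numeric value are assumptions chosen for illustration only. The call to MCMCordfactanal is left commented out because it requires MCMCpack to be installed, and the count of stored draws assumes one draw is kept every thin iterations.

```r
# Hypothetical ordinal data: 3 manifest variables, categories coded 1..4.
set.seed(1)
x <- matrix(sample(1:4, 60, replace = TRUE), nrow = 20,
            dimnames = list(NULL, c("V1", "V2", "V3")))

factors <- 1  # Lambda therefore has factors + 1 = 2 columns

# Constraints refer to columns of Lambda. Column 1 holds the negative
# item difficulties, so the loadings on the first factor sit in column 2.
lambda.constraints <- list(
  V1 = list(2, "+"),  # loading of V1 on the factor must be positive
  V2 = list(2, 0.5)   # loading of V2 on the factor fixed at 0.5 (illustrative)
)

# Prior on the loadings: L0 is a precision, i.e. 1 / variance.
prior_sd <- 2
l0 <- 0
L0 <- 1 / prior_sd^2

# Sampler settings: mcmc must be divisible by thin.
burnin <- 1000
mcmc <- 20000
thin <- 10
stopifnot(mcmc %% thin == 0)
n_stored <- mcmc / thin  # draws kept, assuming one per thin iterations

# L'Ecuyer seed specification: default seed (NA) and substream 1.
seed <- list(NA, 1)

# With MCMCpack installed, the call might look like this (not run here):
# posterior <- MCMCpack::MCMCordfactanal(
#   x, factors = factors, lambda.constraints = lambda.constraints,
#   burnin = burnin, mcmc = mcmc, thin = thin, seed = seed,
#   l0 = l0, L0 = L0, store.lambda = TRUE, store.scores = FALSE)
```

Leaving the first column of Lambda unconstrained follows the advice given for lambda.constraints.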
Details
The model takes the following form:
Let i=1,...,N index observations and
j=1,...,K index response variables within an
observation. The typical observed
variable x_ij is ordinal with a total of C_j
categories. The distribution of X is governed by an N by K matrix of
latent variables Xstar and a series of cutpoints gamma. Xstar is assumed
to be generated according to:
xstar_i = Lambda phi_i + epsilon_i
epsilon_i ~ N(0, I)
where xstar_i is the K-vector of latent variables
specific to observation i, Lambda is the
K by d matrix of factor loadings (with d = factors+1), and
phi_i is
the d-vector of latent factor scores. It is assumed that the
first element of phi_i is equal to 1 for all
i.
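To make the generative model above concrete, here is a minimal base R sketch that simulates ordinal data from it. The sample size, the loading values, and the cutpoints are all assumptions made for illustration, and the cutpoints are taken to be the same for every item purely to keep the code short.

```r
set.seed(42)
N <- 500   # observations
K <- 4     # manifest variables
d <- 2     # columns of Lambda: constant term + one factor

# Factor loadings (K x d); the first column multiplies the constant 1.
Lambda <- cbind(c(0.5, 0.0, -0.5, 1.0),   # negative item difficulties
                c(1.0, 0.8, 0.6, 1.2))    # loadings on the factor

# Factor scores (N x d); the first element of each phi_i is fixed at 1.
phi <- cbind(1, rnorm(N))

# Latent responses: row i is xstar_i = Lambda phi_i + epsilon_i,
# with epsilon_i ~ N(0, I).
Xstar <- phi %*% t(Lambda) + matrix(rnorm(N * K), N, K)

# Interior cutpoints (assumed values); the first one is 0.
gamma <- c(0, 1, 2)

# x_ij = c when the latent value falls in the c-th interval
# (-Inf, 0], (0, 1], (1, 2], (2, Inf), giving categories 1..4.
X <- matrix(findInterval(Xstar, gamma, left.open = TRUE) + 1, N, K)
```

A matrix such as X could then be passed as the x argument, although recovering the assumed loadings would depend on the constraints and priors chosen.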
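The probability that the jth variable in observation
i takes the value c is:
pi_ijc = Phi(gamma_jc - Lambda'_j phi_i) - Phi(gamma_j(c-1) - Lambda'_j phi_i)
where Phi is the standard normal distribution function and Lambda'_j is
the jth row of Lambda.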
The standard two-parameter item response theory model with probit
link is a special case of the model sketched above.
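The category probabilities implied by this model can be sketched in a few lines of base R. The example assumes the usual ordinal probit convention that the outermost cutpoints are -Inf and +Inf, so that the probability of category c is the standard normal distribution function evaluated at the upper cutpoint minus the linear predictor, less the same quantity at the lower cutpoint. The helper name category_probs and all numeric values are hypothetical.

```r
# Probability that x_ij = c for c = 1..C, given the linear predictor
# eta = Lambda_j' phi_i and the interior cutpoints of item j.
category_probs <- function(eta, gamma_interior) {
  cuts <- c(-Inf, gamma_interior, Inf)  # assumed outer cutpoints
  diff(pnorm(cuts - eta))               # Phi(upper - eta) - Phi(lower - eta)
}

lambda_j <- c(0.5, 1.0)   # hypothetical row of Lambda (constant, loading)
phi_i <- c(1, -0.3)       # first element fixed at 1
eta <- sum(lambda_j * phi_i)

p <- category_probs(eta, gamma_interior = c(0, 1, 2))
```

With a single factor and two categories per item, this reduces to a probit item response curve in phi_i.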
MCMCordfactanal simulates from the posterior distribution using
a Metropolis-Hastings within Gibbs sampling algorithm. The algorithm
employed is based on work by Cowles (1996). Note that
the first element of phi_i is a 1. As a result, the
first column of Lambda can be interpreted as negative item
difficulty parameters. Further, the first
element gamma_1 is normalized to zero, and thus not
returned in the mcmc object.
The simulation proper is done in compiled C++ code to maximize
efficiency. Please consult the coda documentation for a comprehensive
list of functions that can be used to analyze the posterior
sample.
As is the case with all measurement models, make sure that you have plenty
of free memory, especially when storing the scores.
Value
An mcmc object that contains the posterior sample. This
object can be summarized by functions provided by the coda package.
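As an illustration of the kind of post-processing meant here, the sketch below summarizes a matrix of draws, first with base R and then, if the coda package happens to be installed, with coda itself. The draws are random numbers standing in for real sampler output, and the column names, start iteration, and thinning interval are hypothetical, not the names the function actually returns.

```r
# Stand-in for sampler output: 200 fake draws of two hypothetical parameters.
set.seed(7)
draws <- cbind(Lambda.V1.2 = rnorm(200, 1.0, 0.1),
               Lambda.V2.2 = rnorm(200, 0.8, 0.1))

# Base R summaries work directly on the raw draws.
post_mean <- colMeans(draws)
post_ci <- apply(draws, 2, quantile, probs = c(0.025, 0.975))

# With coda available, the same draws can be wrapped as an mcmc object
# and passed to its summary and diagnostic functions.
if (requireNamespace("coda", quietly = TRUE)) {
  posterior <- coda::mcmc(draws, start = 1001, thin = 10)
  print(summary(posterior))
  print(coda::effectiveSize(posterior))
}
```

An object returned by MCMCordfactanal is already of class mcmc, so the wrapping step would not be needed for real output.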
References
Shawn Treier and Simon Jackman. 2008. “Democracy as a Latent Variable.”
American Journal of Political Science. 52: 201-217.
Andrew D. Martin, Kevin M. Quinn, and Jong Hee Park. 2011.
“MCMCpack: Markov Chain Monte Carlo in R.”
Journal of Statistical Software. 42(9): 1-21.
http://www.jstatsoft.org/v42/i09/.
M. K. Cowles. 1996. “Accelerating Monte Carlo Markov Chain Convergence for
Cumulative-link Generalized Linear Models.” Statistics and Computing.
6: 101-110.
Valen E. Johnson and James H. Albert. 1999. Ordinal Data Modeling.
Springer: New York.
Daniel Pemstein, Kevin M. Quinn, and Andrew D. Martin. 2007.
Scythe Statistical Library 1.0. http://scythe.wustl.edu.