A description of the user-specified model. Typically, the model
is described using the lavaan model syntax. See
model.syntax for more information. Alternatively, a
parameter table (e.g., the output of the lavaanify() function) is also
accepted.
data
An optional data frame containing the observed variables used in
the model. If some variables are declared as ordered factors, lavaan will
treat them as ordinal variables.
meanstructure
If TRUE, the means of the observed
variables enter the model. If "default", the value is set based
on the user-specified model, and/or the values of other arguments.
conditional.x
If TRUE, we set up the model conditional on
the exogenous ‘x’ covariates; the model-implied sample statistics
only include the non-x variables. If FALSE, the exogenous ‘x’
variables are modeled jointly with the other variables, and the
model-implied statistics reflect both sets of variables. If
"default", the value is set depending on the estimator, and
whether or not the model involves categorical endogenous variables.
fixed.x
If TRUE, the exogenous ‘x’ covariates are considered
fixed variables and the means, variances and covariances of these variables
are fixed to their sample values. If FALSE, they are considered
random, and the means, variances and covariances are free parameters. If
"default", the value is set depending on the mimic option.
orthogonal
If TRUE, the exogenous latent variables
are assumed to be uncorrelated.
std.lv
If TRUE, the metric of each latent variable is
determined by fixing its (residual)
variance to 1.0. If FALSE, the metric
of each latent variable is determined by fixing the factor loading of the
first indicator to 1.0.
parameterization
Currently only used if data is categorical. If
"delta", the delta parameterization is used. If "theta",
the theta parameterization is used.
std.ov
If TRUE, all observed variables are standardized
before entering the analysis.
missing
If "listwise", cases with missing values are removed
listwise from the data frame before analysis. If "direct" or
"ml" or "fiml" and the estimator is maximum likelihood,
Full Information Maximum Likelihood (FIML) estimation is used using all
available data in the data frame. This is only valid if the data are
missing completely at random (MCAR) or missing at random (MAR). If
"default", the value is set depending on the estimator and the
mimic option.
ordered
Character vector. Only used if the data is in a data.frame.
Treat these variables as ordered (ordinal) variables, if they are
endogenous in the model. Importantly, all other variables will be treated
as numeric (unless they are declared as ordered in the original data.frame).
sample.cov
Numeric matrix. A sample variance-covariance matrix.
The rownames and/or colnames must contain the observed variable names.
For a multiple group analysis, a list with a variance-covariance matrix
for each group. Note that if maximum likelihood estimation is used and
likelihood="normal", the user provided covariance matrix is
internally rescaled by multiplying it by a factor (N-1)/N, to ensure
that the covariance matrix has been divided by N. This can be turned off
by setting the sample.cov.rescale argument to FALSE.
sample.cov.rescale
If TRUE, the sample covariance matrix provided
by the user is internally rescaled by multiplying it by a factor (N-1)/N.
If "default", the value is set depending on the estimator and the
likelihood option: it is set to TRUE if maximum likelihood
estimation is used and likelihood="normal", and FALSE
otherwise.
sample.mean
A sample mean vector. For a multiple group analysis,
a list with a mean vector for each group.
sample.nobs
Number of observations if the full data frame is missing
and only sample moments are given. For a multiple group analysis, a list
or a vector with the number of observations for each group.
ridge
Numeric. Small constant used for ridging. Only used if the sample covariance matrix is not positive definite.
group
A variable name in the data frame defining the groups in a
multiple group analysis.
group.label
A character vector. The user can specify which group (or
factor) levels need to be selected from the grouping variable, and in which
order. If NULL (the default), all grouping levels are selected, in the
order in which they appear in the data.
group.equal
A vector of character strings. Only used in
a multiple group analysis. Can be one or more of the following:
"loadings", "intercepts", "means", "thresholds",
"regressions", "residuals",
"residual.covariances", "lv.variances" or
"lv.covariances", specifying the pattern of equality
constraints across multiple groups.
group.partial
A vector of character strings containing the labels
of the parameters which should be free in all groups (thereby
overriding the group.equal argument for some specific parameters).
group.w.free
Logical. If TRUE, the group frequencies are
considered to be free parameters in the model. In this case, a
Poisson model is fitted to estimate the group frequencies. If
FALSE (the default), the group frequencies are fixed to their
observed values.
cluster
Not used yet.
constraints
Additional (in)equality constraints not yet included in the
model syntax. See model.syntax for more information.
estimator
The estimator to be used. Can be one of the following:
"ML" for maximum likelihood, "GLS" for generalized least
squares, "WLS" for weighted least squares (sometimes called ADF
estimation), "ULS" for unweighted least squares and "DWLS" for
diagonally weighted least squares. These are the main options that affect
the estimation. For convenience, the "ML" option can be extended
as "MLM", "MLMV", "MLMVS", "MLF", and
"MLR". The estimation will still be plain "ML", but now
with robust standard errors and a robust (scaled) test statistic. For
"MLM", "MLMV", "MLMVS", classic robust standard
errors are used (se="robust.sem"); for "MLF", standard
errors are based on first-order derivatives (se="first.order");
for "MLR", ‘Huber-White’ robust standard errors are used
(se="robust.huber.white"). In addition, "MLM" will compute
a Satorra-Bentler scaled (mean adjusted) test statistic
(test="satorra.bentler"), "MLMVS" will compute a
mean and variance adjusted test statistic (Satterthwaite style)
(test="mean.var.adjusted"), "MLMV" will compute a mean
and variance adjusted test statistic (scaled and shifted)
(test="scaled.shifted"), and "MLR" will
compute a test statistic which is asymptotically
equivalent to the Yuan-Bentler T2-star test statistic. Analogously,
the estimators "WLSM" and "WLSMV" imply the "DWLS"
estimator (not the "WLS" estimator) with robust standard errors
and a mean or mean and variance adjusted test statistic. Estimators
"ULSM" and "ULSMV" imply the "ULS"
estimator with robust standard errors
and a mean or mean and variance adjusted test statistic.
likelihood
Only relevant for ML estimation. If "wishart",
the Wishart likelihood approach is used. In this approach, the covariance
matrix has been divided by N-1, and both standard errors and test
statistics are based on N-1.
If "normal", the normal likelihood approach is used. Here,
the covariance matrix has been divided by N, and both standard errors
and test statistics are based on N. If "default", it depends
on the mimic option: if mimic="lavaan" or mimic="Mplus",
normal likelihood is used; otherwise, Wishart likelihood is used.
link
Currently only used if estimator is MML. If "logit",
a logit link is used for binary and ordered observed variables.
If "probit", a probit link is used. If "default",
it is currently set to "probit" (but this may change).
information
If "expected", the expected information matrix
is used (to compute the standard errors). If "observed", the
observed information matrix is used. If "default", the value is
set depending on the estimator and the mimic option.
se
If "standard", conventional standard errors
are computed based on inverting the (expected or observed) information
matrix. If "first.order", standard errors are computed based on
first-order derivatives. If "robust.sem", conventional robust
standard errors are computed. If "robust.huber.white",
standard errors are computed based on the ‘mlr’ (aka pseudo ML,
Huber-White) approach.
If "robust", either "robust.sem" or
"robust.huber.white" is used depending on the estimator,
the mimic option, and whether the data are complete or not.
If "boot" or "bootstrap", bootstrap standard errors are
computed using standard bootstrapping (unless Bollen-Stine bootstrapping
is requested for the test statistic; in this case bootstrap standard
errors are computed using model-based bootstrapping).
If "none", no standard errors are computed.
test
If "standard", a conventional chi-square test is computed.
If "Satorra.Bentler", a Satorra-Bentler scaled test statistic is
computed. If "Yuan.Bentler", a Yuan-Bentler scaled test statistic
is computed. If "mean.var.adjusted" or "Satterthwaite", a
mean and variance adjusted test statistic is computed.
If "scaled.shifted", an alternative mean and variance adjusted test
statistic is computed (as in Mplus version 6 or higher).
If "boot" or "bootstrap" or
"Bollen.Stine", the Bollen-Stine bootstrap is used to compute
the bootstrap probability value of the test statistic.
If "default", the value depends on the
values of other arguments.
bootstrap
Number of bootstrap draws, if bootstrapping is used.
mimic
If "Mplus", an attempt is made to mimic the Mplus
program. If "EQS", an attempt is made to mimic the EQS program.
If "default", the value is (currently) set to "lavaan",
which is very close to "Mplus".
representation
If "LISREL", the classical LISREL matrix
representation is used to represent the model (using the all-y variant).
do.fit
If FALSE, the model is not fit, and the current
starting values of the model parameters are preserved.
control
A list containing control parameters passed to the optimizer.
By default, lavaan uses "nlminb". See the manpage of
nlminb for an overview of the control parameters.
A different optimizer can be chosen by setting the value of
optim.method. For unconstrained optimization (the model syntax
does not include any "==", ">" or "<" operators),
the available options are "nlminb" (the default), "BFGS" and
"L-BFGS-B". See the manpage of the optim function for
the control parameters of the latter two options. For constrained
optimization, the only available option is "nlminb.constr".
WLS.V
A user provided weight matrix to be used by estimator "WLS";
if the estimator is "DWLS", only the diagonal of this matrix will be
used. For a multiple group analysis, a list with a weight matrix
for each group. The elements of the weight matrix should be in the
following order (if all data is continuous): first the means (if a
meanstructure is involved), then the lower triangular elements of the
covariance matrix including the diagonal, ordered column by column. In
the categorical case: first the thresholds (including the means for
continuous variables), then the slopes (if any), the variances of
continuous variables (if any), and finally the lower triangular elements
of the correlation/covariance matrix excluding the diagonal, ordered
column by column.
NACOV
A user provided matrix containing the elements of (N times)
the asymptotic variance-covariance matrix of the sample statistics.
For a multiple group analysis, a list with an asymptotic
variance-covariance matrix for each group. See the WLS.V
argument for information about the order of the elements.
zero.add
A numeric vector containing two values. These values affect the
calculation of polychoric correlations when some frequencies in the
bivariate table are zero.
The first value applies only to 2x2 tables; the second applies to larger
tables. The applicable value is added to the zero frequency in the bivariate table.
If "default", the value is set depending on the "mimic"
option. By default, lavaan uses zero.add = c(0.5, 0.0).
zero.keep.margins
Logical. This argument only affects the computation
of polychoric correlations for 2x2 tables with an empty cell, and where a
value is added to the empty cell. If TRUE, the other values of the
frequency table are adjusted so that all margins are unaffected. If
"default", the value is set depending on the "mimic" option. The
default is TRUE.
zero.cell.warn
Logical. Only used if some observed endogenous variables
are categorical. If TRUE, give a warning if one or more cells
of a bivariate frequency table are empty.
start
If it is a character string,
the two options are currently "simple" and "Mplus".
In the first
case, all parameter values are set to zero, except the factor loadings
(set to one), the variances of latent variables (set to 0.05), and
the residual variances of observed variables (set to half the observed
variance).
If "Mplus", we use a similar scheme, but the factor loadings are
estimated using the fabin3 estimator (tsls) per factor.
If start is a fitted
object of class lavaan, the estimated values of
the corresponding parameters will be extracted. If it is a model list,
for example the output of the parameterEstimates() function,
the values of the est or start or ustart column
(whichever is found first) will be extracted.
verbose
If TRUE, the function value is printed out during
each iteration.
warn
If TRUE, some (possibly harmless) warnings are printed
out during the iterations.
debug
If TRUE, debugging information is printed out.
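The (N-1)/N rescaling described under sample.cov and the element order described under WLS.V can be illustrated with base R alone. The following is a minimal sketch, not lavaan's internal code: the toy data and the object names (X, S_unbiased, S_ml, stats_vec) are made up for illustration.

```r
# Toy data: 3 observed variables, N = 50 (values are arbitrary).
set.seed(1)
N <- 50
X <- matrix(rnorm(N * 3), nrow = N,
            dimnames = list(NULL, c("y1", "y2", "y3")))

# cov() divides by N - 1.
S_unbiased <- cov(X)

# Multiplying by (N - 1)/N gives the covariance matrix divided by N,
# as described for maximum likelihood with likelihood = "normal".
S_ml <- S_unbiased * (N - 1) / N

# Check against a direct computation that divides by N.
Xc <- scale(X, center = TRUE, scale = FALSE)
S_direct <- crossprod(Xc) / N
all.equal(S_ml, S_direct, check.attributes = FALSE)

# Element order described for WLS.V (continuous data, meanstructure):
# means first, then the lower triangle of the covariance matrix,
# including the diagonal, taken column by column.
stats_vec <- c(colMeans(X), S_ml[lower.tri(S_ml, diag = TRUE)])
length(stats_vec)  # 3 means + 6 (co)variances
```

Indexing a matrix with lower.tri() returns elements in column-major order, which matches the "column by column" ordering stated in the WLS.V entry.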
Details
The sem function is a wrapper for the more general
lavaan function, using the following default arguments:
int.ov.free = TRUE, int.lv.free = FALSE,
auto.fix.first = TRUE (unless std.lv = TRUE),
auto.fix.single = TRUE, auto.var = TRUE,
auto.cov.lv.x = TRUE,
auto.th = TRUE, auto.delta = TRUE,
and auto.cov.y = TRUE.
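As a rough illustration of the wrapper relationship, the sketch below fits the same model once with sem() and once with lavaan(), spelling out the defaults listed above. The model itself is only an example, using the HolzingerSwineford1939 data set that ships with lavaan; other lavaan versions may set further defaults in sem() that are not listed here.

```r
library(lavaan)

# Example model: two factors and one structural regression.
model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  textual ~ visual
'

fit_sem <- sem(model, data = HolzingerSwineford1939)

# The same model through lavaan(), with the sem() defaults made explicit.
fit_lavaan <- lavaan(model, data = HolzingerSwineford1939,
                     int.ov.free = TRUE, int.lv.free = FALSE,
                     auto.fix.first = TRUE, auto.fix.single = TRUE,
                     auto.var = TRUE, auto.cov.lv.x = TRUE,
                     auto.th = TRUE, auto.delta = TRUE,
                     auto.cov.y = TRUE)

# Both calls are expected to yield the same parameter estimates.
all.equal(coef(fit_sem), coef(fit_lavaan))
```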
Value
An object of class lavaan, for which several methods
are available, including a summary method.
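A few of the methods that are typically applied to the returned object are sketched below. The two-factor model is an illustrative choice, again using the HolzingerSwineford1939 data set that ships with lavaan.

```r
library(lavaan)

# Example model: two correlated factors, three indicators each.
model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
'
fit <- sem(model, data = HolzingerSwineford1939)

# The returned object is of class lavaan.
class(fit)

# Printed overview, optionally with additional fit measures.
summary(fit, fit.measures = TRUE)

# Free parameter estimates as a named vector.
coef(fit)

# Full table of estimates with standard errors.
parameterEstimates(fit)

# Selected fit measures.
fitMeasures(fit, c("chisq", "df", "cfi", "rmsea"))
```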
References
Yves Rosseel (2012). lavaan: An R Package for Structural Equation
Modeling. Journal of Statistical Software, 48(2), 1-36. URL
http://www.jstatsoft.org/v48/i02/.