Last data update: 2014.03.03

sem	R Documentation

Fit Structural Equation Models

Description

Fit a Structural Equation Model (SEM).

Usage

sem(model = NULL, data = NULL,
    meanstructure = "default", 
    conditional.x = "default", fixed.x = "default",
    orthogonal = FALSE, std.lv = FALSE,
    parameterization = "default", std.ov = FALSE,
    missing = "default", ordered = NULL, 
    sample.cov = NULL, sample.cov.rescale = "default",
    sample.mean = NULL, sample.nobs = NULL, 
    ridge = 1e-05, group = NULL, 
    group.label = NULL, group.equal = "", group.partial = "", 
    group.w.free = FALSE, cluster = NULL, constraints = '', 
    estimator = "default", likelihood = "default", link = "default",
    information = "default", se = "default", test = "default",
    bootstrap = 1000L, mimic = "default", representation = "default", 
    do.fit = TRUE, control = list(), WLS.V = NULL, NACOV = NULL,
    zero.add = "default", zero.keep.margins = "default",
    zero.cell.warn = TRUE,
    start = "default", verbose = FALSE, warn = TRUE, debug = FALSE)

Arguments

`model`	A description of the user-specified model. Typically, the model is described using the lavaan model syntax. See `model.syntax` for more information. Alternatively, a parameter table (e.g. the output of the `lavaanify()` function) is also accepted.
`data`	An optional data frame containing the observed variables used in the model. If some variables are declared as ordered factors, lavaan will treat them as ordinal variables.
`meanstructure`	If `TRUE`, the means of the observed variables enter the model. If `"default"`, the value is set based on the user-specified model, and/or the values of other arguments.
`conditional.x`	If `TRUE`, we set up the model conditional on the exogenous ‘x’ covariates; the model-implied sample statistics only include the non-x variables. If `FALSE`, the exogenous ‘x’ variables are modeled jointly with the other variables, and the model-implied statistics reflect both sets of variables. If `"default"`, the value is set depending on the estimator, and whether or not the model involves categorical endogenous variables.
`fixed.x`	If `TRUE`, the exogenous ‘x’ covariates are considered fixed variables and the means, variances and covariances of these variables are fixed to their sample values. If `FALSE`, they are considered random, and the means, variances and covariances are free parameters. If `"default"`, the value is set depending on the mimic option.
`orthogonal`	If `TRUE`, the exogenous latent variables are assumed to be uncorrelated.
`std.lv`	If `TRUE`, the metric of each latent variable is determined by fixing their (residual) variances to 1.0. If `FALSE`, the metric of each latent variable is determined by fixing the factor loading of the first indicator to 1.0.
`parameterization`	Currently only used if data is categorical. If `"delta"`, the delta parameterization is used. If `"theta"`, the theta parameterization is used.
`std.ov`	If `TRUE`, all observed variables are standardized before entering the analysis.
`missing`	If `"listwise"`, cases with missing values are removed listwise from the data frame before analysis. If `"direct"` or `"ml"` or `"fiml"` and the estimator is maximum likelihood, Full Information Maximum Likelihood (FIML) estimation is used using all available data in the data frame. This is only valid if the data are missing completely at random (MCAR) or missing at random (MAR). If `"default"`, the value is set depending on the estimator and the mimic option.
`ordered`	Character vector. Only used if the data are supplied as a data.frame. Treat these variables as ordered (ordinal) variables, if they are endogenous in the model. Importantly, all other variables will be treated as numeric (unless they are declared as ordered in the original data.frame).
`sample.cov`	Numeric matrix. A sample variance-covariance matrix. The rownames and/or colnames must contain the observed variable names. For a multiple group analysis, a list with a variance-covariance matrix for each group. Note that if maximum likelihood estimation is used and `likelihood="normal"`, the user provided covariance matrix is internally rescaled by multiplying it with a factor (N-1)/N, to ensure that the covariance matrix has been divided by N. This can be turned off by setting the `sample.cov.rescale` argument to `FALSE`.
`sample.cov.rescale`	If `TRUE`, the sample covariance matrix provided by the user is internally rescaled by multiplying it with a factor (N-1)/N. If `"default"`, the value is set depending on the estimator and the likelihood option: it is set to `TRUE` if maximum likelihood estimation is used and `likelihood="normal"`, and `FALSE` otherwise.
`sample.mean`	A sample mean vector. For a multiple group analysis, a list with a mean vector for each group.
`sample.nobs`	Number of observations if the full data frame is missing and only sample moments are given. For a multiple group analysis, a list or a vector with the number of observations for each group.
`ridge`	Numeric. Small constant used for ridging. Only used if the sample covariance matrix is not positive definite.
`group`	A variable name in the data frame defining the groups in a multiple group analysis.
`group.label`	A character vector. The user can specify which group (or factor) levels need to be selected from the grouping variable, and in which order. If `NULL` (the default), all grouping levels are selected, in the order as they appear in the data.
`group.equal`	A vector of character strings. Only used in a multiple group analysis. Can be one or more of the following: `"loadings"`, `"intercepts"`, `"means"`, `"thresholds"`, `"regressions"`, `"residuals"`, `"residual.covariances"`, `"lv.variances"` or `"lv.covariances"`, specifying the pattern of equality constraints across multiple groups.
`group.partial`	A vector of character strings containing the labels of the parameters which should be free in all groups (thereby overriding the group.equal argument for some specific parameters).
`group.w.free`	Logical. If `TRUE`, the group frequencies are considered to be free parameters in the model. In this case, a Poisson model is fitted to estimate the group frequencies. If `FALSE` (the default), the group frequencies are fixed to their observed values.
`cluster`	Not used yet.
`constraints`	Additional (in)equality constraints not yet included in the model syntax. See `model.syntax` for more information.
`estimator`	The estimator to be used. Can be one of the following: `"ML"` for maximum likelihood, `"GLS"` for generalized least squares, `"WLS"` for weighted least squares (sometimes called ADF estimation), `"ULS"` for unweighted least squares and `"DWLS"` for diagonally weighted least squares. These are the main options that affect the estimation. For convenience, the `"ML"` option can be extended as `"MLM"`, `"MLMV"`, `"MLMVS"`, `"MLF"`, and `"MLR"`. The estimation will still be plain `"ML"`, but now with robust standard errors and a robust (scaled) test statistic. For `"MLM"`, `"MLMV"`, and `"MLMVS"`, classic robust standard errors are used (`se="robust.sem"`); for `"MLF"`, standard errors are based on first-order derivatives (`se="first.order"`); for `"MLR"`, ‘Huber-White’ robust standard errors are used (`se="robust.huber.white"`). In addition, `"MLM"` will compute a Satorra-Bentler scaled (mean adjusted) test statistic (`test="satorra.bentler"`), `"MLMVS"` will compute a mean and variance adjusted test statistic (Satterthwaite style) (`test="mean.var.adjusted"`), `"MLMV"` will compute a mean and variance adjusted test statistic (scaled and shifted) (`test="scaled.shifted"`), and `"MLR"` will compute a test statistic which is asymptotically equivalent to the Yuan-Bentler T2-star test statistic. Analogously, the estimators `"WLSM"` and `"WLSMV"` imply the `"DWLS"` estimator (not the `"WLS"` estimator) with robust standard errors and a mean or mean and variance adjusted test statistic. Estimators `"ULSM"` and `"ULSMV"` imply the `"ULS"` estimator with robust standard errors and a mean or mean and variance adjusted test statistic.
`likelihood`	Only relevant for ML estimation. If `"wishart"`, the Wishart likelihood approach is used. In this approach, the covariance matrix has been divided by N-1, and both standard errors and test statistics are based on N-1. If `"normal"`, the normal likelihood approach is used. Here, the covariance matrix has been divided by N, and both standard errors and test statistics are based on N. If `"default"`, it depends on the mimic option: if `mimic="lavaan"` or `mimic="Mplus"`, normal likelihood is used; otherwise, Wishart likelihood is used.
`link`	Currently only used if estimator is MML. If `"logit"`, a logit link is used for binary and ordered observed variables. If `"probit"`, a probit link is used. If `"default"`, it is currently set to `"probit"` (but this may change).
`information`	If `"expected"`, the expected information matrix is used (to compute the standard errors). If `"observed"`, the observed information matrix is used. If `"default"`, the value is set depending on the estimator and the mimic option.
`se`	If `"standard"`, conventional standard errors are computed based on inverting the (expected or observed) information matrix. If `"first.order"`, standard errors are computed based on first-order derivatives. If `"robust.sem"`, conventional robust standard errors are computed. If `"robust.huber.white"`, standard errors are computed based on the ‘mlr’ (aka pseudo ML, Huber-White) approach. If `"robust"`, either `"robust.sem"` or `"robust.huber.white"` is used depending on the estimator, the mimic option, and whether the data are complete or not. If `"boot"` or `"bootstrap"`, bootstrap standard errors are computed using standard bootstrapping (unless Bollen-Stine bootstrapping is requested for the test statistic; in this case bootstrap standard errors are computed using model-based bootstrapping). If `"none"`, no standard errors are computed.
`test`	If `"standard"`, a conventional chi-square test is computed. If `"Satorra.Bentler"`, a Satorra-Bentler scaled test statistic is computed. If `"Yuan.Bentler"`, a Yuan-Bentler scaled test statistic is computed. If `"mean.var.adjusted"` or `"Satterthwaite"`, a mean and variance adjusted test statistic is computed. If `"scaled.shifted"`, an alternative mean and variance adjusted test statistic is computed (as in Mplus version 6 or higher). If `"boot"` or `"bootstrap"` or `"Bollen.Stine"`, the Bollen-Stine bootstrap is used to compute the bootstrap probability value of the test statistic. If `"default"`, the value depends on the values of other arguments.
`bootstrap`	Number of bootstrap draws, if bootstrapping is used.
`mimic`	If `"Mplus"`, an attempt is made to mimic the Mplus program. If `"EQS"`, an attempt is made to mimic the EQS program. If `"default"`, the value is (currently) set to `"lavaan"`, which is very close to `"Mplus"`.
`representation`	If `"LISREL"`, the classical LISREL matrix representation is used to represent the model (using the all-y variant).
`do.fit`	If `FALSE`, the model is not fit, and the current starting values of the model parameters are preserved.
`control`	A list containing control parameters passed to the optimizer. By default, lavaan uses `"nlminb"`. See the manpage of `nlminb` for an overview of the control parameters. A different optimizer can be chosen by setting the value of `optim.method`. For unconstrained optimization (the model syntax does not include any "==", ">" or "<" operators), the available options are `"nlminb"` (the default), `"BFGS"` and `"L-BFGS-B"`. See the manpage of the `optim` function for the control parameters of the latter two options. For constrained optimization, the only available option is `"nlminb.constr"`.
`WLS.V`	A user provided weight matrix to be used by estimator `"WLS"`; if the estimator is `"DWLS"`, only the diagonal of this matrix will be used. For a multiple group analysis, a list with a weight matrix for each group. The elements of the weight matrix should be in the following order (if all data is continuous): first the means (if a meanstructure is involved), then the lower triangular elements of the covariance matrix including the diagonal, ordered column by column. In the categorical case: first the thresholds (including the means for continuous variables), then the slopes (if any), the variances of continuous variables (if any), and finally the lower triangular elements of the correlation/covariance matrix excluding the diagonal, ordered column by column.
`NACOV`	A user provided matrix containing the elements of (N times) the asymptotic variance-covariance matrix of the sample statistics. For a multiple group analysis, a list with an asymptotic variance-covariance matrix for each group. See the `WLS.V` argument for information about the order of the elements.
`zero.add`	A numeric vector containing two values. These values affect the calculation of polychoric correlations when some frequencies in the bivariate table are zero. The first value applies only to 2x2 tables; the second applies to larger tables. The value is added to the zero frequency in the bivariate table. If `"default"`, the value is set depending on the `mimic` option. By default, lavaan uses `zero.add = c(0.5, 0.0)`.
`zero.keep.margins`	Logical. This argument only affects the computation of polychoric correlations for 2x2 tables with an empty cell, and where a value is added to the empty cell. If `TRUE`, the other values of the frequency table are adjusted so that all margins are unaffected. If `"default"`, the value is set depending on the `mimic` option. The default is `TRUE`.
`zero.cell.warn`	Logical. Only used if some observed endogenous variables are categorical. If `TRUE`, give a warning if one or more cells of a bivariate frequency table are empty.
`start`	If it is a character string, the two options are currently `"simple"` and `"Mplus"`. In the first case, all parameter values are set to zero, except the factor loadings (set to one), the variances of latent variables (set to 0.05), and the residual variances of observed variables (set to half the observed variance). If `"Mplus"`, we use a similar scheme, but the factor loadings are estimated using the fabin3 estimator (tsls) per factor. If `start` is a fitted object of class `lavaan`, the estimated values of the corresponding parameters will be extracted. If it is a model list, for example the output of the `parameterEstimates()` function, the values of the `est` or `start` or `ustart` column (whichever is found first) will be extracted.
`verbose`	If `TRUE`, the function value is printed out during each iteration.
`warn`	If `TRUE`, some (possibly harmless) warnings are printed out during the iterations.
`debug`	If `TRUE`, debugging information is printed out.
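
As a rough illustration of the `sample.cov` and `sample.nobs` arguments, the sketch below fits one model twice: once from raw data and once from summary statistics only. The three-factor model and the use of the `HolzingerSwineford1939` data set (shipped with lavaan) are choices made for this illustration and do not come from the text above. The covariance matrix returned by `cov()` is divided by N-1, so this also exercises the default rescaling described under `sample.cov`.

```r
library(lavaan)

# Nine observed test scores from the Holzinger-Swineford data (illustrative choice)
vars <- paste0("x", 1:9)
S <- cov(HolzingerSwineford1939[, vars])   # sample covariance, divided by N - 1
N <- nrow(HolzingerSwineford1939)          # number of observations

model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# Fit from the raw data frame
fit.raw <- sem(model, data = HolzingerSwineford1939)

# Fit from the sample moments only; no data frame is supplied
fit.cov <- sem(model, sample.cov = S, sample.nobs = N)

# Compare the free parameter estimates of the two fits
all.equal(coef(fit.raw), coef(fit.cov), tolerance = 1e-4)
```

Under the default settings the two fits are expected to agree up to optimizer tolerance, because lavaan rescales the supplied matrix by (N-1)/N before estimation.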

Details

The sem function is a wrapper for the more general lavaan function, using the following default arguments: int.ov.free = TRUE, int.lv.free = FALSE, auto.fix.first = TRUE (unless std.lv = TRUE), auto.fix.single = TRUE, auto.var = TRUE, auto.cov.lv.x = TRUE, auto.th = TRUE, auto.delta = TRUE, and auto.cov.y = TRUE.
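
The wrapper relationship can be checked informally by passing the defaults listed above to `lavaan()` by hand. The sketch below assumes a small model on the `PoliticalDemocracy` data (a reduced model chosen here only to keep the example short) and compares the resulting estimates with those from `sem()`.

```r
library(lavaan)

# A reduced model, used only for this comparison
model <- '
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem60 ~ ind60
'

# Fit through the convenience wrapper
fit.sem <- sem(model, data = PoliticalDemocracy)

# Fit through lavaan(), spelling out the defaults that sem() supplies
fit.lavaan <- lavaan(model, data = PoliticalDemocracy,
                     int.ov.free = TRUE, int.lv.free = FALSE,
                     auto.fix.first = TRUE, auto.fix.single = TRUE,
                     auto.var = TRUE, auto.cov.lv.x = TRUE,
                     auto.th = TRUE, auto.delta = TRUE,
                     auto.cov.y = TRUE)

# Compare the free parameter estimates of the two fits
all.equal(coef(fit.sem), coef(fit.lavaan))
```

Calling `lavaan()` directly is mainly useful when one of these automatic settings needs to be switched off; otherwise `sem()` is the shorter route.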

Value

An object of class lavaan, for which several methods are available, including a summary method.
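
A few of the methods and extractor functions that work on the returned object are sketched below. The two-factor model and the `HolzingerSwineford1939` data are assumptions made for the example; `coef()`, `fitMeasures()` and `parameterEstimates()` are existing lavaan functions, shown here as a small sample of what is available rather than a complete list.

```r
library(lavaan)

# A small two-factor model, chosen only for illustration
model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
'
fit <- sem(model, data = HolzingerSwineford1939)

class(fit)                                   # class of the fitted object
coef(fit)                                    # estimates of the free parameters
fitMeasures(fit, c("chisq", "df", "cfi", "rmsea"))  # selected fit indices
pe <- parameterEstimates(fit)                # estimates, standard errors, intervals
head(pe)
```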

References

Yves Rosseel (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1-36. URL http://www.jstatsoft.org/v48/i02/.

Examples

## The Industrialization and Political Democracy Example
## Bollen (1989), page 332
model <- ' 
  # latent variable definitions
     ind60 =~ x1 + x2 + x3
     dem60 =~ y1 + a*y2 + b*y3 + c*y4
     dem65 =~ y5 + a*y6 + b*y7 + c*y8

  # regressions
    dem60 ~ ind60
    dem65 ~ ind60 + dem60

  # residual correlations
    y1 ~~ y5
    y2 ~~ y4 + y6
    y3 ~~ y7
    y4 ~~ y8
    y6 ~~ y8
'

fit <- sem(model, data=PoliticalDemocracy)
summary(fit, fit.measures=TRUE)
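
The labels `a`, `b` and `c` in the model above appear on the loadings of both `dem60` and `dem65`, which constrains each pair of loadings to be equal. As a hedged follow-up sketch, the code below refits the same model and lists only the labelled loadings, so the shared estimates can be inspected side by side. Filtering on the `label` column of `parameterEstimates()` is one convenient way to do this, not the only one.

```r
library(lavaan)

# Same model as in the example above
model <- '
  # latent variable definitions
     ind60 =~ x1 + x2 + x3
     dem60 =~ y1 + a*y2 + b*y3 + c*y4
     dem65 =~ y5 + a*y6 + b*y7 + c*y8

  # regressions
    dem60 ~ ind60
    dem65 ~ ind60 + dem60

  # residual correlations
    y1 ~~ y5
    y2 ~~ y4 + y6
    y3 ~~ y7
    y4 ~~ y8
    y6 ~~ y8
'

fit <- sem(model, data = PoliticalDemocracy)
pe <- parameterEstimates(fit)

# Keep only the factor loadings that carry a label (a, b or c)
labelled <- pe[pe$op == "=~" & pe$label != "",
               c("lhs", "op", "rhs", "label", "est")]
labelled[order(labelled$label), ]
```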