R Graphical Manual

Browse All

Last data update: 2014.03.03

R: Least-squares means (or predicted marginal means)

lsmeans

R Documentation

Least-squares means (or predicted marginal means)

Description

Compute least-squares means (predicted marginal means) for specified factors or factor combinations in a linear model, and optionally comparisons or contrasts among them.

Usage

## S3 method for class 'character'
lsmeans(object, specs, ...)
## (used when 'specs' is 'character')

## S3 method for class 'character.ref.grid'
lsmeans(object, specs, by = NULL, 
    fac.reduce = function(coefs) apply(coefs, 2, mean), contr, 
    options = getOption("lsmeans")$lsmeans, weights, ...)
## (used when 'object' is a 'ref.grid' and 'specs' is 'character')
    
## S3 method for class 'list'
lsmeans(object, specs, ...)
## (used when 'specs' is a 'list')

## S3 method for class 'formula'
lsmeans(object, specs, contr.list, trend, ...)
## (used when 'specs' is a 'formula')

lstrends(model, specs, var, delta.var = 0.01 * rng, data, ...)

lsmobj(bhat, V, levels, linfct, df = NA, post.beta = matrix(NA), ...)

pmmeans(...)
pmtrends(...)
pmmobj(...)

Arguments

`object`	An object of class `ref.grid`; or a fitted model object that is supported, such as the result of a call to `lm` or `lmer`. Many fitted-model objects are supported; see `link{models}` for details.
`specs`	A `character` vector specifying the names of the predictors over which LS-means are desired. `specs` may also be a `formula` or a `list` (optionally named) of valid `spec`s. Use of formulas is described in the Details section below.
`by`	A character vector specifying the names of predictors to condition on.
`fac.reduce`	A function that combines the rows of a matrix into a single vector. This implements the “marginal averaging” aspect of least-squares means. The default is the mean of the rows. Typically if it is overridden, it would be some kind of weighted mean of the rows. If `fac.reduce` is nonlinear, bizarre results are likely, and LS means will not be interpretable. If the `weights` argument is non-missing, `fac.reduce` is ignored.
`contr`	A `list` of contrast coefficients to apply to the least-squares means – or the root name of an `.lsmc` function that returns such coefficients. In addition, `contr = "cld"` is an alternative way to invoke the `cld` function. See `contrast` for more details on contrasts. NOTE: `contr` is ignored when `specs` is a formula.
`contr.list`	A named `list` of `list`s of contrast coefficients, as for `contr`. This is used only in the formula method; see Details below.
`options`	If non-`NULL`, a named `list` of arguments to pass to `update`, just after the object is constructed.
`weights`	Numeric vector, numeric matrix, or character string specifying weights to use in averaging predictions. If a vector, its length must equal the number of predictions to be averaged to obtain each least-squares mean. If a matrix, each row of the matrix is used in turn, wrapping back to the first row as needed. When in doubt about what is being averaged (or how many), first call with `weights = "show.levels"`.) If a string, it should partially match one of the following: `"equal"` Use an equally weighted average. `"proportional"` Weight in proportion to the frequencies (in the original data) of the factor combinations that are averaged over. `"outer"` Weight in proportion to each individual factor's marginal frequencies. Thus, the weights for a combination of factors are the outer product of the one-factor margins `"cells"` Weight according to the frequencies of the cells being averaged. `show.levels` This is a convenience feature for understanding what is being averaged over. Instead of a table of LS means, this causes the function to return a table showing the levels that are averaged over, in the order they appear. Outer weights are like the 'expected' counts in a chi-square test of independence, and will yield the same results as those obtained by proportional averaging with one factor at a time. All except `"cells"` uses the same set of weights for each mean. In a model where the predicted values are the cell means, cell weights will yield the raw averages of the data for the factors involved. Note: If weights were used in fitting the model, then weight totals are used in place of frequencies in these schemes. If `weights` is used, `fac.reduce` is ignored.
`trend`	Including this argument is an alternative way of calling `lstrends` with it as its `var` argument.
`model`	A supported model object.
`var`	Character giving the name of a variable with respect to which a difference quotient of the linear predictors is computed. In order for this to be useful, `var` should be a numeric predictor that interacts with at least one factor in `specs`. Then instead of computing least-squares means, we compute and compare the slopes of the `var` trend over levels of the specified other predictor(s). As in least-squares means, marginal averages are computed when some variables in the reference grid are excluded for the specification. The user may specify some monotone function of one variable, e.g., `var = "log(dose)"`. If so, the chain rule is applied. Note that, in this example, if `model` contains `log(dose)` as a predictor, we will be comparing the slopes estimated by that model, whereas specifying `var = "dose"` would perform a transformation of those slopes.
`delta.var`	The value of h to use in forming the difference quotient (f(x+h) - f(x))/h. Changing it (especially changing its sign) may be necessary to avoid numerical problems such as logs of negative numbers. The default value is 1/100 of the range of `var` over the dataset.
`data`	As in `ref.grid`, you may use this argument to supply the dataset used in fitting the model, for situations where it is not possible to reconstruct the data. Otherwise, leave it missing.
`bhat`	Numeric. Vector of regression coefficients.
`V`	Square matrix. Covariance matrix of `bhat`
`levels`	Named list or vector. Levels of factor(s) that define the estimates defined by `linfct`. If not a list, we assume one factor named `"level"`
`linfct`	Matrix. Linear functions of `bhat` for each combination of `levels`
`df`	Numeric or function with arguments `x,dfargs)`. If a number, that is used for the degrees of freedom. If a function, it should return the degrees of freedom for `sum(x*bhat)`; if additional parameters are needed, include them in `...` as `dfargs` (not abbreviated).
`post.beta`	Matrix whose columns comprise a sample from the posterior distribution of the regression coefficients (so that typically, the column averages will be `bhat`). A 1 x 1 matrix of `NA` indicates that such a sample is unavailable.
`...`	Additional arguments passed to other methods or to `ref.grid`. For example, `vcov.` may be used to override the default covariance estimate, and some models allow additional options. Some models require `data` to be given explicitly. See the help pages for `ref.grid` and models. In addition, if the model formula contains references to variables that are not predictors, you must provide a `params` argument with a list of their names; see the example below for `Oatsq.lm`.

Details

Least-squares means (also called predicted marginal means) are predictions from a linear model over a reference grid, or marginal averages thereof. They have been popularized by SAS (SAS Institute, 2012). The ref.grid function identifies/creates the reference grid upon which lsmeans is based.

For those who prefer the term “predicted marginal means”, courtesy wrappers pmmeans, pmtrends, and pmmobj are provided that behave identically to those that start with ls, except that estimates are relabeled accordingly (e.g., lsmean becomes pmmean).

If specs is a formula, it should be of the form contr ~ specs | by. The formula is parsed and then used as the arguments contr, specs, and by as indicated. The left-hand side is optional, but if specified it should be the name of a contrast family (e.g., pairwise) or of a sub-list of contr.list. Operators like * or : are necessary to delineate names in the formulas, but otherwise are ignored.

A number of standard contrast families are provided. They can be identified as functions having names ending in .lsmc – use

ls("package:lsmeans", pat=".lsmc")

to list them. See the documentation for pairwise.lsmc and its siblings for details. You may write your own .lsmc function for custom contrasts.

The function lsmobj may be used to construct an object just like one returned by lsmeans from user-specified coefficients, covariance matrix, levels (or row labels), linear functions for each row, and degrees of freedom. After the object is constructed, it is updateed with any additional arguments in ....

Value

When specs is a character vector or one-sided formula, an object of class lsmobj. A number of methods are provided for further analysis, including summary, confint, test, contrast, pairs, and cld.

When specs is a list or a formula having a left-hand side, the eturn value is an lsm.list object, which is simply a list of lsmobj objects. Methods for lsm.list objects are the same as those for lsmobj, but they apply to only one member of the list, determined by its which argument.

Side effect: When object is a model, a reference grid is constructed and it is saved as .Last.ref.grid in the user's enironment (unless this is disabled via lsm.option(save.ref.grid = FALSE)). This makes it possible to check what reference grid was used, or to use it as the object in future lsmeans calls (and bypass reconstructing it). Similarly, lstrends also saves its reference grid (but for predicting difference quotients) in the same variable.

Note

If the model formula contains variables that are not predictors (e.g., degree of a polynomial, knots for a spline, etc.), you must add a params argument to the call

Note

While using specs as a two-sided formula or a list is a convenient way to get a lot of results with minimal effort, it can also create confusion when additional arguments are provided, because not all arguments may be applied to all the results produced (see examples). Thus, the safer route is to do things incrementally.

Note

lsmeans and its relatives can produce fatal errors or incorrect results with models containing splines (e.g., ns) and other smoothers because the required information to reconstruct their basis is not always available. A model with poly involving two or more predictors will almost always produce misleading results without any warning; but poly(..., raw = TRUE) will work correctly.

Note

For a ref.grid or lsmobj object created in lsmeans version 2.10 or earlier, the information needed by the weights argument is not present; so a message is displayed and averaging is done using fac.reduce.

Author(s)

Russell V. Lenth

References

SAS Institute Inc. (2012) Online documentation; Shared concepts; LSMEANS statement, http://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#statug_introcom_a0000003362.htm, accessed August 15, 2012.

Examples

require(lsmeans)

### Covariance example (from Montgomery Design (8th ed.), p.656)
# Uses supplied dataset 'fiber'
fiber.lm <- lm(strength ~ diameter + machine, data = fiber)

# adjusted means and comparisons, treating machine C as control
( fiber.lsm <- lsmeans (fiber.lm, "machine") )
contrast(fiber.lsm, "trt.vs.ctrlk")
# Or get both at once using
#     lsmeans (fiber.lm, "machine", contr = "trt.vs.ctrlk")


### Factorial experiment
warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)
( warp.lsm <- lsmeans (warp.lm,  ~ wool | tension,
    options = list(estName = "pred.breaks")) )
pairs(warp.lsm) # remembers 'by' structure
contrast(warp.lsm, method = "poly", by = "wool")


### Unbalanced split-plot example ###
#-- The imbalance is imposed deliberately to illustrate that
#-- the variance estimates become biased
require(nlme)
Oats.lme <- lme(yield ~ factor(nitro) + Variety, 
    random = ~1 | Block/Variety, 
    subset = -c(1,2,3,5,8,13,21,34,55), data = Oats)
(Oats.anal <- lsmeans(Oats.lme, list(poly ~ nitro, pairwise ~ Variety)))


### Issues with lists of specs
test(Oats.anal)                # Uses 1st element by default
confint(Oats.anal, which = 4)  # or confint(Oats.anal[[4]])

# Using 'pmmeans' wrapper ...
pmmeans(warp.lm, ~ wool, 
    options = list(infer = c(TRUE, TRUE), null = 22, side = ">"))

### Weights
# See what's being averaged over in the above
lsmeans(Oats.lme, ~ nitro, cov.reduce = FALSE, weights = "show.levels")

# Give three times the weight to Marvellous
lsmeans(Oats.lme, ~ nitro, cov.reduce = FALSE, weights = c(1,3,1))


### Model with a quadratic trend for 'nitro'
# Also illustrates use of `params` argument to list non-predictors
deg = 2
Oatsq.lm <- lm(yield ~ Block + poly(nitro, deg) + Variety, data = Oats)
# Predictions at each unique 'nitro' value in the dataset
lsmeans(Oatsq.lm, ~ nitro, cov.reduce = FALSE, params = "deg")


### Trends
fiber.lm <- lm(strength ~ diameter*machine, data=fiber)
# Obtain slopes for each machine ...
( fiber.lst <- lstrends(fiber.lm, "machine", var="diameter") )
# ... and pairwise comparisons thereof
pairs(fiber.lst)

# Suppose we want trends relative to sqrt(diameter)...
lstrends(fiber.lm, ~ machine | diameter, var = "sqrt(diameter)", 
    at = list(diameter = c(20,30)))

# Given summary statistics for 4 cities computed elsewhere, 
# obtain multiple comparisons of their means using the
# Satterthwaite method
ybar <- c(47.6, 53.2, 88.9, 69.8)
s <-    c(12.1, 19.5, 22.8, 13.2)
n <-    c(44,   11,   37,   24)
se2 = s^2 / n
Satt.df <- function(x, dfargs)
    sum(x * dfargs$v)^2 / sum((x * dfargs$v)^2 / (dfargs$n - 1))
city.pmm <- pmmobj(bhat = ybar, V = diag(se2),
    levels = list(city = LETTERS[1:4]), linfct = diag(c(1,1,1,1)),
    df = Satt.df, dfargs = list(v = se2, n = n), estName = "mean")
city.pmm
contrast(city.pmm, "revpairwise")
    
    
# See also many other examples in documentation for 
# 'contrast', 'cld', 'glht', 'lsmip', 'ref.grid', 'MOats',
# 'nutrition', etc., and in the vignettes

Least-squares means (or predicted marginal means)

Description

Usage

Arguments

Details

Value

Note

Note

Note

Note

Author(s)

References

See Also

Examples

Results