Last data update: 2014.03.03

R: Least-squares means (or predicted marginal means)
lsmeansR Documentation

Least-squares means (or predicted marginal means)

Description

Compute least-squares means (predicted marginal means) for specified factors or factor combinations in a linear model, and optionally comparisons or contrasts among them.

Usage

## S3 method for class 'character'
lsmeans(object, specs, ...)
## (used when 'specs' is 'character')

## S3 method for class 'character.ref.grid'
lsmeans(object, specs, by = NULL, 
    fac.reduce = function(coefs) apply(coefs, 2, mean), contr, 
    options = getOption("lsmeans")$lsmeans, weights, ...)
## (used when 'object' is a 'ref.grid' and 'specs' is 'character')
    
## S3 method for class 'list'
lsmeans(object, specs, ...)
## (used when 'specs' is a 'list')

## S3 method for class 'formula'
lsmeans(object, specs, contr.list, trend, ...)
## (used when 'specs' is a 'formula')

lstrends(model, specs, var, delta.var = 0.01 * rng, data, ...)

lsmobj(bhat, V, levels, linfct, df = NA, post.beta = matrix(NA), ...)

pmmeans(...)
pmtrends(...)
pmmobj(...)

Arguments

object

An object of class ref.grid; or a fitted model object that is supported, such as the result of a call to lm or lmer. Many fitted-model objects are supported; see link{models} for details.

specs

A character vector specifying the names of the predictors over which LS-means are desired. specs may also be a formula or a list (optionally named) of valid specs. Use of formulas is described in the Details section below.

by

A character vector specifying the names of predictors to condition on.

fac.reduce

A function that combines the rows of a matrix into a single vector. This implements the “marginal averaging” aspect of least-squares means. The default is the mean of the rows. Typically if it is overridden, it would be some kind of weighted mean of the rows. If fac.reduce is nonlinear, bizarre results are likely, and LS means will not be interpretable. If the weights argument is non-missing, fac.reduce is ignored.

contr

A list of contrast coefficients to apply to the least-squares means – or the root name of an .lsmc function that returns such coefficients. In addition, contr = "cld" is an alternative way to invoke the cld function. See contrast for more details on contrasts. NOTE: contr is ignored when specs is a formula.

contr.list

A named list of lists of contrast coefficients, as for contr. This is used only in the formula method; see Details below.

options

If non-NULL, a named list of arguments to pass to update, just after the object is constructed.

weights

Numeric vector, numeric matrix, or character string specifying weights to use in averaging predictions. If a vector, its length must equal the number of predictions to be averaged to obtain each least-squares mean. If a matrix, each row of the matrix is used in turn, wrapping back to the first row as needed. When in doubt about what is being averaged (or how many), first call with weights = "show.levels".)

If a string, it should partially match one of the following:

"equal"

Use an equally weighted average.

"proportional"

Weight in proportion to the frequencies (in the original data) of the factor combinations that are averaged over.

"outer"

Weight in proportion to each individual factor's marginal frequencies. Thus, the weights for a combination of factors are the outer product of the one-factor margins

"cells"

Weight according to the frequencies of the cells being averaged.

show.levels

This is a convenience feature for understanding what is being averaged over. Instead of a table of LS means, this causes the function to return a table showing the levels that are averaged over, in the order they appear.

Outer weights are like the 'expected' counts in a chi-square test of independence, and will yield the same results as those obtained by proportional averaging with one factor at a time. All except "cells" uses the same set of weights for each mean. In a model where the predicted values are the cell means, cell weights will yield the raw averages of the data for the factors involved. Note: If weights were used in fitting the model, then weight totals are used in place of frequencies in these schemes.

If weights is used, fac.reduce is ignored.

trend

Including this argument is an alternative way of calling lstrends with it as its var argument.

model

A supported model object.

var

Character giving the name of a variable with respect to which a difference quotient of the linear predictors is computed. In order for this to be useful, var should be a numeric predictor that interacts with at least one factor in specs. Then instead of computing least-squares means, we compute and compare the slopes of the var trend over levels of the specified other predictor(s). As in least-squares means, marginal averages are computed when some variables in the reference grid are excluded for the specification.

The user may specify some monotone function of one variable, e.g., var = "log(dose)". If so, the chain rule is applied. Note that, in this example, if model contains log(dose) as a predictor, we will be comparing the slopes estimated by that model, whereas specifying var = "dose" would perform a transformation of those slopes.

delta.var

The value of h to use in forming the difference quotient (f(x+h) - f(x))/h. Changing it (especially changing its sign) may be necessary to avoid numerical problems such as logs of negative numbers. The default value is 1/100 of the range of var over the dataset.

data

As in ref.grid, you may use this argument to supply the dataset used in fitting the model, for situations where it is not possible to reconstruct the data. Otherwise, leave it missing.

bhat

Numeric. Vector of regression coefficients.

V

Square matrix. Covariance matrix of bhat

levels

Named list or vector. Levels of factor(s) that define the estimates defined by linfct. If not a list, we assume one factor named "level"

linfct

Matrix. Linear functions of bhat for each combination of levels

df

Numeric or function with arguments x,dfargs). If a number, that is used for the degrees of freedom. If a function, it should return the degrees of freedom for sum(x*bhat); if additional parameters are needed, include them in ... as dfargs (not abbreviated).

post.beta

Matrix whose columns comprise a sample from the posterior distribution of the regression coefficients (so that typically, the column averages will be bhat). A 1 x 1 matrix of NA indicates that such a sample is unavailable.

...

Additional arguments passed to other methods or to ref.grid. For example, vcov. may be used to override the default covariance estimate, and some models allow additional options. Some models require data to be given explicitly. See the help pages for ref.grid and models. In addition, if the model formula contains references to variables that are not predictors, you must provide a params argument with a list of their names; see the example below for Oatsq.lm.

Details

Least-squares means (also called predicted marginal means) are predictions from a linear model over a reference grid, or marginal averages thereof. They have been popularized by SAS (SAS Institute, 2012). The ref.grid function identifies/creates the reference grid upon which lsmeans is based.

For those who prefer the term “predicted marginal means”, courtesy wrappers pmmeans, pmtrends, and pmmobj are provided that behave identically to those that start with ls, except that estimates are relabeled accordingly (e.g., lsmean becomes pmmean).

If specs is a formula, it should be of the form contr ~ specs | by. The formula is parsed and then used as the arguments contr, specs, and by as indicated. The left-hand side is optional, but if specified it should be the name of a contrast family (e.g., pairwise) or of a sub-list of contr.list. Operators like * or : are necessary to delineate names in the formulas, but otherwise are ignored.

A number of standard contrast families are provided. They can be identified as functions having names ending in .lsmc – use

ls("package:lsmeans", pat=".lsmc")

to list them. See the documentation for pairwise.lsmc and its siblings for details. You may write your own .lsmc function for custom contrasts.

The function lsmobj may be used to construct an object just like one returned by lsmeans from user-specified coefficients, covariance matrix, levels (or row labels), linear functions for each row, and degrees of freedom. After the object is constructed, it is updateed with any additional arguments in ....

Value

When specs is a character vector or one-sided formula, an object of class lsmobj. A number of methods are provided for further analysis, including summary, confint, test, contrast, pairs, and cld.

When specs is a list or a formula having a left-hand side, the eturn value is an lsm.list object, which is simply a list of lsmobj objects. Methods for lsm.list objects are the same as those for lsmobj, but they apply to only one member of the list, determined by its which argument.

Side effect: When object is a model, a reference grid is constructed and it is saved as .Last.ref.grid in the user's enironment (unless this is disabled via lsm.option(save.ref.grid = FALSE)). This makes it possible to check what reference grid was used, or to use it as the object in future lsmeans calls (and bypass reconstructing it). Similarly, lstrends also saves its reference grid (but for predicting difference quotients) in the same variable.

Note

If the model formula contains variables that are not predictors (e.g., degree of a polynomial, knots for a spline, etc.), you must add a params argument to the call

Note

While using specs as a two-sided formula or a list is a convenient way to get a lot of results with minimal effort, it can also create confusion when additional arguments are provided, because not all arguments may be applied to all the results produced (see examples). Thus, the safer route is to do things incrementally.

Note

lsmeans and its relatives can produce fatal errors or incorrect results with models containing splines (e.g., ns) and other smoothers because the required information to reconstruct their basis is not always available. A model with poly involving two or more predictors will almost always produce misleading results without any warning; but poly(..., raw = TRUE) will work correctly.

Note

For a ref.grid or lsmobj object created in lsmeans version 2.10 or earlier, the information needed by the weights argument is not present; so a message is displayed and averaging is done using fac.reduce.

Author(s)

Russell V. Lenth

References

SAS Institute Inc. (2012) Online documentation; Shared concepts; LSMEANS statement, http://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#statug_introcom_a0000003362.htm, accessed August 15, 2012.

See Also

ref.grid, .Last.ref.grid, models, pairwise.lsmc, glht, lsm.options

Examples

require(lsmeans)

### Covariance example (from Montgomery Design (8th ed.), p.656)
# Uses supplied dataset 'fiber'
fiber.lm <- lm(strength ~ diameter + machine, data = fiber)

# adjusted means and comparisons, treating machine C as control
( fiber.lsm <- lsmeans (fiber.lm, "machine") )
contrast(fiber.lsm, "trt.vs.ctrlk")
# Or get both at once using
#     lsmeans (fiber.lm, "machine", contr = "trt.vs.ctrlk")


### Factorial experiment
warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)
( warp.lsm <- lsmeans (warp.lm,  ~ wool | tension,
    options = list(estName = "pred.breaks")) )
pairs(warp.lsm) # remembers 'by' structure
contrast(warp.lsm, method = "poly", by = "wool")


### Unbalanced split-plot example ###
#-- The imbalance is imposed deliberately to illustrate that
#-- the variance estimates become biased
require(nlme)
Oats.lme <- lme(yield ~ factor(nitro) + Variety, 
    random = ~1 | Block/Variety, 
    subset = -c(1,2,3,5,8,13,21,34,55), data = Oats)
(Oats.anal <- lsmeans(Oats.lme, list(poly ~ nitro, pairwise ~ Variety)))


### Issues with lists of specs
test(Oats.anal)                # Uses 1st element by default
confint(Oats.anal, which = 4)  # or confint(Oats.anal[[4]])

# Using 'pmmeans' wrapper ...
pmmeans(warp.lm, ~ wool, 
    options = list(infer = c(TRUE, TRUE), null = 22, side = ">"))

### Weights
# See what's being averaged over in the above
lsmeans(Oats.lme, ~ nitro, cov.reduce = FALSE, weights = "show.levels")

# Give three times the weight to Marvellous
lsmeans(Oats.lme, ~ nitro, cov.reduce = FALSE, weights = c(1,3,1))


### Model with a quadratic trend for 'nitro'
# Also illustrates use of `params` argument to list non-predictors
deg = 2
Oatsq.lm <- lm(yield ~ Block + poly(nitro, deg) + Variety, data = Oats)
# Predictions at each unique 'nitro' value in the dataset
lsmeans(Oatsq.lm, ~ nitro, cov.reduce = FALSE, params = "deg")


### Trends
fiber.lm <- lm(strength ~ diameter*machine, data=fiber)
# Obtain slopes for each machine ...
( fiber.lst <- lstrends(fiber.lm, "machine", var="diameter") )
# ... and pairwise comparisons thereof
pairs(fiber.lst)

# Suppose we want trends relative to sqrt(diameter)...
lstrends(fiber.lm, ~ machine | diameter, var = "sqrt(diameter)", 
    at = list(diameter = c(20,30)))

# Given summary statistics for 4 cities computed elsewhere, 
# obtain multiple comparisons of their means using the
# Satterthwaite method
ybar <- c(47.6, 53.2, 88.9, 69.8)
s <-    c(12.1, 19.5, 22.8, 13.2)
n <-    c(44,   11,   37,   24)
se2 = s^2 / n
Satt.df <- function(x, dfargs)
    sum(x * dfargs$v)^2 / sum((x * dfargs$v)^2 / (dfargs$n - 1))
city.pmm <- pmmobj(bhat = ybar, V = diag(se2),
    levels = list(city = LETTERS[1:4]), linfct = diag(c(1,1,1,1)),
    df = Satt.df, dfargs = list(v = se2, n = n), estName = "mean")
city.pmm
contrast(city.pmm, "revpairwise")
    
    
# See also many other examples in documentation for 
# 'contrast', 'cld', 'glht', 'lsmip', 'ref.grid', 'MOats',
# 'nutrition', etc., and in the vignettes

Results