Fits cumulative link models (CLMs) such as the proportional odds
model. The model allows for various link functions and structured
thresholds that restrict the thresholds or cut-points to be, e.g.,
equidistant or symmetrically arranged around the central
threshold(s). Nominal effects (partial proportional odds with the
logit link) are also allowed.
A modified Newton algorithm is used to optimize the likelihood function.
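For orientation, the model class can be written compactly as below. This is only a sketch in common CLM notation: the symbols are chosen here for illustration and are not taken from this page, with theta_j standing for the thresholds, beta for the regression parameters, zeta for the scale parameters and F for the inverse link (the distribution function of the latent variable).

```latex
% Cumulative link model for an ordinal response Y_i with J ordered categories.
% x_i and z_i are the (assumed) covariate vectors of the location and scale formulas.
P(Y_i \le j) = F\!\left( \frac{\theta_j - x_i^{\top}\beta}{\exp(z_i^{\top}\zeta)} \right),
\qquad j = 1, \dots, J - 1,
\qquad \theta_1 \le \theta_2 \le \dots \le \theta_{J-1}
```

With the logit link and no scale effects this reduces to logit P(Y_i <= j) = theta_j - x_i' beta, which is the proportional odds model.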
formula
a formula expression as for regression models, of the form
response ~ predictors. The response should be a factor
(preferably an ordered factor), which will be interpreted as an
ordinal response with levels ordered as in the factor.
The model must have an intercept: attempts to remove one will
lead to a warning and will be ignored. An offset may be used. See the
documentation of formula for other details.
scale
an optional formula expression, of the form
~ predictors, i.e. with an empty left hand side.
An offset may be used. Variables included here will have
multiplicative effects and can be interpreted as effects on the
scale (or dispersion) of a latent distribution.
nominal
an optional formula of the form ~ predictors, i.e. with an
empty left hand side. The effects of the predictors in this formula
are assumed to be nominal rather than ordinal; with the logit link
this corresponds to the so-called partial proportional odds model.
data
an optional data frame in which to interpret the variables occurring
in the formulas.
weights
optional case weights in fitting. Defaults to 1. Negative weights
are not allowed.
start
initial values for the parameters in the format
c(alpha, beta, zeta), where alpha are the threshold
parameters (adjusted for potential nominal effects), beta are the
regression parameters and zeta are the scale parameters.
subset
expression saying which subset of the rows of the data should be used
in the fit. All observations are included by default.
doFit
logical; should the model be fitted (TRUE), or should the model
environment be returned without fitting (FALSE)?
na.action
a function to filter missing data. Applies to terms in all three
formulae.
contrasts
a list of contrasts to be used for some or all of
the factors appearing as variables in the model formula.
model
logical for whether the model frame should be part of the returned
object.
control
a list of control parameters passed on to
clm.control.
link
link function, i.e., the type of location-scale distribution
assumed for the latent distribution. The default "logit" link
gives the proportional odds model.
threshold
specifies a potential structure for the thresholds
(cut-points). "flexible" provides the standard unstructured
thresholds, "symmetric" restricts the distance between the
thresholds to be symmetric around the central one or two thresholds
for odd or even numbers of thresholds respectively,
"symmetric2" restricts the latent
mean in the reference group to zero; this means that the central
threshold (even number of response levels) is zero, or that the two
central thresholds are equal apart from their sign (odd number of
response levels), and
"equidistant" restricts the distance between consecutive
thresholds to be of the same size.
...
additional arguments are passed on to clm.control.
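As a minimal sketch of how the scale, nominal and threshold arguments described above might be used, the following fits a few variants of one model to the wine data that ships with the ordinal package. The object names (fm.loc, fm.sca, fm.nom, fm.equi) are illustrative only, and the choice of which predictor enters which formula is an assumption made for the example.

```r
library(ordinal)  # assumed to be installed; provides clm() and the wine data

## Location-only model used as the baseline for comparisons
fm.loc <- clm(rating ~ temp + contact, data = wine)

## Scale effects: temp may also change the spread of the latent distribution
fm.sca <- clm(rating ~ temp + contact, scale = ~ temp, data = wine)

## Nominal effects: contact gets its own set of thresholds
## (partial proportional odds, since the default link is "logit")
fm.nom <- clm(rating ~ temp, nominal = ~ contact, data = wine)

## Structured thresholds: equal spacing between consecutive cut-points
fm.equi <- clm(rating ~ temp + contact, threshold = "equidistant",
               data = wine)

## Likelihood ratio tests of each extension or restriction
anova(fm.loc, fm.sca)   # is the scale effect of temp needed?
anova(fm.loc, fm.nom)   # does contact violate proportional odds?
anova(fm.equi, fm.loc)  # are equidistant thresholds adequate?
```

Each pair passed to anova is nested, so the tests compare a restricted model against a more general one.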
Details
This is a new (as of August 2011) improved implementation of CLMs. The
old implementation is available in clm2, but will
probably be removed at some point.
There are methods for the standard model-fitting functions, including
summary,
anova,
model.frame,
model.matrix,
drop1,
dropterm,
step,
stepAIC,
extractAIC,
AIC,
coef,
nobs,
profile,
confint,
vcov and
slice.
Value
If doFit = FALSE the result is an environment
representing the model ready to be optimized.
If doFit = TRUE the result is an
object of class "clm" with the components listed below.
Note that some components are only present if scale and
nominal are used.
aliased
list of length 3 or less with components alpha,
beta and zeta each being logical vectors containing
alias information for the parameters of the same names.
alpha
a vector of threshold parameters.
alpha.mat
(where relevant) a table (data.frame) of
threshold parameters where each row corresponds to an effect in the
nominal formula.
beta
(where relevant) a vector of regression parameters.
call
the matched call.
coefficients
a vector of coefficients of the form
c(alpha, beta, zeta)
cond.H
condition number of the Hessian matrix at the optimum
(i.e. the ratio of the largest to the smallest eigenvalue).
contrasts
(where relevant) the contrasts used for the
formula part of the model.
control
List of control parameters as generated by clm.control.
convergence
convergence code where 0 indicates successful
convergence; 1 indicates the iteration limit was reached before
convergence; 2 indicates the step factor was reduced below minimum
before convergence was reached; 3 indicates that thresholds are not
increasing (only possible with nominal effects).
edf
the estimated degrees of freedom, i.e., the number of
parameters in the model fit.
fitted.values
the fitted probabilities.
gradient
a vector of gradients for the coefficients at the
estimated optimum.
Hessian
the Hessian matrix for the parameters at the estimated
optimum.
info
a table of basic model information for printing.
link
character, the link function used.
logLik
the value of the log-likelihood at the estimated
optimum.
maxGradient
the maximum absolute gradient, i.e.,
max(abs(gradient)).
model
if requested (the default), the
model.frame containing variables from formula,
scale and nominal parts.
n
the number of observations counted as nrow(X), where
X is the design matrix.
na.action
(where relevant) information returned by
model.frame on the special handling of NAs.
nobs
the number of observations counted as sum(weights).
nom.contrasts
(where relevant) the contrasts used for the
nominal part of the model.
nom.terms
(where relevant) the terms object for the
nominal part.
nom.xlevels
(where relevant) a record of the levels of the
factors used in fitting for the nominal part.
start
the parameter values at which the optimization has
started. An attribute start.iter gives the number of
iterations to obtain starting values for models where scale
is specified or where the cauchit link is chosen.
S.contrasts
(where relevant) the contrasts used for the
scale part of the model.
S.terms
(where relevant) the terms object for the scale
part.
S.xlevels
(where relevant) a record of the levels of the
factors used in fitting for the scale part.
terms
the terms object for the formula part.
Theta
(where relevant) a table (data.frame) of
thresholds for all combinations of levels of factors in the
nominal formula.
threshold
character, the threshold structure used.
tJac
the transpose of the Jacobian for the threshold structure.
xlevels
(where relevant) a record of the levels of the factors
used in fitting for the formula part.
y.levels
the levels of the response variable after removing
levels for which all weights are zero.
zeta
(where relevant) a vector of scale regression parameters.
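The components listed above can be inspected directly on a fitted object. The following is a small sketch using the wine data from the ordinal package; the object names fm and rho are illustrative, and which diagnostics are worth checking first is a judgment call rather than a rule from this page.

```r
library(ordinal)  # assumed to be installed; provides clm() and the wine data

fm <- clm(rating ~ temp + contact, data = wine)

## Convergence diagnostics
fm$convergence   # convergence information; per the list above, code 0 is success
fm$maxGradient   # should be close to zero at the optimum
fm$cond.H        # very large values can indicate an ill-defined model

## Parameter estimates: thresholds, regression parameters and their combination
fm$alpha
fm$beta
all.equal(unname(coef(fm)), unname(c(fm$alpha, fm$beta)))

## Model size and the two observation counts
fm$edf
c(n = fm$n, nobs = fm$nobs)

## With doFit = FALSE the model is set up but not optimized,
## and an environment is returned instead of a "clm" object
rho <- clm(rating ~ temp + contact, data = wine, doFit = FALSE)
is.environment(rho)
```

Here n and nobs agree because no weights are supplied; with frequency weights, nobs is the sum of the weights while n counts rows of the design matrix.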
Author(s)
Rune Haubo B Christensen
Examples
library(ordinal) ## provides clm and the wine and soup data
fm1 <- clm(rating ~ temp * contact, data = wine)
fm1 ## print method
summary(fm1)
fm2 <- update(fm1, ~.-temp:contact)
anova(fm1, fm2)
drop1(fm1, test = "Chi")
add1(fm1, ~.+judge, test = "Chi")
fm2 <- step(fm1)
summary(fm2)
coef(fm1)
vcov(fm1)
AIC(fm1)
extractAIC(fm1)
logLik(fm1)
fitted(fm1)
confint(fm1) ## type = "profile"
confint(fm1, type = "Wald")
pr1 <- profile(fm1)
confint(pr1)
## plotting the profiles:
par(mfrow = c(2, 2))
plot(pr1, root = TRUE) ## check for linearity
par(mfrow = c(2, 2))
plot(pr1)
par(mfrow = c(2, 2))
plot(pr1, approx = TRUE)
par(mfrow = c(2, 2))
plot(pr1, Log = TRUE)
par(mfrow = c(2, 2))
plot(pr1, Log = TRUE, relative = FALSE)
## other link functions:
fm4.lgt <- update(fm1, link = "logit") ## default
fm4.prt <- update(fm1, link = "probit")
fm4.ll <- update(fm1, link = "loglog")
fm4.cll <- update(fm1, link = "cloglog")
fm4.cct <- update(fm1, link = "cauchit")
anova(fm4.lgt, fm4.prt, fm4.ll, fm4.cll, fm4.cct)
## structured thresholds:
fm5 <- update(fm1, threshold = "symmetric")
fm6 <- update(fm1, threshold = "equidistant")
anova(fm1, fm5, fm6)
## the slice methods:
slice.fm1 <- slice(fm1)
par(mfrow = c(3, 3))
plot(slice.fm1)
## see more at '?slice.clm'
## Another example:
fm.soup <- clm(SURENESS ~ PRODID, data = soup)
summary(fm.soup)
if(require(MASS)) { ## dropterm, addterm, stepAIC, housing
  fm1 <- clm(rating ~ temp * contact, data = wine)
  dropterm(fm1, test = "Chi")
  addterm(fm1, ~.+judge, test = "Chi")
  fm3 <- stepAIC(fm1)
  summary(fm3)
  ## Example from MASS::polr:
  fm1 <- clm(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
  summary(fm1)
}