R: Fitting function for multinomial response models with...
MRSP.fit
R Documentation
Fitting function for multinomial response models with structured penalties
Description
This function performs the actual fitting of multinomial response models with structured penalties.
Function MRSP is actually just a user-friendly wrapper that prepares calls to this function.
MRSP.fit is designed mostly for internal use and therefore not user-friendly, so that it is
highly recommended to use MRSP instead of calling MRSP.fit directly.
A list that contains the data in the format required by MRSP.fit. dat
is a list with elements 'y', 'x' and, possibly, 'V'. If nobs individual
observations are available, dat$y must be a matrix of dimension nobs x K.
For model=multinomlogit(), K is equal to the number of categories of the
response variable, for model=sequentiallogit(), it equals the number of response
categories minus 1. In other words: K here always refers to the column rank of the response matrix.
(Note that this is for notational convenience and in contrast to the documentation of MRSP,
where K always refers to the number of categories of the multicategorical response!)
The entries of dat$y are either 0 or 1, with dat$y[i,r]==1
indicating that class r is observed for the i-th observation. For model=multinomlogit(),
the rowSums of dat$y must be all 1; for model=sequentiallogit(), they must be
either 1 or 0; with a row of zeros indicating that the last category was observed.
dat$x is a matrix of dimension nobs x p which contains covariates whose
value is constant across classes. They are called 'global predictors/covariates' in the
following. In the context of discrete choice modeling, they are often referred to as
'individual-specific' predictors.
If available, covariates whose value varies from class to class can be included in an entry
dat$V. Such variables are called 'category-specific' in the following since their value
depends on the categories of the response variable. In the literature on discrete choice modeling,
they are often referred to as 'alternative-specific'.
These variables can either be equipped with global or with category-specific coefficients.
If a total of L category-specific variables shall be used, dat$V must be a list (!)
of length K whose elements each are matrices of dimension nobs x L.
coef.init
An optional coefficient object supplying initial coefficient values to be used. A list whose first entry is
a matrix of dimension K x p, with row r containing the coefficients for class r
and column j containing the coefficients for global predictor x_j. If category-specific
predictors are included, the second entry of coef.init is a matrix of dimension K x L
that contains the coefficients for those category-specific predictors.
coef.stand.init
Optional initial coefficient values for the standardized predictors. Same structure as coef.init.
coef.pretres.init
Optional initial coefficient values, prior to potential thresholding, for the standardized predictors. Same structure as coef.init.
offset
An optional vector or matrix of offset values to be used. Either length nobs or dimension
nobs x K.
weights
An optional vector of observation weights of length nobs.
grpindex
A list of one or two integer vectors that indicate which columns of the design matrix form a group that has
to be penalized jointly, e.g. the different dummies of a categorical predictor. The first element is the
grouping vector for x, the optional second one for V. Those columns with the same number belong to one group.
The numbers must begin with 1 and increase with every group. An example:
grpindex = list(c(1,2,3,3,4,4,4)) means that variables 1 and 2 form their own, 'scalar' group; variables 3 and 4 as well
as variables 5, 6 and 7 form multi-parameter-groups.
penindex
A list of one or two vectors which specifies the exact penalty type to use for each covariate. The first entry
specifies the penalty type for the variables in dat$x, the (optional) second entry those for the variables
in dat$V. The following penalty types are available:
1: global predictor ('x') whose coefficients shall be penalized with
a group lasso penalty with grouping 'across' categories, i.e. CATS Lasso (see Tutz, Poessnecker and Uhlmann, 2015).
10: global predictor, unpenalized.
11: global predictor, sparse group lasso.
12: global predictor, ordinary lasso.
13: global predictor, ridge penalty. does not support penweights.
2: category-specific predictor with global coefficient which is penalized with
the ordinary (group-)lasso. (depending on grpindex.)
20: category-specific, unpenalized.
21: category-specific, ridge penalty.
3: category-speficic predictor with category-specific coefficients
that are penalized by a group lasso like in '1'.
30: category-specific with category-specific coefs, unpenalized.
31: category-specific with category-specific coefficients and sparse
group lasso penalty.
32: cat-cat-specific, with ordinary lasso.
33: cat-cat, with ridge. does not support penweights.
4: global predictor with global effect, penalized. (cf. the '2'-series).
40: global predictor, global effect, unpenalized.
41: global predictor, global effect, ridge penalty.
The '4-series' only makes sense for ordinal models!
lambda
Optional object specifying the lambda values to be used as tuning parameter(s) for
the main variable selection penalty. Either a vector or a single numeric. If missing,
a suitable grid of lambda values is computed. See arguments nrlambda, lambdamin and
lambdamax in function MRSP.
lambdaR
Lambda(s) to be used for ridge penalties. Typically, if only a ridge penalty and no other
penalty is used, one can specify the Ridge lambda via argument lambda instead.
lambdaF
Lambda(s) to be used for fusion penalties. Not available yet for end-users, but included for
compatibility with future releases of MRSP.
gamma
See argument gamma in MRSP.
psi
See argument psi in MRSP.
indg
A vector of the column indices of the category-specific variables that are equipped with
global coefficients.
indcs
A vector of the column indices of the category-specific variables that are equipped with
category-specific coefficients
model
An object of class MRSP.model that specifies the model to
be used. Currently, model = multinomlogit() and
model = sequentiallogit() are available, yielding a multinomial or
sequential logit model, respectively. Cumulative logit models will be included
in future versions of MRSP.
constr
The identifiability constraint to be used. The coefficients of
predictors which do not vary over categories (i.e. global/individual-specific
predictors) are not identifiable in (unpenalized) multinomial logit models.
If constr is an integer in [1, K], the corresponding class is used
as reference, which means that the coefficients of global predictors
for this class are set to 0. If constr = "symmetric", a symmetric side
constraint is used, which means that all coefficients belonging to the same
global predictor sum to zero. If constr = "none", no constraint is used
for penalized parameter groups and identifiability is ensured by the penalty
term (see Friedman, Hastie and Tibshirani, 2010.)
For ordinal regression, constr must take value "none".
If left unspecified, a symmetric side constraint is used for multinomial and
no constraint for ordinal models.
control
An object of class MRSP.control that stores control
information. It's slots max.iter and rel.tol specify the max
number of iterations and the relative change in penalized log-likelihood
values that indicates convergence. The other slots should not be changed
unless by experienced users.
fista.control
An object of class fista.control that contains control information for
the FISTA algorithm that is internally used to compute numerical estimates. Not
intended for end-users!
Proximal.control, Proximal.args
Arguments to be passed to the proximal gradient algorithm. Not intended for
end-users!
penweights
An optional list containing weights for the various penalty terms of different
coefficients or coefficient groups. Assuming that category-specific covariates
are present, the first element of penweights is a list of length two, with the
first element of penweights[[1]] being a numeric of length p that
contains the weights for the CATS penalty on the group of coefficients of the
global covariates for the response classes. The second element of penweights[[1]]
is a numeric of length L with the group penalty weights for the
category-specific variables.
The second element of penweights is again a list of length two. The first element
of penweights[[2]] is a K x p matrix with penalty weights for
unstructured lasso penalties on atomic coefficients belonging to global
predictors. The second element of penweights[[2]] is a K x L
matrix with penalty weights for unstructured lasso penalties on category-specific
covariates.
mlfit
A list that contains information about the ML or 'pseudo-ML' coefficients of the
specified model. It must contain at least one entry called 'coef.stand' that has
the same structure as the coefficient object (see coef.init). The
value of those parameters must be known to compute the effective degrees of
freedom of penalized parameter estimates with grouped penalties.
adaptive
Should adaptive weights be used? Use adaptive="ML" to
obtain the traditional adaptive weights proposed in the literature. Using
adaptive = TRUE computes the penalized estimator with whatever penalty is
specified and no adaptive weights and computes adaptive weights from the output
of this penalized model. The final output is the computed with those
adaptive weights. It is strongly recommended to prefer adaptive="ML"
over adaptive=TRUE.
threshold
If TRUE, the coefficients will be thresholded with an
appropriate threshold value. You can also specify an explicit nonnegative
value to be used as the threshold.
refit
Should refitting be performed? If TRUE, the model is first fit
traditionally, and then refitted on the active set found by this first fit.
This can improve variable selection, but tends to be rather slow and time-consuming.
fusion
If fusion penalties are used, this specifies the type of fusion.
Not yet supported for end-users of MRSP, but included
for compatibility with future releases of MRSP.
nonneg
If TRUE, all coefficients are restricted to be nonnegative.
...
Further arguments or objects to be passed to MRSP.fit.
Details
This function does the actual work of fitting multinomial response models with
structured penalties. It is intended mainly for internal use. The main purpose
of function MRSP is to provide a user-friendly wrapper that prepares
and evaluates a call to MRSP.fit.
Value
Depending on nrlambda, either an object of class MRSP or of
class MRSP.list, which are lists of length
nrlambda whose elements are MRSP objects.