sparsenet
R Documentation
Fit a linear model regularized by the nonconvex MC+ sparsity penalty
Description
Sparsenet uses coordinate descent on the MC+ nonconvex penalty family,
and fits a surface of solutions over the two-dimensional parameter
space. This penalty family is indexed by an overall strength parameter lambda
(like the lasso) and a convexity parameter gamma. Gamma = infinity
corresponds to the lasso, and gamma = 1 to best subset.
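The shape of the penalty can be sketched in a few lines of base R. The helper name mcplus_penalty is hypothetical and not part of the package; the formula is the usual MC+ form (quadratic up to lambda*gamma, flat beyond it) and is assumed here rather than taken from this page.

```r
# MC+ penalty for coefficient values t, with lambda > 0 and gamma > 1.
# Hypothetical helper for illustration; not exported by sparsenet.
mcplus_penalty <- function(t, lambda, gamma) {
  a <- abs(t)
  ifelse(a < lambda * gamma,
         lambda * (a - a^2 / (2 * lambda * gamma)),  # concave region
         lambda^2 * gamma / 2)                       # flat region: no further penalty
}

t <- c(0, 0.5, 1, 2, 5)
pen_lasso <- mcplus_penalty(t, lambda = 1, gamma = Inf)  # reduces to lambda * |t|
pen_mcp   <- mcplus_penalty(t, lambda = 1, gamma = 1.5)  # levels off once |t| >= 1.5
```

With gamma = Inf the quadratic term vanishes and the lasso penalty is recovered; as gamma decreases toward 1 the penalty flattens sooner, which is what moves the fit toward best subset.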
weights
Observation weights; default is 1 for each observation.
exclude
Indices of variables to be excluded from the
model. Default is none.
dfmax
Limit the maximum number of variables in the
model. Useful for very large nvars, if a partial path is desired.
pmax
Limit the maximum number of variables ever to be nonzero
ngamma
Number of gamma values, if gamma not supplied; default is 9.
nlambda
Number of lambda values, if lambda not
supplied; default is 50
max.gamma
Largest gamma value to be used, apart from infinity
(lasso), if gamma not supplied; default is 150
min.gamma
Smallest value of gamma to use; it should be > 1.
Default is 1.000001.
lambda.min.ratio
Smallest value for lambda, as a fraction of
lambda.max, the (data derived) entry value (i.e. the smallest
value for which all coefficients are zero). The default depends on the
sample size nobs relative to the number of variables
nvars. If nobs > nvars, the default is 0.0001,
close to zero. If nobs < nvars, the default is 0.01.
A very small value of
lambda.min.ratio will lead to a saturated fit in the nobs <
nvars case.
lambda
A user-supplied lambda sequence, in decreasing order. Typical usage
is to have the
program compute its own lambda sequence based on
nlambda and lambda.min.ratio. Supplying a value of
lambda overrides this. WARNING: use with care. Do not supply
a single value for lambda (for predictions after CV use predict()
instead). Supply instead
a decreasing sequence of lambda values. sparsenet relies
on its warm starts for speed, and it is often faster to fit a whole
path than to compute a single fit.
gamma
Sparsity parameter vector, with 1 < gamma < infinity. Gamma = 1 corresponds to
best-subset regression, gamma = infinity to the lasso. Should be given in
decreasing order.
parms
An optional three-dimensional array: 2 x ngamma x nlambda.
Here the user can supply exactly the gamma, lambda pairs that are to
be traversed by the coordinate descent algorithm.
warm
How to traverse the grid. Default is "lambda", meaning warm starts from
the previous lambda with the same gamma. "gamma" means the opposite,
previous gamma for the same lambda. "both" tries both warm starts, and
uses the one that improves the criterion the most.
thresh
Convergence threshold for coordinate descent. Each
coordinate-descent loop continues until the maximum change in the
objective after any coefficient update is less than thresh
times the null RSS. The default value is 1E-5.
maxit
Maximum number of passes over the data for all lambda/gamma
values; default is 10^6.
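As a rough illustration of how the grid arguments above fit together, the following base R sketch builds a decreasing lambda sequence and a decreasing gamma sequence from the documented defaults. The helper names and the log spacing are assumptions made for illustration; the grids computed inside sparsenet may differ. The case nobs == nvars is not covered by the text above and is treated here like nobs > nvars.

```r
# Hypothetical helpers; not part of the sparsenet package.
make_lambda_seq <- function(lambda_max, nobs, nvars, nlambda = 50,
                            lambda.min.ratio = if (nobs < nvars) 0.01 else 1e-4) {
  # Log-spaced and decreasing, from lambda_max down to lambda_max * lambda.min.ratio
  exp(seq(log(lambda_max), log(lambda_max * lambda.min.ratio),
          length.out = nlambda))
}

make_gamma_seq <- function(ngamma = 9, max.gamma = 150, min.gamma = 1.000001) {
  # Infinity (the lasso) first, then log-spaced values decreasing to min.gamma
  c(Inf, exp(seq(log(max.gamma), log(min.gamma), length.out = ngamma - 1)))
}

lambda <- make_lambda_seq(lambda_max = 2, nobs = 100, nvars = 20)
gamma  <- make_gamma_seq()
```

Both vectors are in decreasing order, as required for the lambda and gamma arguments.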
Details
This algorithm operates like glmnet. Where the alpha parameter of glmnet
moves the penalty between lasso and ridge, here gamma moves it
between lasso and best subset.
The algorithm traverses the two-dimensional gamma/lambda array in a nested loop, with
decreasing gamma in the outer loop and decreasing lambda in the inner
loop. Because of the nature of the MC+ penalty, each coordinate update
is a convex problem with a simple two-threshold shrinking scheme:
if |beta| < lambda, set it to zero; if |beta| > lambda*gamma, leave it alone; if |beta|
is in between, shrink it proportionally. Note that this algorithm ALWAYS
standardizes the columns of x and y to have mean zero and variance 1
(using the 1/N averaging) before it computes its fit. The
coefficients reflect the original scale.
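The traversal and the two-threshold update described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the function names are hypothetical, the stopping rule uses the largest coefficient change rather than the objective-based rule documented for thresh, only lambda warm starts are used, and the returned coefficients stay on the standardized scale.

```r
# Two-threshold update for the MC+ penalty (hypothetical helper).
mcplus_threshold <- function(b, lambda, gamma) {
  a <- abs(b)
  ifelse(a <= lambda, 0,                                   # below lambda: zero
         ifelse(a <= lambda * gamma,
                sign(b) * (a - lambda) / (1 - 1 / gamma),  # in between: shrink
                b))                                        # above lambda*gamma: keep
}

# Center and scale to variance 1 using 1/N averaging.
standardize <- function(v) {
  centered <- v - mean(v)
  centered / sqrt(mean(centered^2))
}

fit_grid <- function(x, y, lambdas, gammas, maxit = 1000, tol = 1e-8) {
  xs <- apply(x, 2, standardize)
  ys <- standardize(y)
  n <- nrow(xs)
  p <- ncol(xs)
  out <- array(0, dim = c(p, length(lambdas), length(gammas)))
  for (g in seq_along(gammas)) {        # outer loop: decreasing gamma
    beta <- numeric(p)
    for (l in seq_along(lambdas)) {     # inner loop: decreasing lambda, warm started
      for (it in seq_len(maxit)) {
        max_change <- 0
        for (j in seq_len(p)) {
          # Partial residual with coordinate j removed from the fit
          r_j <- ys - xs[, -j, drop = FALSE] %*% beta[-j]
          b_new <- mcplus_threshold(sum(xs[, j] * r_j) / n, lambdas[l], gammas[g])
          max_change <- max(max_change, abs(b_new - beta[j]))
          beta[j] <- b_new
        }
        if (max_change < tol) break
      }
      out[, l, g] <- beta
    }
  }
  out
}

mcplus_threshold(c(0.5, -1.5, 4), lambda = 1, gamma = 2)  # 0 -1 4

set.seed(1)
x <- matrix(rnorm(50 * 5), 50, 5)
y <- 2 * x[, 1] + rnorm(50)
path <- fit_grid(x, y, lambdas = c(2, 0.05), gammas = c(Inf, 1.5))
```

Here path is a nvars x nlambda x ngamma array. With standardized data every univariate coefficient is at most 1 in absolute value, so lambda = 2 yields an all-zero solution for each gamma.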
Value
An object of class "sparsenet", with a number of
components. These are mostly accessed via generic
functions
such as coef(), plot(), and predict().
call
the call that produced this object
rsq
The percentage variance explained on the training data;
an ngamma x nlambda matrix.
jerr
error flag, for warnings and errors (largely for internal debugging).
coefficients
A coefficient list with ngamma elements; each of
these is a coefficient list with various components: the matrix beta
of coefficients, its dimension dim, the vector of intercepts, the lambda sequence, the gamma value, and the sequence
of df (number of nonzero coefficients) for each solution.
parms
Irrespective of how the parameters were input, the three-way
array of what was used.
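A short usage sketch of the returned object on simulated data follows. The data are made up for illustration, and the call is guarded so the code still runs when the sparsenet package is not installed. Only components named above are inspected.

```r
set.seed(1)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- x[, 1] - 2 * x[, 2] + rnorm(100)

fit <- NULL
if (requireNamespace("sparsenet", quietly = TRUE)) {
  fit <- sparsenet::sparsenet(x, y)   # default gamma and lambda grids
  print(dim(fit$rsq))                 # ngamma x nlambda
  print(length(fit$coefficients))     # one element per gamma value
  print(dim(fit$parms))               # three-way array of the grid actually used
  cf <- coef(fit)                     # generic accessor for the coefficients
}
```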