relaimpo-package
R Documentation
Package to calculate relative importance metrics for linear models
Description
relaimpo calculates several relative importance metrics for the linear model.
The recommended metrics are lmg (R^2 partitioned by averaging over orders, as in Lindeman, Merenda and Gold (1980, p. 119ff))
and pmvd (a newly proposed metric by Feldman (2005), non-US version only).
For completeness, several other metrics are also on offer. Other packages with related topics: hier.part, relimp.
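As an illustration of what "R^2 partitioned by averaging over orders" means, the following is a minimal base-R sketch, not the package's implementation. It enumerates all orderings of the regressors, records how much R^2 each regressor adds when it enters the model, and averages these increments. The helper names (r2_of, all_perms, lmg_sketch) are hypothetical, and the built-in swiss data set is used purely as an example.

```r
# Brute-force illustration of the lmg idea (hypothetical helpers, not relaimpo code).

# R^2 of a linear model with the given regressors (0 for the empty model).
r2_of <- function(data, response, vars) {
  if (length(vars) == 0) return(0)
  summary(lm(reformulate(vars, response = response), data = data))$r.squared
}

# All orderings of a vector, returned as a list.
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Average each regressor's sequential R^2 increment over all orderings.
lmg_sketch <- function(data, response, vars) {
  perms <- all_perms(vars)
  shares <- setNames(numeric(length(vars)), vars)
  for (p in perms) {
    for (k in seq_along(p)) {
      before <- p[seq_len(k - 1)]
      shares[p[k]] <- shares[p[k]] +
        r2_of(data, response, c(before, p[k])) - r2_of(data, response, before)
    }
  }
  shares / length(perms)
}

vars <- c("Agriculture", "Education", "Catholic")
lmg_shares <- lmg_sketch(swiss, "Fertility", vars)
full_r2 <- r2_of(swiss, "Fertility", vars)
print(lmg_shares)
print(full_r2)
```

By construction the shares add up to the R^2 of the full model, which is the sense in which R^2 is "partitioned".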
Details
relaimpo calculates the metrics and also offers the possibility of bootstrapping them and of displaying results in print and graphically.
It is possible to designate a subset of variables as adjustment variables that always stay in the model so that relative importance is only
assessed among the remaining variables.
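A typical session might look like the sketch below. It assumes that relaimpo is installed and uses the built-in swiss data set as an arbitrary example; the choice of regressors, the number of bootstrap replicates (b = 200) and the adjustment variable are illustrative assumptions, not recommendations.

```r
# Usage sketch; runs only if relaimpo is available.
if (requireNamespace("relaimpo", quietly = TRUE)) {
  library(relaimpo)

  fit <- lm(Fertility ~ Agriculture + Education + Catholic + Infant.Mortality,
            data = swiss)

  # lmg shares for all four regressors
  ri <- calc.relimp(fit, type = "lmg")
  print(ri)

  # Agriculture is treated as an adjustment variable that always stays in the
  # model; relative importance is assessed among the remaining regressors only
  ri_adj <- calc.relimp(fit, type = "lmg", always = "Agriculture")
  print(ri_adj)

  # Bootstrap the metric and summarize the bootstrap results
  bs <- boot.relimp(fit, b = 200, type = "lmg")
  print(booteval.relimp(bs))
}
```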
Models can have up to 2-way interactions that are treated hierarchically - i.e. an interaction is only allowed in a model that also contains all its main effects.
In case of interactions, only metric lmg can be used.
Observations with missing values are by default excluded from the analysis for most functions.
The function mianalyze.relimp allows conclusions to be drawn from a set of multiply imputed data sets.
This function is currently more restrictive than the rest of the package in terms of data types and models
that can be used (when summarizing the multiply imputed samples without calculating confidence intervals,
all possibilities available elsewhere are also applicable in mianalyze.relimp).
relaimpo accommodates complex survey designs by making use of the facilities in package survey.
Currently, interactions and calculated variables cannot be combined with using a complex survey design in bootstrapping functions.
Acknowledgment
This package uses as an internal function the function nchoosek from vsn, authored by Wolfgang Huber, available under LGPL.
Furthermore, it uses a modified version of the function carscore from care by Verena Zuber and Korbinian Strimmer.
Warning
lmg and pmvd are computer-intensive. Although they are calculated based on the
covariance matrix, which saves substantial computing time in comparison to carrying out actual regressions,
these methods still take quite long for problems with many regressors. Obviously,
this is particularly relevant in combination with bootstrapping.
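To illustrate why working from the covariance matrix avoids refitting regressions, the sketch below computes the R^2 of a sub-model directly from the covariance matrix and compares it with the value from lm. This is a simplified illustration under assumed names (r2_from_cov is a hypothetical helper), not the package's internal code. Because a calculation over all sub-models has to visit up to 2^p subsets of p regressors, the effort still grows quickly with the number of regressors.

```r
# R^2 of a sub-model computed from the covariance matrix alone
# (hypothetical helper, for illustration only).
vars <- c("Agriculture", "Education", "Catholic")
S <- cov(swiss[, c("Fertility", vars)])

r2_from_cov <- function(S, response, subset) {
  if (length(subset) == 0) return(0)
  s_xy <- S[subset, response, drop = FALSE]   # covariances of regressors with response
  s_xx <- S[subset, subset, drop = FALSE]     # covariance matrix of the regressors
  # explained variance divided by total variance of the response
  drop(crossprod(s_xy, solve(s_xx, s_xy))) / S[response, response]
}

r2_cov <- r2_from_cov(S, "Fertility", c("Education", "Catholic"))
r2_lm  <- summary(lm(Fertility ~ Education + Catholic, data = swiss))$r.squared
print(c(from_cov = r2_cov, from_lm = r2_lm))

# Number of sub-models for p regressors
n_subsets <- 2^length(vars)
```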
Note
There are two versions of this package. The version on CRAN is globally licensed under GPL version 2 (or later).
There is an extended version with the interesting additional metric pmvd that is licensed according to GPL version 2
under the geographical restriction "outside of the US" because of potential issues with US patent 6,640,204. This version can be obtained
from Ulrike Groemping's website (cf. references section). Whenever you load the package, a message tells you which version you are loading.
Author(s)
Ulrike Groemping, BHT Berlin
References
Chevan, A. and Sutherland, M. (1991) Hierarchical Partitioning. The American Statistician 45, 90–96.
Darlington, R.B. (1968) Multiple regression in psychological research and practice. Psychological Bulletin 69, 161–182.
Groemping, U. (2006) Relative Importance for Linear Regression in R: The Package relaimpo.
Journal of Statistical Software 17, Issue 1.
Downloadable at http://www.jstatsoft.org/v17/i01
Lindeman, R.H., Merenda, P.F. and Gold, R.Z. (1980) Introduction to Bivariate and Multivariate Analysis, Glenview IL: Scott, Foresman.