An object of class (rfsrc, grow) or
(rfsrc, forest). Requires forest=TRUE in the
original rfsrc call.
xvar.names
Names of the x-variables to be used. If not
specified, all variables are used.
outcome.target
Character vector for multivariate families
specifying the target outcomes to be used. The default is to use all
coordinates.
importance
Type of VIMP, for example "permute" (the default), "random" or
"permute.ensemble". See Details.
joint
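Logical. If TRUE, joint VIMP is calculated for the variables as a
group; otherwise VIMP is calculated for each variable individually.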
subset
Vector indicating which rows of the grow data to
restrict the VIMP calculation to. Note that the vector
should correspond to the rows of object$xvar and not the
original data passed in the grow call. All rows are used if not
specified.
seed
Negative integer specifying seed for the random number
generator.
do.trace
Number of seconds between updates to the user on
approximate time to completion.
...
Further arguments passed to or from other methods.
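The distinction drawn for subset matters whenever the grow call drops rows. As a minimal sketch, assuming the randomForestSRC package is installed and that incomplete records are removed by default when the forest is grown (airquality contains missing values), the row index is built from object$xvar rather than from the original data frame. The object names and the Temp > 80 cutoff are illustrative only.

```r
library(randomForestSRC)

# airquality has missing values, so the forest is grown on fewer rows
# than the original data frame (assumes incomplete records are dropped)
airq.obj <- rfsrc(Ozone ~ ., data = airquality)
print(c(original = nrow(airquality), grow = nrow(airq.obj$xvar)))

# build the row index from object$xvar, not from airquality
hot.days <- which(airq.obj$xvar$Temp > 80)

# VIMP restricted to the selected grow-data rows
vimp.hot <- vimp(airq.obj, subset = hot.days)
print(vimp.hot$importance)
```

Indexing airquality directly would produce row positions that no longer line up with the grow data once incomplete records have been removed.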
Details
Calculates VIMP for the variables xvar.names using a previously
grown forest. By default, VIMP is calculated for the original
data, but the user can supply test data for the VIMP
calculation using newdata. Depending on the option
importance, VIMP is calculated either by random daughter
assignment or by random permutation of the variable(s). The default
is Breiman-Cutler permutation VIMP. See rfsrc for more
details.
Joint VIMP is requested using joint. The joint VIMP is the
importance for a group of variables when the group is perturbed
simultaneously.
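To make the difference between individual and joint perturbation concrete, the following base R sketch uses a linear model as a stand-in for a forest and held-out rows as a stand-in for out-of-bag data. It is not the Breiman-Cutler procedure that vimp implements. The helper perm.importance and all object names are hypothetical, and the choice to permute the grouped columns with one shared row permutation is an assumption made for illustration.

```r
# complete cases only, so that lm() and predict() see the same rows
dat <- na.omit(airquality)
set.seed(1)
train <- sample(nrow(dat), size = 80)
fit <- lm(Ozone ~ ., data = dat[train, ])
test <- dat[-train, ]

# mean squared prediction error on a data frame
mse <- function(d) mean((d$Ozone - predict(fit, newdata = d))^2)

# hypothetical helper: average increase in test error after permuting
# the rows of the columns in `vars` (all columns share one permutation)
perm.importance <- function(vars, nrep = 50) {
  base <- mse(test)
  mean(replicate(nrep, {
    perturbed <- test
    perturbed[vars] <- test[sample(nrow(test)), vars, drop = FALSE]
    mse(perturbed) - base
  }))
}

# individual importance: one variable perturbed at a time
xvars <- setdiff(names(dat), "Ozone")
individual <- sapply(xvars, perm.importance)

# joint importance: the pair is perturbed simultaneously
joint <- perm.importance(c("Wind", "Temp"))

print(individual)
print(joint)
```

The joint value is a single number for the group and need not equal the sum of the individual values, since the variables may carry overlapping information.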
Value
An object of class (rfsrc, predict), which is a list with
the following key components:
err.rate
OOB error rate for the ensemble restricted to the
subsetted data.
importance
Variable importance (VIMP).
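As a brief sketch of how these components might be inspected, assuming the randomForestSRC package is installed and that incomplete airquality records are dropped when the forest is grown, the returned list can be examined as follows. The object names are illustrative.

```r
library(randomForestSRC)

airq.obj <- rfsrc(Ozone ~ ., data = airquality)
airq.vimp <- vimp(airq.obj)

# class of the returned object
print(class(airq.vimp))

# VIMP: for regression, one value per x-variable
print(airq.vimp$importance)

# error rate component; check its structure before relying on its layout
str(airq.vimp$err.rate)
```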
Author(s)
Hemant Ishwaran and Udaya B. Kogalur
References
Ishwaran H. (2007). Variable importance in binary regression
trees and forests, Electronic J. Statist., 1:519-537.
See Also
rfsrc
Examples
## Not run:
## ------------------------------------------------------------
## classification example
## showcase different vimp
## ------------------------------------------------------------
iris.obj <- rfsrc(Species ~ ., data = iris)
# Breiman-Cutler permutation vimp
print(vimp(iris.obj)$importance)
# Breiman-Cutler random daughter vimp
print(vimp(iris.obj, importance = "random")$importance)
# Breiman-Cutler joint permutation vimp
print(vimp(iris.obj, joint = TRUE)$importance)
# Breiman-Cutler paired vimp
print(vimp(iris.obj, c("Petal.Length", "Petal.Width"), joint = TRUE)$importance)
print(vimp(iris.obj, c("Sepal.Length", "Petal.Width"), joint = TRUE)$importance)
## ------------------------------------------------------------
## regression example
## compare Breiman-Cutler vimp to ensemble based vimp
## ------------------------------------------------------------
airq.obj <- rfsrc(Ozone ~ ., airquality)
vimp.all <- cbind(
  ensemble = vimp(airq.obj, importance = "permute.ensemble")$importance,
  breimanCutler = vimp(airq.obj, importance = "permute")$importance)
print(vimp.all)
## ------------------------------------------------------------
## regression example
## calculate VIMP on test data
## ------------------------------------------------------------
set.seed(100080)
train <- sample(1:nrow(airquality), size = 80)
airq.obj <- rfsrc(Ozone~., airquality[train, ])
# training data vimp
print(airq.obj$importance)
print(vimp(airq.obj)$importance)
# test data vimp
print(vimp(airq.obj, newdata = airquality[-train, ])$importance)
## ------------------------------------------------------------
## survival example
## study how vimp depends on tree imputation
## makes use of the subset option
## ------------------------------------------------------------
data(pbc, package = "randomForestSRC")
# determine which records have missing values
which.na <- apply(pbc, 1, function(x){any(is.na(x))})
# impute the data using na.action = "na.impute"
pbc.obj <- rfsrc(Surv(days, status) ~ ., pbc, nsplit = 3,
                 na.action = "na.impute", nimpute = 1)
# compare vimp based on records with no missing values
# to those that have missing values
# note the option na.action="na.impute" in the vimp() call
vimp.not.na <- vimp(pbc.obj, subset = !which.na, na.action = "na.impute")$importance
vimp.na <- vimp(pbc.obj, subset = which.na, na.action = "na.impute")$importance
print(data.frame(vimp.not.na, vimp.na))
## End(Not run)