R: Random Forest Cross-Validation for feature selection
rfcv
R Documentation
Random Forest Cross-Validation for feature selection
Description
This function shows the cross-validated prediction performance of
models with sequentially reduced number of predictors (ranked by
variable importance) via a nested cross-validation procedure.
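As a rough illustration of what "sequentially reduced number of predictors" can look like, the sketch below lists the predictor counts visited when the set is cut by a fixed fraction at each step. The helper halving_schedule and its step argument are hypothetical names used only for this illustration; the package may round the counts differently.

```r
# Hypothetical helper, not part of the randomForest package: list the
# predictor counts visited when the predictor set is shrunk by a fixed
# fraction at each step until a single predictor remains.
halving_schedule <- function(p, step = 0.5) {
  stopifnot(p >= 1, step > 0, step < 1)
  n_var <- p
  while (tail(n_var, 1) > 1) {
    # Keep at least one predictor; floor() guarantees a strict decrease.
    nxt <- max(1, floor(tail(n_var, 1) * step))
    n_var <- c(n_var, nxt)
  }
  n_var
}

# Schedule for 100 predictors, e.g. the 4 iris columns plus 96 noise columns
halving_schedule(100)
```

At each count in such a schedule, the top-ranked predictors (by variable importance) are kept and the cross-validated error is recomputed.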
error.cv
corresponding vector of error rates (classification) or MSEs
(regression) at each step
predicted
list with one component per element of n.var, each containing
the predicted values from the cross-validation
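A minimal sketch of how the returned components might be inspected, assuming the randomForest package is installed. The data construction follows the Examples section; the names best and pred are illustrative only.

```r
library(randomForest)

set.seed(647)
# Four real iris predictors plus 96 pure-noise columns
myiris <- cbind(iris[1:4], matrix(runif(96 * nrow(iris)), nrow(iris), 96))
result <- rfcv(myiris, iris$Species, cv.fold = 3)

# Position of the step with the lowest cross-validated error
best <- which.min(result$error.cv)
result$n.var[best]     # number of predictors used at that step
result$error.cv[best]  # its cross-validated error rate

# Cross-validated predictions at that step, compared with the truth
pred <- result$predicted[[best]]
table(observed = iris$Species, predicted = pred)
```

The misclassification rate computed from pred should agree with the matching entry of error.cv, which is a quick consistency check on the output.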
Author(s)
Andy Liaw
References
Svetnik, V., Liaw, A., Tong, C. and Wang, T., “Application of Breiman's
Random Forest to Modeling Structure-Activity Relationships of
Pharmaceutical Molecules”, MCS 2004, Roli, F. and Windeatt, T. (Eds.)
pp. 334-343.
See Also
randomForest, importance
Examples
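library(randomForest)
set.seed(647)
## Four real predictors plus 96 columns of uniform noise
myiris <- cbind(iris[1:4], matrix(runif(96 * nrow(iris)), nrow(iris), 96))
result <- rfcv(myiris, iris$Species, cv.fold=3)
## CV error against number of predictors, log-scaled x axis
with(result, plot(n.var, error.cv, log="x", type="o", lwd=2))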
## The following can take a while to run, so if you really want to try
## it, copy and paste the code into R.
## Not run:
result <- replicate(5, rfcv(myiris, iris$Species), simplify=FALSE)
error.cv <- sapply(result, "[[", "error.cv")
matplot(result[[1]]$n.var, cbind(rowMeans(error.cv), error.cv), type="l",
        lwd=c(2, rep(1, ncol(error.cv))), col=1, lty=1, log="x",
        xlab="Number of variables", ylab="CV Error")
## End(Not run)
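The examples above are for classification. With a numeric response, error.cv holds MSEs rather than error rates, so a regression call can be sketched the same way. The choice of the mtcars data, the seed, and the name reg are illustrative assumptions, and the sketch assumes the randomForest package is installed.

```r
library(randomForest)

set.seed(1)
# Predict mpg from the remaining 10 columns of mtcars
reg <- rfcv(mtcars[-1], mtcars$mpg, cv.fold = 3)

reg$n.var      # number of predictors at each step
reg$error.cv   # cross-validated MSE at each step

plot(reg$n.var, reg$error.cv, log = "x", type = "o", lwd = 2,
     xlab = "Number of variables", ylab = "CV MSE")
```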