Last data update: 2014.03.03

R: Prediction from a random generalized linear model predictor
predict.randomGLM    R Documentation

Prediction from a random generalized linear model predictor


Implements the predict method for a previously constructed random generalized linear model predictor, applied to new data.


## S3 method for class 'randomGLM'
predict(object, newdata, type=c("response", "class"), 
                 thresholdClassProb = object$thresholdClassProb, ...)



object: a randomGLM object such as one returned by randomGLM.


newdata: specification of test data for which to calculate the prediction.


type: type of prediction required. Type "response" gives the fitted probabilities for classification and the fitted values for regression. Type "class" applies only to classification and produces the predicted class labels.


thresholdClassProb: the threshold of predictive probabilities used to arrive at a classification. Takes values between 0 and 1. Only used for binary outcomes.


...: other arguments that may be passed to and from methods. Currently unused.
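
The relationship between type and thresholdClassProb can be illustrated without fitting a model. The sketch below is an assumption about the general idea, not the package's internal code: it starts from a hypothetical two-column matrix of class probabilities, like the one type="response" returns for a binary outcome, and assigns the second class whenever its probability exceeds the threshold. The names probs and classFromProbs, and the strict ">" comparison, are illustrative choices.

```r
# Hypothetical class probabilities for 4 observations (each row sums to 1).
probs <- cbind(setosa     = c(0.90, 0.60, 0.45, 0.10),
               versicolor = c(0.10, 0.40, 0.55, 0.90))

# Illustrative helper (not part of randomGLM): assign the second class
# when its probability exceeds the threshold, otherwise the first class.
classFromProbs <- function(probs, threshold = 0.5) {
  stopifnot(ncol(probs) == 2, threshold >= 0, threshold <= 1)
  labels <- colnames(probs)
  factor(ifelse(probs[, 2] > threshold, labels[2], labels[1]), levels = labels)
}

classFromProbs(probs)                   # threshold of 0.5
classFromProbs(probs, threshold = 0.3)  # a lower threshold favors the second class
```

Lowering the threshold moves borderline observations into the second class, which is the kind of trade-off thresholdClassProb is meant to control for binary outcomes.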


The function calculates predictions on new test data. It works only if object contains the regression models that were used to construct the predictor, that is, if the predictor was built with keepModels = TRUE (see argument keepModels of the function randomGLM).

If the predictor was trained on a multi-class response, the prediction is applied to each of the binary variables that represent the response (see randomGLM for details).
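
To make "binary variables that represent the response" concrete, here is a minimal sketch of one common encoding, in which each level of a factor is turned into a one-vs-rest indicator. This is an assumption for illustration only: the exact set of binary variables randomGLM constructs is described in its own help page and may differ. The names y and indicators are hypothetical.

```r
# A three-level response, as in the full iris data.
y <- factor(c("setosa", "versicolor", "virginica", "setosa", "virginica"))

# One indicator column per level: 1 if the observation has that level, else 0.
# This one-vs-rest encoding is an illustrative assumption.
indicators <- sapply(levels(y), function(lev) as.integer(y == lev))
colnames(indicators) <- paste0(levels(y), ".vs.rest")

indicators
# A binary predictor would then be applied to each column separately, and
# the per-column predictions combined column-wise with cbind().
```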


For a continuous response, the predicted values. For a binary response, the predicted class labels if type="class", or a two-column matrix giving the class probabilities if type="response".

If the predictor was trained on a multi-class response, the returned value is a matrix formed by column-binding (cbind) the results for the individual binary variables that represent the response (see randomGLM for details).


Lin Song, Steve Horvath and Peter Langfelder.


Lin Song, Peter Langfelder, Steve Horvath: Random generalized linear model: a highly accurate and interpretable ensemble predictor. BMC Bioinformatics (2013)


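## binary outcome prediction
library(randomGLM)
# data generation
data(iris)
# Restrict data to first 100 observations
iris = iris[1:100, ]
# Turn Species into a factor with only the two remaining levels
iris$Species = as.factor(as.character(iris$Species))
# Select a training and a test subset of the 100 observations
set.seed(1)  # fix the random split so the example is reproducible
indx = sample(100, 67, replace=FALSE)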
xyTrain = iris[indx,]
xyTest = iris[-indx,]
xTrain = xyTrain[, -5]
yTrain = xyTrain[, 5]

xTest = xyTest[, -5]
yTest = xyTest[, 5]

# predict with a small number of bags - normally nBags should be at least 100.
RGLM = randomGLM(xTrain, yTrain, nCandidateCovariates=ncol(xTrain), nBags=30, keepModels = TRUE, nThreads = 1)
predicted = predict(RGLM, newdata = xTest, type="class")
table(predicted, yTest)

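## continuous outcome prediction
# simulated data (assumed here, since the example needs x and y): 100 observations, 20 features
x = matrix(rnorm(100*20), 100, 20)
y = rnorm(100)
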
xTrain = x[1:50,]
yTrain = y[1:50]
xTest = x[51:100,]
yTest = y[51:100]

RGLM = randomGLM(xTrain, yTrain, classify=FALSE, nCandidateCovariates=ncol(xTrain), 
                 nBags=10, keepModels = TRUE, nThreads = 1)
predicted = predict(RGLM, newdata = xTest)
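
Once predictions are available, they are usually compared with the held-out outcome. The following is a minimal sketch in base R, using stand-in vectors rather than a fitted randomGLM object so that it runs on its own; the names predictedClass, yTestClass, predictedNum and yTestNum are hypothetical placeholders for the predicted and yTest objects of the examples above, and the choice of accuracy, RMSE and correlation as summaries is an assumption, not something the package prescribes.

```r
set.seed(1)

# Stand-in values (assumptions): in practice these would be `predicted`
# and `yTest` from the binary example.
yTestClass     <- factor(c("setosa", "versicolor", "setosa", "versicolor", "setosa"))
predictedClass <- factor(c("setosa", "versicolor", "versicolor", "versicolor", "setosa"),
                         levels = levels(yTestClass))

# Classification: accuracy is the share of the confusion table on its diagonal.
confusion <- table(predicted = predictedClass, observed = yTestClass)
accuracy  <- sum(diag(confusion)) / sum(confusion)

# Regression: stand-ins for `predicted` and `yTest` from the continuous example.
yTestNum     <- rnorm(50)
predictedNum <- yTestNum + rnorm(50, sd = 0.5)

# Root mean squared error and Pearson correlation between prediction and truth.
rmse        <- sqrt(mean((predictedNum - yTestNum)^2))
correlation <- cor(predictedNum, yTestNum)

accuracy
rmse
correlation
```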
