Last data update: 2014.03.03

tsp.gbm    R Documentation

Fits generalized boosted logistic regression models based on Top Scoring Pairs.

Description

Fits generalized boosted logistic regression models based on Top Scoring Pairs.
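
Top Scoring Pairs methods classify on the relative ordering of pairs of features rather than on their raw values. The sketch below illustrates that idea in base R by turning a numeric matrix into 0/1 indicators of whether one column is smaller than another. The helper name pair_indicators is hypothetical, and the feature construction that tsp.gbm performs internally is not documented here and may differ.

```r
# Hypothetical helper (not part of BigTSP): one 0/1 indicator per column pair.
pair_indicators <- function(x) {
  pairs <- combn(ncol(x), 2)  # each column of `pairs` is one (i, j) with i < j
  z <- x[, pairs[1, ], drop = FALSE] < x[, pairs[2, ], drop = FALSE]
  storage.mode(z) <- "integer"  # TRUE/FALSE -> 1/0
  colnames(z) <- paste0("V", pairs[1, ], "<V", pairs[2, ])
  z
}

# Two observations of three variables
x <- matrix(c(1, 3, 2,
              5, 4, 6), nrow = 2, byrow = TRUE)
pair_indicators(x)
```

A matrix with p columns yields p * (p - 1) / 2 indicator columns under this construction, so 20 predictors would give 190 pairs.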

Usage

tsp.gbm(x, y, offset = NULL, misc = NULL, distribution = "bernoulli",
        w = NULL, var.monotone = NULL, n.trees = 100,
        interaction.depth = 1, n.minobsinnode = 10, shrinkage = 0.001,
        bag.fraction = 0.5, train.fraction = 1, keep.data = TRUE,
        verbose = TRUE)

Arguments

x

input matrix, of dimension nobs x nvars; each row is an observation vector.

y

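response variable. For the default "bernoulli" distribution this is a vector of 0/1 outcomes with one entry per row of x.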

offset

a vector of values for the offset.

misc

an R object that is passed on unchanged to the gbm engine (see the gbm.fit function in the gbm package).

distribution

either a character string specifying the name of the distribution to use, or a list with a component name specifying the distribution and any additional parameters needed. The default value is "bernoulli" for logistic regression.

w

a vector of observation weights of the same length as y.

var.monotone

an optional vector, the same length as the number of predictors, indicating which variables have a monotone increasing (+1), decreasing (-1), or arbitrary (0) relationship with the outcome.

n.trees

the total number of trees to fit. This is equivalent to the number of iterations and the number of basis functions in the additive expansion.

interaction.depth

The maximum depth of variable interactions. 1 implies an additive model, 2 implies a model with up to 2-way interactions, etc.

n.minobsinnode

minimum number of observations in the trees' terminal nodes. Note that this is the actual number of observations, not the total weight.

shrinkage

a shrinkage parameter applied to each tree in the expansion. Also known as the learning rate or step-size reduction.

bag.fraction

the fraction of the training set observations randomly selected to propose the next tree in the expansion.

train.fraction

The first train.fraction * nrow(x) observations are used to fit the model; the remainder are used to compute out-of-sample estimates of the loss function.

keep.data

a logical variable indicating whether to keep the data and an index of the data stored with the object.

verbose

If TRUE, tsp.gbm will print out progress and performance indicators.
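
Because train.fraction takes the first rows of x for training, the row order matters. The sketch below shows, in base R, which rows would fall on each side of the split and why shuffling first can be worthwhile when the data are sorted. Rounding the training count down with floor() is an assumption about how the count is computed, and the variable names train_rows and valid_rows are illustrative only.

```r
set.seed(1)
n <- 100
x <- matrix(rnorm(n * 20), n, 20)
y <- rbinom(n, 1, 0.5)

# The split uses the FIRST rows, so shuffle if the rows are ordered
perm <- sample(n)
x <- x[perm, , drop = FALSE]
y <- y[perm]

train.fraction <- 0.8
n_train <- floor(train.fraction * nrow(x))  # assumed rounding rule
train_rows <- seq_len(n_train)
valid_rows <- setdiff(seq_len(nrow(x)), train_rows)
length(train_rows)  # 80
length(valid_rows)  # 20

# Optional: fit with non-default settings, only if BigTSP is installed
if (requireNamespace("BigTSP", quietly = TRUE)) {
  library(BigTSP)
  fit <- tsp.gbm(x, y, n.trees = 200, shrinkage = 0.01,
                 train.fraction = train.fraction, verbose = FALSE)
}
```

With train.fraction = 1 (the default) no rows are held out, which is consistent with a validation deviance that cannot be computed.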

Value

See the "gbm" package for the returned values.

Author(s)

Xiaolin Yang, Han Liu

References

See references for the "gbm" package.

See Also

predict.tsp.gbm

Examples

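library(BigTSP)  # provides tsp.gbm; also attaches gbm
# Simulate 100 observations of 20 predictors and a binary response
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rbinom(100, 1, 0.5)
# Fit the boosted model with the default settings
fit <- tsp.gbm(x, y)
# Predict for the first 10 rows using only the first 5 trees
predict(fit, x[1:10, ], n.trees = 5)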
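
The values returned by predict() in the transcript under Results are small positive numbers rather than probabilities. Assuming they are on the log-odds (link) scale, as is the default for predictions in the gbm package, they can be mapped to probabilities with the logistic function. This is a sketch under that assumption; the three values are copied from the transcript, and predict.tsp.gbm may behave differently.

```r
# Predictions copied from the Results transcript; ASSUMED to be log-odds
log_odds <- c(0.07929928, 0.08240278, 0.08248181)

# Logistic transform: 1 / (1 + exp(-log_odds))
prob <- plogis(log_odds)
round(prob, 3)
```

Under this assumption the fitted probabilities sit just above 0.5, which is what one would expect for a response simulated independently of the predictors.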

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(BigTSP)
Loading required package: glmnet
Loading required package: Matrix
Loading required package: foreach
Loaded glmnet 2.0-5

Loading required package: tree
Loading required package: randomForest
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Loading required package: gbm
Loading required package: survival
Loading required package: lattice
Loading required package: splines
Loading required package: parallel
Loaded gbm 2.1.1
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/BigTSP/tsp.gbm.Rd_%03d_medium.png", width=480, height=480)
> ### Name: tsp.gbm
> ### Title: Fits generalized boosted logistic regression models based on Top
> ###   Scoring Pairs.
> ### Aliases: tsp.gbm
> ### Keywords: ~kwd1 ~kwd2
> 
> ### ** Examples
> 
> library(gbm)
> x=matrix(rnorm(100*20),100,20)
> y=rbinom(100,1,0.5)
> fit=tsp.gbm(x,y)
Iter   TrainDeviance   ValidDeviance   StepSize   Improve
     1        1.3846            -nan     0.0010   -0.0001
     2        1.3845            -nan     0.0010   -0.0000
     3        1.3844            -nan     0.0010   -0.0000
     4        1.3843            -nan     0.0010   -0.0000
     5        1.3842            -nan     0.0010   -0.0001
     6        1.3841            -nan     0.0010   -0.0000
     7        1.3840            -nan     0.0010   -0.0001
     8        1.3839            -nan     0.0010   -0.0002
     9        1.3838            -nan     0.0010   -0.0001
    10        1.3838            -nan     0.0010   -0.0001
    20        1.3825            -nan     0.0010   -0.0001
    40        1.3804            -nan     0.0010   -0.0001
    60        1.3781            -nan     0.0010   -0.0000
    80        1.3764            -nan     0.0010   -0.0000
   100        1.3739            -nan     0.0010   -0.0000

Warning message:
In gbm.fit(newx, y, offset = offset, misc = misc, distribution = distribution,  :
  Parameter 'train.fraction' of gbm.fit is deprecated, please specify 'nTrain' instead
> predict(fit,x[1:10,],n.trees=5)
 [1] 0.07929928 0.08240278 0.08248181 0.07791368 0.08100316 0.08100316
 [7] 0.08243219 0.07905866 0.07760942 0.08095354
>