a formula, as in the gam function. Smoothing splines are supported
as nonparametric smoothing terms, and should be indicated by s. See the documentation of s in the
gam package for its arguments. The GAMens function also provides the possibility for automatic
formula specification. See 'details' for more information.
data
a data frame in which to interpret the variables named in formula.
rsm_size
an integer, the number of variables in each random feature subset drawn by the Random Subspace Method. Default is 2.
If rsm=FALSE, the value of rsm_size is ignored.
autoform
if FALSE (default), the model specification in formula is used. If TRUE,
the function triggers automatic formula specification. See 'details' for more information.
iter
an integer, the number of base classifiers (GAMs) in the ensemble. Defaults to iter=10.
df
an integer, the number of degrees of freedom (df) used for smoothing spline estimation. Defaults to df=4.
Its value is only used when autoform = TRUE; it is ignored if a formula is
specified and autoform is FALSE.
bagging
enables Bagging if value is TRUE (default). If FALSE,
Bagging is disabled. Either bagging, rsm or both should be TRUE.
rsm
enables Random Subspace Method (RSM) if value is TRUE (default). If FALSE,
RSM is disabled. Either bagging, rsm or both should be TRUE.
fusion
specifies the fusion rule for the aggregation of member classifier outputs in the ensemble. Possible values are
'avgagg' (default), 'majvote', 'w.avgagg' or 'w.majvote'.
Details
The GAMens function applies the GAMbag, GAMrsm or GAMens ensemble classifiers (De Bock et al., 2010) to a data set. GAMens is
the default, with bagging=TRUE and rsm=TRUE. For GAMbag, rsm should be specified as FALSE.
For GAMrsm, bagging should be FALSE.
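The three variants can be pictured as two independent sampling steps per ensemble member. The following base-R sketch is illustrative only and is not the package's internal code: the helper name draw_member_sample is hypothetical, and it assumes that bagging draws a bootstrap sample of the rows and that RSM draws rsm_size predictors without replacement.

```r
# Illustrative only: how bagging and RSM could select the training data
# for one ensemble member. `draw_member_sample` is a hypothetical helper.
draw_member_sample <- function(data, response, rsm_size = 2,
                               bagging = TRUE, rsm = TRUE) {
  stopifnot(bagging || rsm)  # at least one of the two must be enabled
  predictors <- setdiff(names(data), response)
  n <- nrow(data)

  # Bagging: bootstrap sample of the rows (with replacement)
  rows <- if (bagging) sample.int(n, n, replace = TRUE) else seq_len(n)

  # RSM: random subset of rsm_size predictors (without replacement)
  vars <- if (rsm) sample(predictors, rsm_size) else predictors

  list(rows = rows, vars = vars)
}

set.seed(1)
toy <- data.frame(y  = factor(rep(c("a", "b"), 5)),
                  x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))

gamens_like <- draw_member_sample(toy, "y", rsm_size = 2)                   # both steps
gambag_like <- draw_member_sample(toy, "y", rsm = FALSE)                    # rows only
gamrsm_like <- draw_member_sample(toy, "y", rsm_size = 2, bagging = FALSE)  # variables only
```

With rsm = FALSE every predictor is kept, and with bagging = FALSE every row is kept once, which mirrors the GAMbag and GAMrsm settings described above.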
The GAMens function provides the possibility for automatic formula specification. In this case,
dichotomous variables in data are included as linear terms, and other variables are assumed continuous,
included as nonparametric terms, and estimated by means of smoothing splines. To enable automatic formula specification,
use the generic formula [response variable name]~. in combination with autoform = TRUE. Note that in this case,
all variables available in data are used in the model. If a formula other than [response variable name]~. is specified,
the autoform option is automatically overridden. If autoform=FALSE and the generic formula [response variable name]~.
is specified, the GAMs in the ensemble will not contain nonparametric terms (i.e., will only consist of linear terms).
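To make the automatic specification rule concrete, here is a minimal base-R sketch of the kind of formula it describes. It is an assumption-laden illustration, not the GAMens implementation: build_auto_formula is a hypothetical helper, and treating a variable as dichotomous when it has at most two distinct non-missing values is an assumption made for this example.

```r
# Illustrative only: build a formula string in the spirit of autoform = TRUE.
# `build_auto_formula` is a hypothetical helper, not part of GAMens.
build_auto_formula <- function(data, response, df = 4) {
  predictors <- setdiff(names(data), response)
  terms <- vapply(predictors, function(v) {
    n_values <- length(unique(stats::na.omit(data[[v]])))
    if (n_values <= 2) {
      v                                        # dichotomous: linear term
    } else {
      sprintf("s(%s, %d)", v, as.integer(df))  # otherwise: smoothing spline
    }
  }, character(1))
  paste(response, "~", paste(terms, collapse = " + "))
}

toy <- data.frame(y  = factor(c("a", "b", "a", "b")),
                  x1 = c(0, 1, 1, 0),          # dichotomous
                  x2 = c(0.3, 1.7, 2.2, 0.9))  # continuous
build_auto_formula(toy, "y", df = 4)
# [1] "y ~ x1 + s(x2, 4)"
```

The sketch returns a character string so that it runs without the gam package; passing the string to as.formula would give a formula object of the form used with s terms.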
Four alternative fusion rules for member classifier outputs can be specified. Possible values are
'avgagg' for average aggregation (default), 'majvote' for majority voting, 'w.avgagg' for
weighted average aggregation, or 'w.majvote' for weighted majority
voting. Weighted approaches are based on member classifier error rates.
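The four rules can be sketched in base R as follows. This is a hedged illustration rather than the package's code: fuse is a hypothetical helper, member outputs are assumed to be probabilities of the positive class, votes are assumed to be cast at a 0.5 threshold, and the internal definitions in GAMens may differ in such details.

```r
# Illustrative only: four fusion rules applied to member outputs.
# `fuse` is a hypothetical helper; GAMens' internal definitions may differ.
fuse <- function(probs, weights = NULL,
                 rule = c("avgagg", "majvote", "w.avgagg", "w.majvote")) {
  rule <- match.arg(rule)
  # probs: observations x members matrix of P(class = positive)
  if (is.null(weights)) weights <- rep(1, ncol(probs))
  votes <- (probs > 0.5) * 1  # hard 0/1 vote of each member
  switch(rule,
    "avgagg"    = rowMeans(probs),
    "majvote"   = rowMeans(votes),
    "w.avgagg"  = as.vector(probs %*% weights) / sum(weights),
    "w.majvote" = as.vector(votes %*% weights) / sum(weights))
}

# Two observations (rows), three members (columns)
probs <- matrix(c(0.9, 0.4,
                  0.6, 0.3,
                  0.2, 0.8), nrow = 2)
# Weights defined as 1 - member error rate (error rates made up for the example)
weights <- 1 - c(0.10, 0.20, 0.40)

fuse(probs, rule = "avgagg")
fuse(probs, rule = "majvote")
fuse(probs, weights, rule = "w.avgagg")
fuse(probs, weights, rule = "w.majvote")
```

In the weighted variants, members with a lower error rate contribute more to the aggregated score, which is the only difference from their unweighted counterparts in this sketch.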
Value
An object of class GAMens, which is a list with the following components:
GAMs
the member GAMs in the ensemble.
formula
the formula used to create the GAMens object.
iter
the ensemble size.
df
number of degrees of freedom (df) used for smoothing spline estimation.
rsm
indicates whether the Random Subspace Method was used to create the GAMens object.
bagging
indicates whether bagging was used to create the GAMens object.
rsm_size
the number of variables used for random feature subsets.
fusion_method
the fusion rule that was used to combine member classifier outputs in the ensemble.
probs
the class membership probabilities, predicted by the ensemble classifier.
class
the class predicted by the ensemble classifier.
samples
an array indicating, for every base classifier in the ensemble, which observations were used for training.
weights
a vector with weights defined as (1 - error rate). Usage depends upon specification of fusion_method.
References
De Bock, K. W. and Van den Poel, D. (2012): "Reconciling Performance and Interpretability in Customer Churn Prediction Modeling Using Ensemble Learning Based on Generalized Additive Models". Expert Systems With Applications, Vol 39, 8, pp. 6816–6826.
De Bock, K. W., Coussement, K. and Van den Poel, D. (2010): "Ensemble Classification based on generalized additive models". Computational Statistics & Data Analysis, Vol 54, 6, pp. 1535–1546.
Breiman, L. (1996): "Bagging predictors". Machine Learning, Vol 24, 2, pp. 123–140.
Hastie, T. and Tibshirani, R. (1990): "Generalized Additive Models", Chapman and Hall, London.
Ho, T. K. (1998): "The random subspace method for constructing decision forests". IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol 20, 8, pp. 832–844.
See Also
predict.GAMens,
GAMens.cv
Examples
## Load data (mlbench library should be loaded)
library(mlbench)
data(Ionosphere)
## Train a GAMens ensemble,
## estimated using 4 nonparametric terms and 2 linear terms
Ionosphere.GAMens <- GAMens(Class ~ s(V3,4) + s(V4,4) + s(V5,3) + s(V6,5) + V7 + V8,
                            Ionosphere, rsm_size=3, autoform=FALSE, iter=10)
## Calculate AUC (for function colAUC, load caTools library) for the GAMens model
library(caTools)
## probs (the ninth list component) holds the predicted class membership probabilities
GAMens.auc <- colAUC(Ionosphere.GAMens$probs, Ionosphere["Class"]=="good",
                     plotROC=FALSE)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(GAMens)
Loading required package: splines
Loading required package: gam
Loading required package: foreach
Loaded gam 1.12
Loading required package: mlbench
Loading required package: caTools
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/GAMens/GAMens.Rd_%03d_medium.png", width=480, height=480)
> ### Name: GAMens
> ### Title: Applies the GAMbag, GAMrsm or GAMens ensemble classifier to a
> ### data set
> ### Aliases: GAMens
> ### Keywords: models classif
>
> ### ** Examples
>
>
> ## Load data (mlbench library should be loaded)
> library(mlbench)
> data(Ionosphere)
>
> ## Train a GAMens ensemble,
> ## estimated using 4 nonparametric terms and 2 linear terms
> Ionosphere.GAMens <- GAMens(Class~s(V3,4)+s(V4,4)+s(V5,3)+s(V6,5)+V7+V8,
+ Ionosphere ,3 , autoform=FALSE, iter=10 )
>
> ## Calculate AUC (for function colAUC, load caTools library) for the GAMens model
> library(caTools)
> GAMens.auc <- colAUC(Ionosphere.GAMens[[9]], Ionosphere["Class"]=="good",
+ plotROC=FALSE)
>
> dev.off()
null device
1
>