

R/Weka Classifier Functions

Description

R interfaces to Weka regression and classification function learners.

Usage

LinearRegression(formula, data, subset, na.action,
                 control = Weka_control(), options = NULL)
Logistic(formula, data, subset, na.action,
         control = Weka_control(), options = NULL)
SMO(formula, data, subset, na.action,
    control = Weka_control(), options = NULL)

Arguments

formula

a symbolic description of the model to be fit.

data

an optional data frame containing the variables in the model.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. See model.frame for details.

control

an object of class Weka_control giving options to be passed to the Weka learner. Available options can be obtained online using the Weka Option Wizard WOW, or from the Weka documentation.

options

a named list of further options, or NULL (default). See Details.
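The control options available for a given learner can be listed from R via the Weka Option Wizard. A minimal sketch (assuming the RWeka package and a working Java installation):

```r
## List Weka's command line options for the SMO learner
## (requires RWeka and a Java runtime).
library(RWeka)
WOW("weka/classifiers/functions/SMO")
```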

Details

There is a predict method for predicting from the fitted models, and a summary method based on evaluate_Weka_classifier.
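For illustration, a minimal sketch of both methods (assuming RWeka and a working Java installation; iris is a standard data set):

```r
## Fit a logistic model, then use the predict and summary methods.
library(RWeka)
m <- Logistic(Species ~ ., data = iris)
predict(m, head(iris))                        # predicted classes
predict(m, head(iris), type = "probability")  # class membership probabilities
summary(m)   # training-set evaluation via evaluate_Weka_classifier
```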

LinearRegression builds suitable linear regression models, using the Akaike criterion for model selection.

Logistic builds multinomial logistic regression models based on ridge estimation (le Cessie and van Houwelingen, 1992).

SMO implements John C. Platt's sequential minimal optimization algorithm for training a support vector classifier using polynomial or RBF kernels. Multi-class problems are solved using pairwise classification.

The model formulae should only use the + and - operators to indicate the variables to be included or not used, respectively.
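For example, to include all predictors except one (a sketch assuming RWeka and a Java runtime are available):

```r
## Use '-' on the right-hand side to drop a variable from the predictors.
library(RWeka)
LinearRegression(mpg ~ . - cyl, data = mtcars)
```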

Argument options allows further customization. Currently, options model and instances (or partial matches for these) are used: if set to TRUE, the model frame or the corresponding Weka instances, respectively, are included in the fitted model object, possibly speeding up subsequent computations on the object. By default, neither is included.
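A minimal sketch of the options argument (again assuming RWeka and a Java runtime):

```r
## Keep the model frame and the Weka instances in the fitted object,
## which may speed up subsequent predict()/summary() calls on it.
library(RWeka)
m <- Logistic(Species ~ ., data = iris,
              options = list(model = TRUE, instances = TRUE))
```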

Value

A list inheriting from classes Weka_functions and Weka_classifiers with components including

classifier

a reference (of class jobjRef) to a Java object obtained by applying the Weka buildClassifier method to build the specified model using the given control options.

predictions

a numeric vector or factor with the model predictions for the training instances (the results of calling the Weka classifyInstance method for the built classifier and each instance).

call

the matched call.

References

S. le Cessie and J. C. van Houwelingen (1992). Ridge estimators in logistic regression. Applied Statistics, 41(1), 191-201.

J. C. Platt (1998). Fast training of Support Vector Machines using Sequential Minimal Optimization. In B. Schoelkopf, C. Burges, and A. Smola (eds.), Advances in Kernel Methods: Support Vector Learning. MIT Press.

I. H. Witten and E. Frank (2005). Data Mining: Practical Machine Learning Tools and Techniques. 2nd Edition, Morgan Kaufmann, San Francisco.

See Also

Weka_classifiers

Examples

## Linear regression:
## Using standard data set 'mtcars'.
LinearRegression(mpg ~ ., data = mtcars)
## Compare to R:
step(lm(mpg ~ ., data = mtcars), trace = 0)

## Using standard data set 'chickwts'.
LinearRegression(weight ~ feed, data = chickwts)
## (Note the interactions!)

## Logistic regression:
## Using standard data set 'infert'.
STATUS <- factor(infert$case, labels = c("control", "case"))
Logistic(STATUS ~ spontaneous + induced, data = infert)
## Compare to R:
glm(STATUS ~ spontaneous + induced, data = infert, family = binomial())

## Sequential minimal optimization algorithm for training a support
## vector classifier, using an RBF kernel with a non-default gamma
## parameter (argument '-G') instead of the default polynomial kernel
## (from a question on r-help):
SMO(Species ~ ., data = iris,
    control = Weka_control(K =
    list("weka.classifiers.functions.supportVector.RBFKernel", G = 2)))
## In fact, by some hidden magic it also "works" to give the "base" name
## of the Weka kernel class:
SMO(Species ~ ., data = iris,
    control = Weka_control(K = list("RBFKernel", G = 2)))

Results



> library(RWeka)
> ### Name: Weka_classifier_functions
> ### Title: R/Weka Classifier Functions
> ### Aliases: Weka_classifier_functions LinearRegression Logistic SMO
> ### Keywords: models regression classif
> 
> ### ** Examples
> 
> ## Linear regression:
> ## Using standard data set 'mtcars'.
> LinearRegression(mpg ~ ., data = mtcars)

Linear Regression Model

mpg =

     -3.9165 * wt +
      1.2259 * qsec +
      2.9358 * am +
      9.6178
> ## Compare to R:
> step(lm(mpg ~ ., data = mtcars), trace = 0)

Call:
lm(formula = mpg ~ wt + qsec + am, data = mtcars)

Coefficients:
(Intercept)           wt         qsec           am  
      9.618       -3.917        1.226        2.936  

> 
> ## Using standard data set 'chickwts'.
> LinearRegression(weight ~ feed, data = chickwts)

Linear Regression Model

weight =

     73.4538 * feed=linseed,soybean,meatmeal,casein,sunflower +
     43.2552 * feed=meatmeal,casein,sunflower +
     49.3409 * feed=casein,sunflower +
    160.2   
> ## (Note the interactions!)
> 
> ## Logistic regression:
> ## Using standard data set 'infert'.
> STATUS <- factor(infert$case, labels = c("control", "case"))
> Logistic(STATUS ~ spontaneous + induced, data = infert)
Logistic Regression with ridge parameter of 1.0E-8
Coefficients...
                 Class
Variable       control
======================
spontaneous    -1.1972
induced        -0.4181
Intercept       1.7078


Odds Ratios...
                 Class
Variable       control
======================
spontaneous      0.302
induced         0.6583

> ## Compare to R:
> glm(STATUS ~ spontaneous + induced, data = infert, family = binomial())

Call:  glm(formula = STATUS ~ spontaneous + induced, family = binomial(), 
    data = infert)

Coefficients:
(Intercept)  spontaneous      induced  
    -1.7079       1.1972       0.4181  

Degrees of Freedom: 247 Total (i.e. Null);  245 Residual
Null Deviance:	    316.2 
Residual Deviance: 279.6 	AIC: 285.6
> 
> ## Sequential minimal optimization algorithm for training a support
> ## vector classifier, using an RBF kernel with a non-default gamma
> ## parameter (argument '-G') instead of the default polynomial kernel
> ## (from a question on r-help):
> SMO(Species ~ ., data = iris,
+     control = Weka_control(K =
+     list("weka.classifiers.functions.supportVector.RBFKernel", G = 2)))
SMO

Kernel used:
  RBF kernel: K(x,y) = e^-(2.0* <x-y,x-y>^2)

Classifier for classes: setosa, versicolor

BinarySMO

 -       1      * < 0.222222 0.541667 0.118644 0.166667> * X]
 -       0.3237 * < 0.388889 1 0.084746 0.125> * X]
 +       1      * < 0.222222 0.208333 0.338983 0.416667> * X]
 -       1      * < 0.055556 0.125 0.050847 0.083333> * X]
 +       1      * < 0.166667 0.166667 0.389831 0.375> * X]
 -       0.2714 * < 0.194444 0.416667 0.101695 0.041667> * X]
 +       0.3878 * < 0.194444 0.125 0.389831 0.375> * X]
 +       0.5167 * < 0.75 0.5 0.627119 0.541667> * X]
 -       0.5173 * < 0.194444 0.625 0.101695 0.208333> * X]
 +       0.2078 * < 0.472222 0.583333 0.59322 0.625> * X]
 +       0.026 

Number of support vectors: 10

Number of kernel evaluations: 1716 (65.487% cached)

Classifier for classes: setosa, virginica

BinarySMO

      1      * < 0.166667 0.208333 0.59322 0.666667> * X]
 -       0.857  * < 0.388889 1 0.084746 0.125> * X]
 +       0.6508 * < 1 0.75 0.915254 0.791667> * X]
 -       1      * < 0.055556 0.125 0.050847 0.083333> * X]
 -       0.1459 * < 0.222222 0.541667 0.118644 0.166667> * X]
 +       0.1937 * < 0.472222 0.083333 0.677966 0.583333> * X]
 -       0.1384 * < 0.194444 0.625 0.101695 0.208333> * X]
 +       0.2969 * < 0.944444 0.25 1 0.916667> * X]
 +       0.0858

Number of support vectors: 8

Number of kernel evaluations: 2113 (70.062% cached)

Classifier for classes: versicolor, virginica

BinarySMO

      1      * < 0.555556 0.208333 0.677966 0.75> * X]
 -       1      * < 0.305556 0.416667 0.59322 0.583333> * X]
 -       1      * < 0.666667 0.458333 0.627119 0.583333> * X]
 -       1      * < 0.472222 0.583333 0.59322 0.625> * X]
 +       1      * < 0.444444 0.416667 0.694915 0.708333> * X]
 -       1      * < 0.527778 0.083333 0.59322 0.583333> * X]
 +       0.395  * < 1 0.75 0.915254 0.791667> * X]
 +       1      * < 0.416667 0.291667 0.694915 0.75> * X]
 -       1      * < 0.472222 0.291667 0.694915 0.625> * X]
 +       0.9843 * < 0.555556 0.375 0.779661 0.708333> * X]
 -       1      * < 0.666667 0.416667 0.677966 0.666667> * X]
 -       0.1282 * < 0.75 0.5 0.627119 0.541667> * X]
 +       1      * < 0.611111 0.416667 0.762712 0.708333> * X]
 -       1      * < 0.5 0.375 0.627119 0.541667> * X]
 -       1      * < 0.722222 0.458333 0.661017 0.583333> * X]
 +       1      * < 0.472222 0.083333 0.677966 0.583333> * X]
 +       1      * < 0.583333 0.458333 0.762712 0.708333> * X]
 +       1      * < 0.611111 0.5 0.694915 0.791667> * X]
 +       1      * < 0.5 0.416667 0.661017 0.708333> * X]
 -       1      * < 0.694444 0.333333 0.644068 0.541667> * X]
 +       1      * < 0.416667 0.291667 0.694915 0.75> * X]
 +       1      * < 0.527778 0.333333 0.644068 0.708333> * X]
 -       1      * < 0.444444 0.5 0.644068 0.708333> * X]
 +       1      * < 0.5 0.25 0.779661 0.541667> * X]
 +       0.3865 * < 0.805556 0.5 0.847458 0.708333> * X]
 +       1      * < 0.555556 0.291667 0.661017 0.708333> * X]
 +       0.0518 * < 0.361111 0.333333 0.661017 0.791667> * X]
 -       1      * < 0.555556 0.208333 0.661017 0.583333> * X]
 -       1      * < 0.555556 0.125 0.576271 0.5> * X]
 +       1      * < 0.555556 0.333333 0.694915 0.583333> * X]
 +       1      * < 0.166667 0.208333 0.59322 0.666667> * X]
 +       1      * < 0.805556 0.416667 0.813559 0.625> * X]
 -       1      * < 0.555556 0.541667 0.627119 0.625> * X]
 +       1      * < 0.472222 0.416667 0.644068 0.708333> * X]
 -       1      * < 0.361111 0.416667 0.59322 0.583333> * X]
 -       0.9071 * < 0.583333 0.5 0.59322 0.583333> * X]
 -       0.7824 * < 0.333333 0.125 0.508475 0.5> * X]
 -       1      * < 0.472222 0.375 0.59322 0.583333> * X]
 -       1      * < 0.611111 0.333333 0.610169 0.583333> * X]
 +       0.0763

Number of support vectors: 39

Number of kernel evaluations: 3721 (81.202% cached)


> ## In fact, by some hidden magic it also "works" to give the "base" name
> ## of the Weka kernel class:
> SMO(Species ~ ., data = iris,
+     control = Weka_control(K = list("RBFKernel", G = 2)))
(Output identical to that of the previous SMO call.)