Last data update: 2014.03.03

R: Balances between negative and positive samples by...
compute.balancedR Documentation

Balances between negative and positive samples by oversampling.

Description

If negative samples are less than positive ones, more copies of the negative cases are added and vice versa.

Usage

compute.balanced(F_, L_)

Arguments

F_

The feature matrix, each column is a feature.

L_

The vector of labels named according to the rows of F.

Details

Considerably unbalanced classes may be probabilistic for fitting some models.

Value

Returns a list of:

F_

The feature matrix, each column is a feature.

L_

The vector of labels named according to the rows of F.

Author(s)

Habil Zare

References

"Statistical Analysis of Overfitting Features", manuscript in preparation.

See Also

FeaLect, train.doctor, doctor.validate, random.subset, compute.balanced,compute.logistic.score, ignore.redundant, input.check.FeaLect

Examples

library(FeaLect)
data(mcl_sll)
F <- as.matrix(mcl_sll[ ,-1])	# The Feature matrix
L <- as.numeric(mcl_sll[ ,1])	# The labels
names(L) <- rownames(F)
message(L)

balanced <- compute.balanced(F_=F, L_=L)
message(balanced$L_)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(FeaLect)
Loading required package: lars
Loaded lars 1.2

Loading required package: rms
Loading required package: Hmisc
Loading required package: lattice
Loading required package: survival
Loading required package: Formula
Loading required package: ggplot2

Attaching package: 'Hmisc'

The following objects are masked from 'package:base':

    format.pval, round.POSIXt, trunc.POSIXt, units

Loading required package: SparseM

Attaching package: 'SparseM'

The following object is masked from 'package:base':

    backsolve

> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/FeaLect/compute.balanced.Rd_%03d_medium.png", width=480, height=480)
> ### Name: compute.balanced
> ### Title: Balances between negative and positive samples by oversampling.
> ### Aliases: compute.balanced
> ### Keywords: regression multivariate classif models
> 
> ### ** Examples
> 
> library(FeaLect)
> data(mcl_sll)
> F <- as.matrix(mcl_sll[ ,-1])	# The Feature matrix
> L <- as.numeric(mcl_sll[ ,1])	# The labels
> names(L) <- rownames(F)
> message(L)
1111111000000000000000
> 
> balanced <- compute.balanced(F_=F, L_=L)
> message(balanced$L_)
111111100000000000000011111111
> 
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>