Last data update: 2014.03.03

R: Batch effects correction
batch.neutralize    R Documentation

Batch effects correction

Description

Computes the SpC matrix with the fixed effects of a blocking factor subtracted.

Usage

batch.neutralize(dat, fbatch, half=TRUE, sqrt.trans=TRUE)

Arguments

dat

A SpC matrix with proteins in the rows and samples in the columns.

fbatch

A blocking factor of length equal to the number of columns (samples) in dat.

half

When FALSE, the contrast coefficients are of the contr.treatment style. When TRUE, the contrast coefficients are of the contr.sum style, which aims to distribute the effect equally across the batch levels instead of leaving a reference level untouched.

sqrt.trans

When TRUE, the fit is done on the square root transformed SpC matrix.
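
The two contrast codings named under half can be inspected directly in base R. The sketch below uses three hypothetical batch labels (b1, b2 and b3 are illustrative only, not taken from the package data). With contr.treatment the first level is the reference and is coded as an all-zero row; with contr.sum every column sums to zero, so no level is left untouched.

```r
# Hypothetical batch labels, for illustration only
lev <- c("b1", "b2", "b3")

# Treatment coding: the first level (b1) is the reference, coded as all zeros
ct <- contr.treatment(lev)
print(ct)

# Sum-to-zero coding: each column sums to zero across the levels
cs <- contr.sum(lev)
print(cs)
```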

Details

A model with intercept and the blocking factor is fitted. The batch effects corrected SpC matrix is computed by subtracting the estimated effect of the given blocking factor. When there is no clear reference batch level, the default option half=TRUE should be preferred. The square root transformation is known to stabilize the variance of Poisson distributed counts (whose variance equals the mean). Linear model fitting gives more accurate errors and p-values on the square root transformed SpC matrix. Nevertheless, for exploratory data analysis purposes, both the raw and the square root transformed SpC matrix may give good results.
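
The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the msmsEDA source: neutralize_sketch is a hypothetical helper, the toy data are simulated, and the result is left on the scale used for the fit (any back-transformation the package may apply is not reproduced here).

```r
# Hypothetical helper, NOT the msmsEDA implementation: fit intercept + batch
# per protein by least squares and subtract the estimated batch effects.
neutralize_sketch <- function(dat, fbatch, half = TRUE, sqrt.trans = TRUE) {
  dat <- as.matrix(dat)
  fbatch <- factor(fbatch)
  y <- if (sqrt.trans) sqrt(dat) else dat
  storage.mode(y) <- "double"
  # Design matrix with the requested contrast coding for the batch factor
  ctr <- if (half) "contr.sum" else "contr.treatment"
  X <- model.matrix(~ fbatch, contrasts.arg = list(fbatch = ctr))
  # Least-squares coefficients: one row per model term, one column per protein
  B <- qr.coef(qr(X), t(y))
  # Estimated batch effect per sample and protein (intercept excluded)
  eff <- X[, -1, drop = FALSE] %*% B[-1, , drop = FALSE]
  out <- y - t(eff)
  dimnames(out) <- dimnames(dat)
  out  # returned on the fitting scale (an assumption of this sketch)
}

# Simulated toy data: 4 proteins, 6 samples, two batches of 3 samples each
set.seed(1)
toy <- matrix(rpois(24, lambda = 20), nrow = 4)
batch <- factor(rep(c("a", "b"), each = 3))

adj_sum <- neutralize_sketch(toy, batch, half = TRUE, sqrt.trans = FALSE)
adj_trt <- neutralize_sketch(toy, batch, half = FALSE, sqrt.trans = FALSE)

# Difference of per-protein batch means after correction
print(rowMeans(adj_sum[, 1:3]) - rowMeans(adj_sum[, 4:6]))
```

With half = FALSE the samples of the reference batch are returned unchanged in this sketch, while with half = TRUE both batches are shifted towards their common mean.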

Value

The batch effects corrected SpC matrix.

Author(s)

Josep Gregori

See Also

The MSnSet class documentation and normalize

Examples

data(msms.dataset)
msnset <- pp.msms.data(msms.dataset)
###  Plot the PCA on the first two PCs, and colour by treatment level
ftreat <- pData(msnset)$treat
counts.pca(msnset, facs=ftreat, do.plot=TRUE, snms=as.character(ftreat))
###  Correct the batch effects
spcm <- exprs(msnset)
fbatch <- pData(msnset)$batch
spcm2 <- batch.neutralize(spcm, fbatch, half=TRUE, sqrt.trans=TRUE)
###  Plot the PCA on the first two PCs, and colour by treatment level
###  to visualize the improvement.
exprs(msnset) <- spcm2
counts.pca(msnset, facs=ftreat, do.plot=TRUE, snms=as.character(ftreat))
###  Magnitude of the correction
summary(as.vector(spcm-spcm2))
plot(density(as.vector(spcm-spcm2)))

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(msmsEDA)
Loading required package: MSnbase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: mzR
Loading required package: Rcpp
Loading required package: BiocParallel
Loading required package: ProtGenerics

This is MSnbase version 1.20.7 
  Read '?MSnbase' and references therein for information
  about the package and how to get started.


Attaching package: 'MSnbase'

The following object is masked from 'package:stats':

    smooth

The following object is masked from 'package:base':

    trimws

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/msmsEDA/batch.neutralize.Rd_%03d_medium.png", width=480, height=480)
> ### Name: batch.neutralize
> ### Title: Batch effects correction
> ### Aliases: batch.neutralize
> ### Keywords: manip
> 
> ### ** Examples
> 
> data(msms.dataset)
> msnset <- pp.msms.data(msms.dataset)
> ###  Plot the PCA on the two first PC, and colour by treatment level
> ftreat <- pData(msnset)$treat
> counts.pca(msnset, facs=ftreat, do.plot=TRUE, snms=as.character(ftreat))
> ###  Correct the batch effects
> spcm <- exprs(msnset)
> fbatch <- pData(msnset)$batch
> spcm2 <- batch.neutralize(spcm, fbatch, half=TRUE, sqrt.trans=TRUE)
> ###  Plot the PCA on the two first PC, and colour by treatment level
> ###  to visualize the improvement.
> exprs(msnset) <- spcm2
> counts.pca(msnset, facs=ftreat, do.plot=TRUE, snms=as.character(ftreat))
> ###  Incidence of the correction
> summary(as.vector(spcm-spcm2))
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
-115.60000   -0.47570   -0.00694    0.21000    0.88060  137.90000 
> plot(density(as.vector(spcm-spcm2)))
> dev.off()
null device 
          1 
>