Last data update: 2014.03.03

R: Remove unwanted variation when testing for differential...
RUVfitR Documentation

Remove unwanted variation when testing for differential methylation

Description

Provides an interface similar to lmFit from limma to the RUV2, RUV4, RUVinv and RUVrinv functions from the ruv package, which facilitates the removal of unwanted variation in a differential methylation analysis. A set of negative control variables, as described in the references, must be specified.

Usage

RUVfit(data, design, coef, ctl, method=c("inv", "rinv", "ruv4", "ruv2"), 
k = NULL, ...)

Arguments

data

numeric matrix with rows corresponding to the features of interest such as CpG sites and columns corresponding to samples or arrays.

design

the design matrix of the experiment, with rows corresponding to arrays/samples and columns to coefficients to be estimated.

coef

integer, column of the design matrix containing the comparison to test for differential methylation. Default is the last colum of the design matrix.

ctl

logical vector, length == nrow(data). Features that are to be used as negative control variables are indicated as TRUE, all other features are FALSE.

method

character string, indicates which RUV method should be used. Default method is RUVinv.

k

integer, required if method is "ruv2" or "ruv4". Indicates the number of unwanted factors to use. Can be 0.

...

additional arguments that can be passed to RUV2, RUV4, RUVinv and RUVrinv. See linked function documentation for details.

Details

This function depends on the ruv and limma packages and is used to estimate and adjust for unwanted variation in a differential methylation analysis. Briefly, the unwanted factors W are estimated using negative control variables. Y is then regressed on the variables X, Z, and W. For methylation data, the analysis is performed on the M-values, defined as the log base 2 ratio of the methylated signal to the unmethylated signal.

Value

An object of class MArrayLM (see MArrayLM-class) containing:

coefficients

The estimated coefficients of the factor(s) of interest.

sigma2

Estimates of the features' variances.

t

t statistics for the factor(s) of interest.

p

P-values for the factor(s) of interest.

multiplier

The constant by which sigma2 must be multiplied in order to get an estimate of the variance of coefficients

df

The number of residual degrees of freedom.

W

The estimated unwanted factors.

alpha

The estimated coefficients of W.

byx

The coefficients in a regression of Y on X (after both Y and X have been "adjusted" for Z). Useful for projection plots.

bwx

The coefficients in a regression of W on X (after X has been "adjusted" for Z). Useful for projection plots.

X

X. Included for reference.

k

k. Included for reference.

ctl

ctl. Included for reference.

Z

Z. Included for reference.

fullW0

Can be used to speed up future calls of RUVfit.

Author(s)

Jovana Maksimovic jovana.maksimovic@mcri.edu.au

References

Gagnon-Bartsch JA, Speed TP. (2012). Using control genes to correct for unwanted variation in microarray data. Biostatistics. 13(3), 539-52. Available at: http://biostatistics.oxfordjournals.org/content/13/3/539.full.

Gagnon-Bartsch, Jacob, and Speed. 2013. Removing Unwanted Variation from High Dimensional Data with Negative Controls. Available at: http://statistics.berkeley.edu/tech-reports/820.

See Also

RUV2, RUV4, RUVinv, RUVrinv, topRUV

Examples

if(require(minfi) & require(minfiData) & require(limma)) {

# Get methylation data for a 2 group comparison
meth <- getMeth(MsetEx)
unmeth <- getUnmeth(MsetEx)
Mval <- log2((meth + 100)/(unmeth + 100))

group<-factor(pData(MsetEx)$Sample_Group)
design<-model.matrix(~group)

# Perform initial analysis to empirically identify negative control features 
# when not known a priori
lFit = lmFit(Mval,design)
lFit2 = eBayes(lFit)
lTop = topTable(lFit2,coef=2,num=Inf)

# The negative control features should *not* be associated with factor of interest
# but *should* be affected by unwanted variation 
ctl = rownames(Mval) %in% rownames(lTop[lTop$adj.P.Val > 0.5,])

# Perform RUV adjustment and fit
fit = RUVfit(data=Mval, design=design, coef=2, ctl=ctl)
fit2 = RUVadj(fit)

# Look at table of top results
top = topRUV(fit2)
}

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(missMethyl)
Setting options('download.file.method.GEOquery'='auto')
Setting options('GEOquery.inmemory.gpl'=FALSE)

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/missMethyl/RUVfit.Rd_%03d_medium.png", width=480, height=480)
> ### Name: RUVfit
> ### Title: Remove unwanted variation when testing for differential
> ###   methylation
> ### Aliases: RUVfit
> 
> ### ** Examples
> 
> if(require(minfi) & require(minfiData) & require(limma)) {
+ 
+ # Get methylation data for a 2 group comparison
+ meth <- getMeth(MsetEx)
+ unmeth <- getUnmeth(MsetEx)
+ Mval <- log2((meth + 100)/(unmeth + 100))
+ 
+ group<-factor(pData(MsetEx)$Sample_Group)
+ design<-model.matrix(~group)
+ 
+ # Perform initial analysis to empirically identify negative control features 
+ # when not known a priori
+ lFit = lmFit(Mval,design)
+ lFit2 = eBayes(lFit)
+ lTop = topTable(lFit2,coef=2,num=Inf)
+ 
+ # The negative control features should *not* be associated with factor of interest
+ # but *should* be affected by unwanted variation 
+ ctl = rownames(Mval) %in% rownames(lTop[lTop$adj.P.Val > 0.5,])
+ 
+ # Perform RUV adjustment and fit
+ fit = RUVfit(data=Mval, design=design, coef=2, ctl=ctl)
+ fit2 = RUVadj(fit)
+ 
+ # Look at table of top results
+ top = topRUV(fit2)
+ }
Loading required package: minfi
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: lattice
Loading required package: GenomicRanges
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
Loading required package: Biostrings
Loading required package: XVector
Loading required package: bumphunter
Loading required package: foreach
Loading required package: iterators
Loading required package: locfit
locfit 1.5-9.1 	 2013-03-22
Loading required package: minfiData
Loading required package: IlluminaHumanMethylation450kmanifest
Loading required package: IlluminaHumanMethylation450kanno.ilmn12.hg19
Loading required package: limma

Attaching package: 'limma'

The following object is masked from 'package:BiocGenerics':

    plotMA

> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>