R: Remove unwanted variation when testing for differential...
RUVfit
R Documentation
Remove unwanted variation when testing for differential methylation
Description
Provides an interface similar to lmFit from limma to
the RUV2, RUV4, RUVinv and
RUVrinv functions from the ruv package, which
facilitates the removal of unwanted variation in a differential methylation
analysis. A set of negative control variables, as described in the references,
must be specified.
numeric matrix with rows corresponding to the features of interest such
as CpG sites and columns corresponding to samples or arrays.
design
the design matrix of the experiment, with rows corresponding to arrays/samples
and columns to coefficients to be estimated.
coef
integer, column of the design matrix containing the comparison to test for
differential methylation. Default is the last colum of the design matrix.
ctl
logical vector, length == nrow(data). Features that are to be used as
negative control variables are indicated as TRUE, all other features are FALSE.
method
character string, indicates which RUV method should be used. Default method is
RUVinv.
k
integer, required if method is "ruv2" or "ruv4". Indicates the number of
unwanted factors to use. Can be 0.
...
additional arguments that can be passed to RUV2, RUV4,
RUVinv and RUVrinv. See linked function documentation
for details.
Details
This function depends on the ruv and limma packages and is used to
estimate and adjust for unwanted variation in a differential methylation analysis.
Briefly, the unwanted factors W are estimated using negative control
variables. Y is then regressed on the variables X, Z,
and W. For methylation data, the analysis is performed on the M-values,
defined as the log base 2 ratio of the methylated signal to the unmethylated
signal.
Value
An object of class MArrayLM (see MArrayLM-class) containing:
coefficients
The estimated coefficients of the factor(s) of interest.
sigma2
Estimates of the features' variances.
t
t statistics for the factor(s) of interest.
p
P-values for the factor(s) of interest.
multiplier
The constant by which sigma2 must be multiplied in order
to get an estimate of the variance of coefficients
df
The number of residual degrees of freedom.
W
The estimated unwanted factors.
alpha
The estimated coefficients of W.
byx
The coefficients in a regression of Y on X (after both Y and X have
been "adjusted" for Z). Useful for projection plots.
bwx
The coefficients in a regression of W on X (after X has been
"adjusted" for Z). Useful for projection plots.
if(require(minfi) & require(minfiData) & require(limma)) {
# Get methylation data for a 2 group comparison
meth <- getMeth(MsetEx)
unmeth <- getUnmeth(MsetEx)
Mval <- log2((meth + 100)/(unmeth + 100))
group<-factor(pData(MsetEx)$Sample_Group)
design<-model.matrix(~group)
# Perform initial analysis to empirically identify negative control features
# when not known a priori
lFit = lmFit(Mval,design)
lFit2 = eBayes(lFit)
lTop = topTable(lFit2,coef=2,num=Inf)
# The negative control features should *not* be associated with factor of interest
# but *should* be affected by unwanted variation
ctl = rownames(Mval) %in% rownames(lTop[lTop$adj.P.Val > 0.5,])
# Perform RUV adjustment and fit
fit = RUVfit(data=Mval, design=design, coef=2, ctl=ctl)
fit2 = RUVadj(fit)
# Look at table of top results
top = topRUV(fit2)
}
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(missMethyl)
Setting options('download.file.method.GEOquery'='auto')
Setting options('GEOquery.inmemory.gpl'=FALSE)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/missMethyl/RUVfit.Rd_%03d_medium.png", width=480, height=480)
> ### Name: RUVfit
> ### Title: Remove unwanted variation when testing for differential
> ### methylation
> ### Aliases: RUVfit
>
> ### ** Examples
>
> if(require(minfi) & require(minfiData) & require(limma)) {
+
+ # Get methylation data for a 2 group comparison
+ meth <- getMeth(MsetEx)
+ unmeth <- getUnmeth(MsetEx)
+ Mval <- log2((meth + 100)/(unmeth + 100))
+
+ group<-factor(pData(MsetEx)$Sample_Group)
+ design<-model.matrix(~group)
+
+ # Perform initial analysis to empirically identify negative control features
+ # when not known a priori
+ lFit = lmFit(Mval,design)
+ lFit2 = eBayes(lFit)
+ lTop = topTable(lFit2,coef=2,num=Inf)
+
+ # The negative control features should *not* be associated with factor of interest
+ # but *should* be affected by unwanted variation
+ ctl = rownames(Mval) %in% rownames(lTop[lTop$adj.P.Val > 0.5,])
+
+ # Perform RUV adjustment and fit
+ fit = RUVfit(data=Mval, design=design, coef=2, ctl=ctl)
+ fit2 = RUVadj(fit)
+
+ # Look at table of top results
+ top = topRUV(fit2)
+ }
Loading required package: minfi
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: lattice
Loading required package: GenomicRanges
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
Loading required package: Biostrings
Loading required package: XVector
Loading required package: bumphunter
Loading required package: foreach
Loading required package: iterators
Loading required package: locfit
locfit 1.5-9.1 2013-03-22
Loading required package: minfiData
Loading required package: IlluminaHumanMethylation450kmanifest
Loading required package: IlluminaHumanMethylation450kanno.ilmn12.hg19
Loading required package: limma
Attaching package: 'limma'
The following object is masked from 'package:BiocGenerics':
plotMA
>
>
>
>
>
> dev.off()
null device
1
>