SWAN
R Documentation
Subset-quantile Within Array Normalisation for Illumina Infinium HumanMethylation450 BeadChips
Description
Subset-quantile Within Array Normalisation (SWAN) is a within array normalisation method for the Illumina Infinium HumanMethylation450 platform. It allows Infinium I and II type probes on a single array to be normalized together.
Usage
SWAN(data, verbose = FALSE)
Arguments
data
An object of class MethylSet, RGChannelSet or MethyLumiSet.
verbose
Should the function be verbose?
Details
The SWAN method has two parts. First, an average quantile distribution is created using a subset of probes defined to be biologically similar based on the number of CpGs underlying the probe body. This is achieved by randomly selecting N Infinium I and N Infinium II probes for each of 1, 2 and 3 underlying CpGs, where N is the size of the smallest of the 6 sets of Infinium I and II probes with 1, 2 or 3 probe body CpGs. If no probes have previously been filtered out (e.g. sex chromosome probes), N = 11,303. This results in a pool of 3N Infinium I and 3N Infinium II probes. The subset for each probe type is then sorted by increasing intensity. Each of the 3N pairs of observations is subsequently assigned the mean intensity of the two probe types for that row or 'quantile'. This is the standard quantile procedure. Second, the intensities of the remaining probes are adjusted separately for each probe type using linear interpolation between the subset probes.
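The two parts described above can be sketched in base R on simulated data. This is a simplified illustration, not the package's implementation: the function name swan_sketch, the toy vectors and their values are all hypothetical. It also assumes that every probe has 1, 2 or 3 probe body CpGs, it interpolates directly on raw intensities, and it clamps values that fall outside the subset range.

```r
# Toy data for a single array and a single channel (values are made up).
set.seed(1)
n_probes   <- 600
probe_type <- rep(c("I", "II"), each = n_probes / 2)
n_cpg      <- sample(1:3, n_probes, replace = TRUE)
intensity  <- c(rlnorm(n_probes / 2, meanlog = 8),
                rlnorm(n_probes / 2, meanlog = 7))

# Hypothetical helper; assumes n_cpg only contains the values 1, 2 and 3.
swan_sketch <- function(intensity, probe_type, n_cpg) {
  # N is the size of the smallest of the 6 sets (2 probe types x 3 CpG counts).
  N <- min(table(probe_type, n_cpg))

  # Randomly draw N probes of one type for each of 1, 2 and 3 CpGs (3N in total).
  pick <- function(type) {
    unlist(lapply(1:3, function(k) {
      idx <- which(probe_type == type & n_cpg == k)
      idx[sample.int(length(idx), N)]
    }))
  }
  subset_idx <- list(I = pick("I"), II = pick("II"))

  # Sort each subset by increasing intensity and average the two row by row.
  target <- (sort(intensity[subset_idx$I]) + sort(intensity[subset_idx$II])) / 2

  # Adjust all probes of each type by linear interpolation between subset probes.
  normalised <- intensity
  for (type in c("I", "II")) {
    idx <- which(probe_type == type)
    normalised[idx] <- approx(x = sort(intensity[subset_idx[[type]]]),
                              y = target, xout = intensity[idx],
                              rule = 2, ties = "ordered")$y
  }
  list(normalised = normalised, target = target, N = N, subset = subset_idx)
}

res <- swan_sketch(intensity, probe_type, n_cpg)
```

In this sketch the subset probes of both types end up on the same averaged distribution (res$target), and the ordering of probes within each type is preserved.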
Value
An object of class MethylSet
Note
SWAN uses a random subset of probes to perform the within-array normalization. In order to achieve reproducible results, the seed needs to be set using set.seed.
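The effect of the seed can be illustrated without any methylation data. In the minimal sketch below, pick_subset is a hypothetical stand-in for a random probe selection step, not a function from the package; resetting the seed before each call reproduces the same draw.

```r
# Hypothetical stand-in for a random probe selection step.
pick_subset <- function(n_probes, n_pick) sample.int(n_probes, n_pick)

set.seed(100)
first <- pick_subset(1000, 10)

set.seed(100)                    # same seed again
second <- pick_subset(1000, 10)

third <- pick_subset(1000, 10)   # drawn without resetting the seed

identical(first, second)         # TRUE: same seed, same subset
identical(first, third)          # compare with a draw made without resetting
```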
References
J Maksimovic, L Gordon and A Oshlack (2012). SWAN: Subset quantile Within-Array Normalization for Illumina Infinium HumanMethylation450 BeadChips. Genome Biology 13, R44.
See Also
RGChannelSet and MethylSet as well as MethyLumiSet and IlluminaMethylationManifest.
Examples
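if (require(minfi) && require(minfiData)) {
  # Normalise directly from the RGChannelSet
  set.seed(100)
  datSwan1 <- SWAN(RGsetEx)

  # Normalise from a MethylSet, using the same seed
  dat <- preprocessRaw(RGsetEx)
  set.seed(100)
  datSwan2 <- SWAN(dat)

  # Compare the methylated intensities from the two routes
  head(getMeth(datSwan2)) == head(getMeth(datSwan1))
}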
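The comparison in the example prints a logical matrix for the first few probes only. A minimal sketch of collapsing such a comparison to a single value over whole matrices is given below, using toy matrices in place of the getMeth output; the probe and sample names are made up.

```r
# Toy intensity matrices standing in for two getMeth() results.
a <- matrix(c(1.5, 2, 3, 4), nrow = 2,
            dimnames = list(c("cg_a", "cg_b"), c("sample1", "sample2")))
b <- a

a == b                    # element-wise logical matrix, as in the example
all(a == b)               # single TRUE/FALSE over every element
isTRUE(all.equal(a, b))   # equality up to numerical tolerance
```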
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)
> library(missMethyl)
Setting options('download.file.method.GEOquery'='auto')
Setting options('GEOquery.inmemory.gpl'=FALSE)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/missMethyl/SWAN.Rd_%03d_medium.png", width=480, height=480)
> ### Name: SWAN
> ### Title: Subset-quantile Within Array Normalisation for Illumina Infinium
> ### HumanMethylation450 BeadChips
> ### Aliases: SWAN SWAN.default SWAN.MethyLumiSet SWAN.RGChannelSet
>
> ### ** Examples
>
> if (require(minfi) & require(minfiData)) {
+
+ set.seed(100)
+ datSwan1 <- SWAN(RGsetEx)
+
+ dat <- preprocessRaw(RGsetEx)
+ set.seed(100)
+ datSwan2 <- SWAN(dat)
+
+ head(getMeth(datSwan2)) == head(getMeth(datSwan1))
+ }
Loading required package: minfi
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: lattice
Loading required package: GenomicRanges
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
Loading required package: Biostrings
Loading required package: XVector
Loading required package: bumphunter
Loading required package: foreach
Loading required package: iterators
Loading required package: locfit
locfit 1.5-9.1 2013-03-22
Loading required package: minfiData
Loading required package: IlluminaHumanMethylation450kmanifest
Loading required package: IlluminaHumanMethylation450kanno.ilmn12.hg19
[SWAN] RGChannelSet -> MethylSet
[SWAN] Preparing normalization subset
[SWAN] Normalizing methylated channel
[SWAN] Normalizing unmethylated channel
[SWAN] Preparing normalization subset
[SWAN] Normalizing methylated channel
[SWAN] Normalizing unmethylated channel
           5723646052_R02C02 5723646052_R04C01 5723646052_R05C02
cg00050873              TRUE              TRUE              TRUE
cg00212031              TRUE              TRUE              TRUE
cg00213748              TRUE              TRUE              TRUE
cg00214611              TRUE              TRUE              TRUE
cg00455876              TRUE              TRUE              TRUE
cg01707559              TRUE              TRUE              TRUE
           5723646053_R04C02 5723646053_R05C02 5723646053_R06C02
cg00050873              TRUE              TRUE              TRUE
cg00212031              TRUE              TRUE              TRUE
cg00213748              TRUE              TRUE              TRUE
cg00214611              TRUE              TRUE              TRUE
cg00455876              TRUE              TRUE              TRUE
cg01707559              TRUE              TRUE              TRUE
> dev.off()
null device
1
>