This function adjusts log2ratio by GC content using LOESS.
Usage
GC.adjust(data, gc, maxNumDataPoints = 10000)
Arguments
data
A data frame generated by cnv.data
or snp.cnv.data.
gc
A data frame containing three columns: chr, position
and GC. See the example data below for details.
maxNumDataPoints
The maximum number of data points used for the LOESS fit.
Default is 10000.
Details
The method for GC content adjustment was adopted from CNAnorm (Gusnanto et al. 2012).
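As a rough illustration of what a LOESS-based GC adjustment can look like, the following is a minimal sketch on simulated data. It is not the saasCNV or CNAnorm source code. The function name gc.adjust.sketch, the simulated quadratic GC bias, the random subsampling, and the re-centering on the median are all assumptions made for this example. Only the maxNumDataPoints argument name and the log2ratio / log2ratio.woGCAdj column names come from this help page.

```r
## Hypothetical sketch of a LOESS-based GC adjustment (not the package code).
set.seed(1)

## Simulated input: log2ratio with an artificial quadratic dependence on GC.
n <- 5000
gc.content <- runif(n, min = 0.3, max = 0.7)
toy <- data.frame(
  GC        = gc.content,
  log2ratio = -4 * (gc.content - 0.5)^2 + rnorm(n, sd = 0.1)
)

gc.adjust.sketch <- function(d, maxNumDataPoints = 10000) {
  ## Assumption: fit on a random subsample when there are too many rows.
  idx <- if (nrow(d) > maxNumDataPoints) {
    sample(nrow(d), maxNumDataPoints)
  } else {
    seq_len(nrow(d))
  }
  ## surface = "direct" lets predict() evaluate outside the subsample's GC range.
  fit <- loess(log2ratio ~ GC, data = d[idx, ],
               control = loess.control(surface = "direct"))
  trend <- predict(fit, newdata = d)

  ## Keep the unadjusted values, then remove the fitted GC trend.
  ## Re-centering on the median is an illustrative choice.
  d$log2ratio.woGCAdj <- d$log2ratio
  d$log2ratio <- d$log2ratio - trend + median(d$log2ratio, na.rm = TRUE)
  d
}

adj <- gc.adjust.sketch(toy, maxNumDataPoints = 2000)

## Correlation with the simulated GC bias before and after adjustment
cor(toy$log2ratio, (toy$GC - 0.5)^2)
cor(adj$log2ratio, (adj$GC - 0.5)^2)
```

In this toy setting the second correlation should be much closer to zero than the first, which is the intended effect of removing a smooth GC trend.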
Value
A data frame containing the log2ratio (GC adjusted) and log2mBAF values
for each probe site in the same format as generated by cnv.data
or snp.cnv.data. The original log2ratio is renamed to
log2ratio.woGCAdj. The GC-adjusted log2ratio is named log2ratio.
Note
This function is optional in the analysis pipeline and is currently in beta.
Author(s)
Zhongyang Zhang <zhongyang.zhang@mssm.edu>
References
Gusnanto A, Wood HM, Pawitan Y, Rabbitts P, Berri S (2012) Correcting for
cancer genome size and tumour cell content enables better estimation of
copy number alterations from next-generation sequence data.
Bioinformatics, 28:40-47.
See Also
cnv.data, snp.cnv.data
Examples
## CNV data generated by cnv.data
data(seq.data)
head(seq.data)
## Not run:
## an example GC content file
url <- "https://zhangz05.u.hpc.mssm.edu/saasCNV/data/GC_1kb_hg19.txt.gz"
tryCatch({
  download.file(url = url, destfile = "GC_1kb_hg19.txt.gz")
}, error = function(e) {
  ## fall back to curl if the default download method fails
  download.file(url = url, destfile = "GC_1kb_hg19.txt.gz", method = "curl")
})
## If download.file fails to download the data, please manually download it from the url.
gc <- read.delim(file = "GC_1kb_hg19.txt.gz", as.is=TRUE)
head(gc)
## GC content adjustment
seq.data <- GC.adjust(data = seq.data, gc = gc, maxNumDataPoints = 10000)
head(seq.data)
## End(Not run)