Last data update: 2014.03.03
R: Normalization of read depth from whole exome sequencing under...
normalize2 R Documentation
Normalization of read depth from whole exome sequencing under the case-control
setting
Description
Fits a Poisson log-linear model that normalizes the read depth data from whole
exome sequencing. Includes terms that specifically remove biases due to GC
content, exon capture and amplification efficiency, and latent systemic
artifacts. If the WES is designed under case-control setting, CODEX estimates
the exon-wise Poisson latent factor using only the read depths in the control
cohort, and then computes the sample-wise latent factor terms for the case
samples by regression.
Usage
normalize2(Y_qc, gc_qc, K, normal_index)
Arguments
Y_qc
Read depth matrix after quality control procedure returned from
qc
gc_qc
Vector of GC content for each exon after quality control procedure returned
from qc
K
Number of latent Poisson factors. Can be an integer if optimal solution has
been chosen or a vector of integers so that AIC, BIC, and RSS are computed
for choice of optimal k.
normal_index
Indices of control samples.
Value
Yhat
Normalized read depth matrix
AIC
AIC for model selection
BIC
BIC for model selection
RSS
RSS for model selection
K
Number of latent Poisson factors
Author(s)
Yuchao Jiang yuchaoj@wharton.upenn.edu
See Also
qc
,
choiceofK
Examples
Y_qc <- qcObjDemo$Y_qc
gc_qc <- qcObjDemo$gc_qc
normObj <- normalize2(Y_qc, gc_qc, K = 1:5, normal_index = seq(1, 45, 2))
Yhat <- normObj$Yhat
AIC <- normObj$AIC
BIC <- normObj$BIC
RSS <- normObj$RSS
K <- normObj$K
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(CODEX)
Loading required package: Rsamtools
Loading required package: GenomeInfoDb
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomicRanges
Loading required package: Biostrings
Loading required package: XVector
Loading required package: BSgenome.Hsapiens.UCSC.hg19
Loading required package: BSgenome
Loading required package: rtracklayer
Attaching package: 'CODEX'
The following object is masked from 'package:BiocGenerics':
normalize
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/CODEX/normalize2.Rd_%03d_medium.png", width=480, height=480)
> ### Name: normalize2
> ### Title: Normalization of read depth from whole exome sequencing under
> ### the case-control setting
> ### Aliases: normalize2
> ### Keywords: package
>
> ### ** Examples
>
> Y_qc <- qcObjDemo$Y_qc
> gc_qc <- qcObjDemo$gc_qc
> normObj <- normalize2(Y_qc, gc_qc, K = 1:5, normal_index = seq(1, 45, 2))
k = 1
Iteration 1 beta diff =1.18 f(GC) diff =0.000187
hhat diff =0.0012
hhat diff =2.53e-05
Iteration 2 beta diff =0.00618 f(GC) diff =1.38e-06
hhat diff =0.000475
hhat diff =2.33e-05
Iteration 3 beta diff =0.000672 f(GC) diff =4.84e-08
hhat diff =0.000189
hhat diff =1.02e-05
Iteration 4 beta diff =0.000179 f(GC) diff =9.39e-09
hhat diff =7.54e-05
Iteration 5 beta diff =8.15e-05 f(GC) diff =6.59e-09
hhat diff =2.79e-05
Stop at Iteration 5.
AIC1 = 4032024.269
BIC1 = 4031265.058
RSS1 = 728.648
k = 2
Iteration 1 beta diff =1.18 f(GC) diff =0.000187
hhat diff =0.0436
hhat diff =0.0435
hhat diff =2.41e-06
Iteration 2 beta diff =0.00624 f(GC) diff =4.93e-07
hhat diff =0.0435
hhat diff =2.7e-05
Iteration 3 beta diff =0.000903 f(GC) diff =5.01e-08
hhat diff =0.0434
hhat diff =8.99e-06
Iteration 4 beta diff =0.000213 f(GC) diff =1.96e-08
hhat diff =0.0433
hhat diff =2.88e-06
Iteration 5 beta diff =6.11e-05 f(GC) diff =1.3e-08
hhat diff =0.0433
hhat diff =8.29e-07
Stop at Iteration 5.
AIC2 = 4035981.863
BIC2 = 4034463.441
RSS2 = 111.379
k = 3
Iteration 1 beta diff =1.18 f(GC) diff =0.000187
hhat diff =0.0302
hhat diff =0.000191
hhat diff =5.99e-05
Iteration 2 beta diff =0.00771 f(GC) diff =3.78e-07
hhat diff =0.0301
hhat diff =3.98e-05
Iteration 3 beta diff =0.00041 f(GC) diff =6.61e-08
hhat diff =0.03
hhat diff =2.03e-05
Iteration 4 beta diff =0.000188 f(GC) diff =1.84e-08
hhat diff =0.03
hhat diff =2.12e-05
Iteration 5 beta diff =3.98e-05 f(GC) diff =1.1e-08
hhat diff =0.0301
hhat diff =2.44e-05
Stop at Iteration 5.
AIC3 = 4035819.769
BIC3 = 4033542.136
RSS3 = 106.405
k = 4
Iteration 1 beta diff =1.18 f(GC) diff =0.000187
hhat diff =0.0463
hhat diff =0.000227
hhat diff =0.0651
hhat diff =3.18e-05
Iteration 2 beta diff =0.00725 f(GC) diff =3.15e-07
hhat diff =0.0578
hhat diff =0.0219
hhat diff =3.21e-05
Iteration 3 beta diff =0.000519 f(GC) diff =4.3e-08
hhat diff =0.0618
hhat diff =0.0219
hhat diff =2.41e-05
Iteration 4 beta diff =0.000218 f(GC) diff =1.61e-08
hhat diff =0.063
hhat diff =0.0219
hhat diff =1.64e-05
Iteration 5 beta diff =0.000103 f(GC) diff =9.12e-09
hhat diff =0.0413
hhat diff =0.000147
hhat diff =1.47e-05
Stop at Iteration 5.
AIC4 = 4035736.029
BIC4 = 4032699.185
RSS4 = 100.512
k = 5
Iteration 1 beta diff =1.18 f(GC) diff =0.000187
hhat diff =0.0511
hhat diff =0.0178
hhat diff =0.0347
hhat diff =7.38e-05
Iteration 2 beta diff =0.00731 f(GC) diff =2.93e-07
hhat diff =0.00615
hhat diff =0.0176
hhat diff =0.000144
hhat diff =0.0347
hhat diff =3.59e-05
Iteration 3 beta diff =0.000528 f(GC) diff =4.95e-08
hhat diff =0.00385
hhat diff =0.000532
hhat diff =0.0175
hhat diff =6.15e-05
Iteration 4 beta diff =0.000244 f(GC) diff =2.39e-08
hhat diff =0.0171
hhat diff =0.00055
hhat diff =0.0175
hhat diff =5.41e-05
Iteration 5 beta diff =0.000122 f(GC) diff =1.2e-08
hhat diff =0.0179
hhat diff =0.000576
hhat diff =0.0175
hhat diff =4.87e-05
Iteration 6 beta diff =5.34e-05 f(GC) diff =7.4e-09
hhat diff =0.0337
hhat diff =0.0351
hhat diff =0.000118
hhat diff =4.13e-05
Stop at Iteration 6.
AIC5 = 4035662.714
BIC5 = 4031866.659
RSS5 = 94.4
> Yhat <- normObj$Yhat
> AIC <- normObj$AIC
> BIC <- normObj$BIC
> RSS <- normObj$RSS
> K <- normObj$K
>
>
>
>
>
> dev.off()
null device
1
>