R: Detection of subclonal SNVs in deep sequencing experiments
deepSNV-package
R Documentation
Detection of subclonal SNVs in deep sequencing experiments
Description
Detection of subclonal SNVs in deep sequencing experiments
Details
This package provides algorithms for detecting subclonal single nucleotide variants (SNVs) and estimating their frequencies from ultra-deep sequencing data.
It retrieves the nucleotide counts at each position and on each strand from two .bam files and tests for differences between the two experiments with a likelihood ratio test, using either a binomial or an overdispersed beta-binomial model.
The statistic can be tuned across genomic sites by a shared Dirichlet prior, and the package provides procedures for normalizing sequencing data from different runs.
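As a rough illustration of the test described above, the following base-R sketch runs a one-sided binomial likelihood ratio test on each strand and then combines the two strand-wise p-values with Fisher's method (the "Combine Method: fisher" reported by show()). The helper names binom_lrt and fisher_combine are hypothetical, and halving the chi-squared tail probability for the one-sided alternative is an assumption rather than the package's own code. The counts are copied from position 3140 of the example output.

```r
# Hypothetical helper: one-sided binomial likelihood ratio test on one strand.
# x_*: variant read counts, n_*: coverage, for the test and control samples.
binom_lrt <- function(x_test, n_test, x_ctrl, n_ctrl) {
  f_test <- x_test / n_test
  f_ctrl <- x_ctrl / n_ctrl
  f_pool <- (x_test + x_ctrl) / (n_test + n_ctrl)

  # Log-likelihood under H0 (one shared rate) and H1 (separate rates)
  ll0 <- dbinom(x_test, n_test, f_pool, log = TRUE) +
    dbinom(x_ctrl, n_ctrl, f_pool, log = TRUE)
  ll1 <- dbinom(x_test, n_test, f_test, log = TRUE) +
    dbinom(x_ctrl, n_ctrl, f_ctrl, log = TRUE)
  D <- max(0, 2 * (ll1 - ll0))  # clamp tiny negative rounding errors

  # Assumption: alternative "greater" -> p = 1 if the test frequency is lower,
  # otherwise half of the chi-squared (df = 1) tail probability.
  if (f_test < f_ctrl) return(1)
  pchisq(D, df = 1, lower.tail = FALSE) / 2
}

# Fisher's method for two independent p-values (chi-squared with 4 df)
fisher_combine <- function(p_fw, p_bw) {
  pchisq(-2 * (log(p_fw) + log(p_bw)), df = 4, lower.tail = FALSE)
}

# Counts for A>G at position 3140, copied from the summary(ex) output
p_fw  <- binom_lrt(60, 1346, 0, 2775)  # forward strand
p_bw  <- binom_lrt(38, 664, 0, 986)    # backward strand
p_snv <- fisher_combine(p_fw, p_bw)

# No variant reads on either strand gives 0.5 per strand
p_null <- fisher_combine(binom_lrt(0, 1000, 0, 1000),
                         binom_lrt(0, 1000, 0, 1000))
print(round(p_null, 7))  # 0.5965736
```

With no variant reads on either strand, each per-strand p-value is 0.5 and their Fisher combination is 0.5965736, the value that fills most of the P-Values matrix in the Results; combining 0.5 with 1 gives 0.8465736. The sketch is not guaranteed to reproduce the package's p-values for real SNVs exactly.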
References
Gerstung M, Beisel C, Rechsteiner M, Wild P, Schraml P, Moch H, and Beerenwinkel N. Reliable detection of subclonal single-nucleotide variants in tumour cell populations. Nat Commun 3:811 (2012). DOI:10.1038/ncomms1814.
See Also
deepSNV
Examples
## Short example with 2 SNVs at frequency ~10%
regions <- data.frame(chr = "B.FR.83.HXB2_LAI_IIIB_BRU_K034", start = 3120, stop = 3140)
ex <- deepSNV(
  test    = system.file("extdata", "test.bam", package = "deepSNV"),
  control = system.file("extdata", "control.bam", package = "deepSNV"),
  regions = regions,
  q       = 10
)
show(ex) # show method
plot(ex) # scatter plot
summary(ex) # summary with significant SNVs
ex[1:3,] # subsetting the first three genomic positions
tail(test(ex, total=TRUE)) # retrieve the test counts on both strands
tail(control(ex, total=TRUE))
## Not run: Full example with ~100 SNVs. Requires an internet connection, but try it yourself.
# regions <- data.frame(chr="B.FR.83.HXB2_LAI_IIIB_BRU_K034", start = 2074, stop=3585)
# HIVmix <- deepSNV(test = "http://www.bsse.ethz.ch/cbg/software/deepSNV/data/test.bam", control = "http://www.bsse.ethz.ch/cbg/software/deepSNV/data/control.bam", regions=regions, q=10)
data(HIVmix) # load the precomputed data set instead
show(HIVmix)
plot(HIVmix)
head(summary(HIVmix))
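The columns that summary() reports can be cross-checked with base R. The sketch below uses numbers copied from the summary(ex) output in the Results section. Two points are inferences from those numbers rather than documented statements: that freq.var is the pooled test frequency minus the pooled control frequency, and that p.val is a Bonferroni correction of raw.p.val over 21 positions times 4 non-reference characters.

```r
# Values copied from the summary(ex) output (positions 3125 and 3140)
snv <- data.frame(
  pos         = c(3125, 3140),
  n.tst.fw    = c(58, 60),
  cov.tst.fw  = c(1461, 1346),
  n.tst.bw    = c(32, 38),
  cov.tst.bw  = c(862, 664),
  n.ctrl.fw   = c(1, 0),
  cov.ctrl.fw = c(3066, 2775),
  n.ctrl.bw   = c(1, 0),
  cov.ctrl.bw = c(1257, 986),
  raw.p.val   = c(6.404014e-39, 1.747374e-45)
)

# Assumed definition: variant frequency in test minus frequency in control,
# each pooled over the forward and backward strands
freq_test <- with(snv, (n.tst.fw + n.tst.bw) / (cov.tst.fw + cov.tst.bw))
freq_ctrl <- with(snv, (n.ctrl.fw + n.ctrl.bw) / (cov.ctrl.fw + cov.ctrl.bw))
freq_var  <- freq_test - freq_ctrl

# Assumed correction: Bonferroni over 21 positions x 4 non-reference characters
n_tests <- 21 * 4
p_adj   <- p.adjust(snv$raw.p.val, method = "bonferroni", n = n_tests)

print(data.frame(pos = snv$pos, freq.var = freq_var, p.val = p_adj))
```

The same arithmetic applied to the HIVmix output (1512 positions, so 6048 tests) turns the raw p-value 6.306257e-09 at position 2127 into the reported 3.814024e-05.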
Results
> library(deepSNV)
Loading required package: parallel
Loading required package: Rhtslib
Rhtslib htslib version 1.1
Loading required package: IRanges
Loading required package: BiocGenerics
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: Biostrings
Loading required package: XVector
Loading required package: VGAM
Loading required package: splines
Loading required package: VariantAnnotation
Loading required package: Rsamtools
Attaching package: 'VariantAnnotation'
The following object is masked from 'package:base':
tabulate
Attaching package: 'deepSNV'
The following objects are masked from 'package:VGAM':
dbetabinom, pbetabinom
The following object is masked from 'package:BiocGenerics':
normalize
> ### Name: deepSNV-package
> ### Title: Detection of subclonal SNVs in deep sequencing experiments
> ### Aliases: deepSNV-package
> ### Keywords: package
>
> ### ** Examples
>
> ## Short example with 2 SNVs at frequency ~10%
> regions <- data.frame(chr="B.FR.83.HXB2_LAI_IIIB_BRU_K034", start = 3120, stop=3140)
> ex <- deepSNV(test = system.file("extdata", "test.bam", package="deepSNV"), control = system.file("extdata", "control.bam", package="deepSNV"), regions=regions, q=10)
> show(ex) # show method
Data: 21 positions x 10 characters
Model: bin
Alternative: greater
Combine Method: fisher
P-Values:
             A            T          C         G         -
[1,] 0.5965736           NA 0.59657359 0.5965736 0.5965736
[2,] 0.4378589 8.465736e-01         NA 0.5965736 0.5965736
[3,] 0.5965736           NA 0.07559581 0.8465736 0.5965736
[4,] 0.3962578 5.965736e-01 0.59657359        NA 0.5965736
[5,]        NA 4.369021e-01 0.59657359 0.8465736 0.5965736
[6,] 0.8177195 6.404014e-39         NA 0.5965736 0.5965736
...
               A         T          C            G         -
[16,] 0.47970559 0.5965736 0.84657359           NA 0.5965736
[17,] 0.05350392 0.5965736 0.84657359           NA 0.5965736
[18,] 1.00000000 0.8465736 1.00000000           NA 0.5965736
[19,] 0.24915660 0.8465736         NA 1.011605e-01 0.5965736
[20,]         NA 0.4422493 0.07253975 4.351952e-02 0.5965736
[21,]         NA 0.8465736 0.84657359 1.747374e-45 0.5965736
> plot(ex) # scatter plot
> summary(ex) # summary with significant SNVs
                             chr  pos ref var        p.val   freq.var
1 B.FR.83.HXB2_LAI_IIIB_BRU_K034 3125   C   T 5.379372e-37 0.03828036
2 B.FR.83.HXB2_LAI_IIIB_BRU_K034 3140   A   G 1.467794e-43 0.04875622
  sigma2.freq.var n.tst.fw cov.tst.fw n.tst.bw cov.tst.bw n.ctrl.fw cov.ctrl.fw
1    1.678502e-05       58       1461       32        862         1        3066
2    2.425683e-05       60       1346       38        664         0        2775
  n.ctrl.bw cov.ctrl.bw    raw.p.val
1         1        1257 6.404014e-39
2         0         986 1.747374e-45
> ex[1:3,] # subsetting the first three genomic positions
Data: 3 positions x 10 characters
Model: bin
Alternative: greater
Combine Method: fisher
P-Values:
             A         T          C         G         -
[1,] 0.5965736        NA 0.59657359 0.5965736 0.5965736
[2,] 0.4378589 0.8465736         NA 0.5965736 0.5965736
[3,] 0.5965736        NA 0.07559581 0.8465736 0.5965736
> tail(test(ex, total=TRUE)) # retrieve the test counts on both strands
         A T    C    G -
[16,]    3 0    0 2172 0
[17,]    4 0    0 2140 0
[18,]    0 0    1 2116 6
[19,]    6 0 2073   10 0
[20,] 2072 1    8    3 0
[21,] 1911 0    1   98 0
> tail(control(ex, total=TRUE))
         A T    C    G -
[16,]    2 0    1 4043 0
[17,]    0 0    1 3998 0
[18,]    3 1    4 3945 8
[19,]    5 1 3908    8 0
[20,] 3897 0    5    0 0
[21,] 3757 1    3    0 0
>
> ## Not run: Full example with ~ 100 SNVs. Requires an internet connection, but try yourself.
> # regions <- data.frame(chr="B.FR.83.HXB2_LAI_IIIB_BRU_K034", start = 2074, stop=3585)
> # HIVmix <- deepSNV(test = "http://www.bsse.ethz.ch/cbg/software/deepSNV/data/test.bam", control = "http://www.bsse.ethz.ch/cbg/software/deepSNV/data/control.bam", regions=regions, q=10)
> data(HIVmix) # attach data instead..
> show(HIVmix)
Data: 1512 positions x 10 characters
Model: bin
Alternative: greater
Combine Method: fisher
P-Values:
             A         T         C         G         -
[1,]        NA 0.5965736 0.5965736 0.5965736 0.5965736
[2,] 0.5965736 0.5965736 0.5965736        NA 0.5965736
[3,]        NA 0.5965736 0.5965736 0.5965736 0.5965736
[4,] 0.5965736 0.5965736        NA 0.5965736 0.5965736
[5,]        NA 0.5965736 0.5965736 0.5965736 0.5965736
[6,] 0.5965736 0.5965736 0.5965736        NA 0.5965736
...
                A         T         C         G         -
[1507,]        NA 0.5965736 0.5965736 0.5965736 0.5965736
[1508,] 0.8465736 0.5965736 0.8465736        NA 0.5965736
[1509,] 0.5965736 0.5965736        NA 0.5965736 0.5965736
[1510,] 0.8465736 0.5965736        NA 0.5965736 0.5965736
[1511,]        NA 0.5965736 0.5965736 0.4737885 0.5965736
[1512,] 1.0000000        NA 0.8465736 0.8465736 0.8465736
> plot(HIVmix)
> head(summary(HIVmix))
                              chr  pos ref var        p.val   freq.var
1  B.FR.83.HXB2_LAI_IIIB_BRU_K034 2127   G   A 3.814024e-05 0.02903526
51 B.FR.83.HXB2_LAI_IIIB_BRU_K034 2130   T   C 1.423636e-07 0.03076923
70 B.FR.83.HXB2_LAI_IIIB_BRU_K034 2139   A   G 5.815022e-09 0.03362573
71 B.FR.83.HXB2_LAI_IIIB_BRU_K034 2141   A   G 4.271206e-09 0.03333333
52 B.FR.83.HXB2_LAI_IIIB_BRU_K034 2150   A   C 9.543121e-04 0.01763908
2  B.FR.83.HXB2_LAI_IIIB_BRU_K034 2151   G   A 4.527563e-08 0.02815013
   sigma2.freq.var n.tst.fw cov.tst.fw n.tst.bw cov.tst.bw n.ctrl.fw
1     4.892000e-05       17        581        2         47         2
51    4.733728e-05       16        597        4         53         0
70    4.916043e-05       16        599        7         85         0
71    4.830918e-05       16        599        7         91         0
52    2.393362e-05       10        609        3        128         0
2     3.773476e-05       16        614        5        132         0
   cov.ctrl.fw n.ctrl.bw cov.ctrl.bw    raw.p.val
1         1537         0         103 6.306257e-09
51        1534         0         104 2.353895e-11
70        1535         0         158 9.614784e-13
71        1535         0         181 7.062179e-13
52        1538         0         283 1.577897e-07
2         1539         0         287 7.486050e-12
>