Last data update: 2014.03.03

R: Detection of subclonal SNVs in deep sequencing experiments
deepSNV-package        R Documentation

Detection of subclonal SNVs in deep sequencing experiments

Description

Detection of subclonal SNVs in deep sequencing experiments

Details

This package provides algorithms for detecting subclonal single nucleotide variants (SNVs) and their frequencies from ultra-deep sequencing data. It retrieves the nucleotide counts at each position and on each strand from two .bam files and tests for differences between the two experiments with a likelihood ratio test using either a binomial or an overdispersed beta-binomial model. The statistic can be tuned across genomic sites by a shared Dirichlet prior, and the package provides procedures for normalizing sequencing data from different runs.
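
As a rough illustration of the kind of test described above, the following base-R sketch runs a likelihood ratio test for a single strand at a single position, comparing a binomial model with one shared variant rate against a model with separate rates for test and control. It is not the package's implementation: the function name `lrt_binomial` is hypothetical, the two-sided chi-squared reference distribution with one degree of freedom is the textbook choice, and the package's one-sided alternative, strand combination, and beta-binomial option are all left out. The counts are the forward-strand values reported for position 3125 in the Results section.

```r
# Hypothetical helper (not part of deepSNV): likelihood ratio test for a
# difference in variant rate between a test and a control experiment.
# x = number of reads carrying the variant, n = coverage.
lrt_binomial <- function(x_test, n_test, x_ctrl, n_ctrl) {
  # Null model: one shared variant rate for both experiments.
  p0 <- (x_test + x_ctrl) / (n_test + n_ctrl)
  ll0 <- dbinom(x_test, n_test, p0, log = TRUE) +
         dbinom(x_ctrl, n_ctrl, p0, log = TRUE)
  # Alternative model: each experiment has its own rate (the MLEs).
  ll1 <- dbinom(x_test, n_test, x_test / n_test, log = TRUE) +
         dbinom(x_ctrl, n_ctrl, x_ctrl / n_ctrl, log = TRUE)
  stat <- 2 * (ll1 - ll0)
  # Two-sided p-value from the asymptotic chi-squared(1) distribution.
  p <- pchisq(stat, df = 1, lower.tail = FALSE)
  list(statistic = stat, p.value = p)
}

# Forward strand, position 3125: 58 of 1461 test reads and
# 1 of 3066 control reads carry the variant.
res <- lrt_binomial(58, 1461, 1, 3066)
print(res)
```

Because this sketch is two-sided and covers one strand only, its p-value is not expected to match the values the package reports.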

Author(s)

Moritz Gerstung, Wellcome Trust Sanger Institute, moritz.gerstung@sanger.ac.uk

References

Gerstung M, Beisel C, Rechsteiner M, Wild P, Schraml P, Moch H, and Beerenwinkel N. Reliable detection of subclonal single-nucleotide variants in tumour cell populations. Nat Commun 3:811 (2012). DOI:10.1038/ncomms1814.

See Also

deepSNV

Examples

## Short example with 2 SNVs at frequency ~10%
regions <- data.frame(chr = "B.FR.83.HXB2_LAI_IIIB_BRU_K034", start = 3120, stop = 3140)
ex <- deepSNV(test = system.file("extdata", "test.bam", package = "deepSNV"),
              control = system.file("extdata", "control.bam", package = "deepSNV"),
              regions = regions, q = 10)
show(ex)   # show method
plot(ex)   # scatter plot
summary(ex)   # summary with significant SNVs
ex[1:3,]   # subsetting the first three genomic positions
tail(test(ex, total=TRUE))   # retrieve the test counts on both strands
tail(control(ex, total=TRUE))

## Not run: Full example with ~100 SNVs. Requires an internet connection, but try it yourself.
# regions <- data.frame(chr = "B.FR.83.HXB2_LAI_IIIB_BRU_K034", start = 2074, stop = 3585)
# HIVmix <- deepSNV(test = "http://www.bsse.ethz.ch/cbg/software/deepSNV/data/test.bam",
#                   control = "http://www.bsse.ethz.ch/cbg/software/deepSNV/data/control.bam",
#                   regions = regions, q = 10)
data(HIVmix) # attach data instead..
show(HIVmix)
plot(HIVmix)
head(summary(HIVmix))
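
When reading the table returned by `summary(ex)`, it may help to see how its columns relate to each other. The sketch below uses the counts printed for position 3125 in the Results section and suggests that `freq.var` is the variant frequency pooled over both strands in the test experiment minus the same quantity in the control. This relationship is inferred from the printed numbers, not taken from the package source, and the variable names `freq_test`, `freq_ctrl` and `freq_diff` are made up for the illustration.

```r
# Counts for position 3125 (C > T), copied from the summary(ex) output.
n.tst.fw  <- 58; cov.tst.fw  <- 1461
n.tst.bw  <- 32; cov.tst.bw  <- 862
n.ctrl.fw <- 1;  cov.ctrl.fw <- 3066
n.ctrl.bw <- 1;  cov.ctrl.bw <- 1257

# Variant frequency pooled over both strands in each experiment.
freq_test <- (n.tst.fw + n.tst.bw) / (cov.tst.fw + cov.tst.bw)
freq_ctrl <- (n.ctrl.fw + n.ctrl.bw) / (cov.ctrl.fw + cov.ctrl.bw)

# Difference between the experiments; compare with freq.var = 0.03828036.
freq_diff <- freq_test - freq_ctrl
print(signif(freq_diff, 7))
```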

Results


> library(deepSNV)
Loading required package: parallel
Loading required package: Rhtslib
Rhtslib htslib version 1.1
Loading required package: IRanges
Loading required package: BiocGenerics

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: Biostrings
Loading required package: XVector
Loading required package: VGAM
Loading required package: splines
Loading required package: VariantAnnotation
Loading required package: Rsamtools

Attaching package: 'VariantAnnotation'

The following object is masked from 'package:base':

    tabulate


Attaching package: 'deepSNV'

The following objects are masked from 'package:VGAM':

    dbetabinom, pbetabinom

The following object is masked from 'package:BiocGenerics':

    normalize

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/deepSNV/deepSNV-package.Rd_%03d_medium.png", width=480, height=480)
> ### Name: deepSNV-package
> ### Title: Detection of subclonal SNVs in deep sequencing experiments
> ### Aliases: deepSNV-package
> ### Keywords: package
> 
> ### ** Examples
> 
> ## Short example with 2 SNVs at frequency ~10%
> regions <- data.frame(chr="B.FR.83.HXB2_LAI_IIIB_BRU_K034", start = 3120, stop=3140)
> ex <- deepSNV(test = system.file("extdata", "test.bam", package="deepSNV"), control = system.file("extdata", "control.bam", package="deepSNV"), regions=regions, q=10)
> show(ex)   # show method
Data:  21 positions x  10 characters
Model:  bin 
Alternative:  greater 
Combine Method:  fisher 
P-Values:
             A            T          C         G         -
[1,] 0.5965736           NA 0.59657359 0.5965736 0.5965736
[2,] 0.4378589 8.465736e-01         NA 0.5965736 0.5965736
[3,] 0.5965736           NA 0.07559581 0.8465736 0.5965736
[4,] 0.3962578 5.965736e-01 0.59657359        NA 0.5965736
[5,]        NA 4.369021e-01 0.59657359 0.8465736 0.5965736
[6,] 0.8177195 6.404014e-39         NA 0.5965736 0.5965736
...
               A         T          C            G         -
[16,] 0.47970559 0.5965736 0.84657359           NA 0.5965736
[17,] 0.05350392 0.5965736 0.84657359           NA 0.5965736
[18,] 1.00000000 0.8465736 1.00000000           NA 0.5965736
[19,] 0.24915660 0.8465736         NA 1.011605e-01 0.5965736
[20,]         NA 0.4422493 0.07253975 4.351952e-02 0.5965736
[21,]         NA 0.8465736 0.84657359 1.747374e-45 0.5965736
> plot(ex)   # scatter plot
> summary(ex)   # summary with significant SNVs
                             chr  pos ref var        p.val   freq.var
1 B.FR.83.HXB2_LAI_IIIB_BRU_K034 3125   C   T 5.379372e-37 0.03828036
2 B.FR.83.HXB2_LAI_IIIB_BRU_K034 3140   A   G 1.467794e-43 0.04875622
  sigma2.freq.var n.tst.fw cov.tst.fw n.tst.bw cov.tst.bw n.ctrl.fw cov.ctrl.fw
1    1.678502e-05       58       1461       32        862         1        3066
2    2.425683e-05       60       1346       38        664         0        2775
  n.ctrl.bw cov.ctrl.bw    raw.p.val
1         1        1257 6.404014e-39
2         0         986 1.747374e-45
> ex[1:3,]   # subsetting the first three genomic positions
Data:  3 positions x  10 characters
Model:  bin 
Alternative:  greater 
Combine Method:  fisher 
P-Values:
             A         T          C         G         -
[1,] 0.5965736        NA 0.59657359 0.5965736 0.5965736
[2,] 0.4378589 0.8465736         NA 0.5965736 0.5965736
[3,] 0.5965736        NA 0.07559581 0.8465736 0.5965736
> tail(test(ex, total=TRUE))   # retrieve the test counts on both strands
         A T    C    G -
[16,]    3 0    0 2172 0
[17,]    4 0    0 2140 0
[18,]    0 0    1 2116 6
[19,]    6 0 2073   10 0
[20,] 2072 1    8    3 0
[21,] 1911 0    1   98 0
> tail(control(ex, total=TRUE))
         A T    C    G -
[16,]    2 0    1 4043 0
[17,]    0 0    1 3998 0
[18,]    3 1    4 3945 8
[19,]    5 1 3908    8 0
[20,] 3897 0    5    0 0
[21,] 3757 1    3    0 0
> 
> ## Not run: Full example with ~ 100 SNVs. Requires an internet connection, but try yourself.
> # regions <- data.frame(chr="B.FR.83.HXB2_LAI_IIIB_BRU_K034", start = 2074, stop=3585)
> # HIVmix <- deepSNV(test = "http://www.bsse.ethz.ch/cbg/software/deepSNV/data/test.bam", control = "http://www.bsse.ethz.ch/cbg/software/deepSNV/data/control.bam", regions=regions, q=10)
> data(HIVmix) # attach data instead..
> show(HIVmix)
Data:  1512 positions x  10 characters
Model:  bin 
Alternative:  greater 
Combine Method:  fisher 
P-Values:
             A         T         C         G         -
[1,]        NA 0.5965736 0.5965736 0.5965736 0.5965736
[2,] 0.5965736 0.5965736 0.5965736        NA 0.5965736
[3,]        NA 0.5965736 0.5965736 0.5965736 0.5965736
[4,] 0.5965736 0.5965736        NA 0.5965736 0.5965736
[5,]        NA 0.5965736 0.5965736 0.5965736 0.5965736
[6,] 0.5965736 0.5965736 0.5965736        NA 0.5965736
...
                A         T         C         G         -
[1507,]        NA 0.5965736 0.5965736 0.5965736 0.5965736
[1508,] 0.8465736 0.5965736 0.8465736        NA 0.5965736
[1509,] 0.5965736 0.5965736        NA 0.5965736 0.5965736
[1510,] 0.8465736 0.5965736        NA 0.5965736 0.5965736
[1511,]        NA 0.5965736 0.5965736 0.4737885 0.5965736
[1512,] 1.0000000        NA 0.8465736 0.8465736 0.8465736
> plot(HIVmix)
> head(summary(HIVmix))
                              chr  pos ref var        p.val   freq.var
1  B.FR.83.HXB2_LAI_IIIB_BRU_K034 2127   G   A 3.814024e-05 0.02903526
51 B.FR.83.HXB2_LAI_IIIB_BRU_K034 2130   T   C 1.423636e-07 0.03076923
70 B.FR.83.HXB2_LAI_IIIB_BRU_K034 2139   A   G 5.815022e-09 0.03362573
71 B.FR.83.HXB2_LAI_IIIB_BRU_K034 2141   A   G 4.271206e-09 0.03333333
52 B.FR.83.HXB2_LAI_IIIB_BRU_K034 2150   A   C 9.543121e-04 0.01763908
2  B.FR.83.HXB2_LAI_IIIB_BRU_K034 2151   G   A 4.527563e-08 0.02815013
   sigma2.freq.var n.tst.fw cov.tst.fw n.tst.bw cov.tst.bw n.ctrl.fw
1     4.892000e-05       17        581        2         47         2
51    4.733728e-05       16        597        4         53         0
70    4.916043e-05       16        599        7         85         0
71    4.830918e-05       16        599        7         91         0
52    2.393362e-05       10        609        3        128         0
2     3.773476e-05       16        614        5        132         0
   cov.ctrl.fw n.ctrl.bw cov.ctrl.bw    raw.p.val
1         1537         0         103 6.306257e-09
51        1534         0         104 2.353895e-11
70        1535         0         158 9.614784e-13
71        1535         0         181 7.062179e-13
52        1538         0         283 1.577897e-07
2         1539         0         287 7.486050e-12
> 
> dev.off()
null device 
          1 
>
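
The `Combine Method:  fisher` line and the `p.val` and `raw.p.val` columns in the output above can be checked by hand. The sketch below applies Fisher's method to two strand-wise p-values and then a Bonferroni correction. The helper name `fisher_combine` is hypothetical. Two readings of the output are assumptions inferred from the printed numbers rather than from the package source: that the recurring value 0.5965736 arises from two strand-wise p-values of 0.5, and that `p.val` is a Bonferroni adjustment over 84 tests (21 positions times 4 non-reference characters).

```r
# Hypothetical helper: Fisher's method for two independent p-values.
# Under the null, -2 * (log(p1) + log(p2)) follows a chi-squared
# distribution with 4 degrees of freedom.
fisher_combine <- function(p1, p2) {
  pchisq(-2 * (log(p1) + log(p2)), df = 4, lower.tail = FALSE)
}

# Two strand-wise p-values of 0.5 give the value seen throughout show(ex).
print(fisher_combine(0.5, 0.5))   # 0.5965736
# One strand at 0.5 and the other at 1 gives the second recurring value.
print(fisher_combine(0.5, 1))     # 0.8465736

# Bonferroni adjustment of the raw p-values from summary(ex), assuming
# 21 positions x 4 non-reference characters = 84 tests.
raw <- c(6.404014e-39, 1.747374e-45)
adj <- p.adjust(raw, method = "bonferroni", n = 84)
print(adj)   # compare with p.val: 5.379372e-37 and 1.467794e-43
```

The same scaling appears to hold for `HIVmix`, where 1512 positions times 4 characters gives 6048 tests and 6.306257e-09 times 6048 is about 3.814e-05, the first `p.val` in that table.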