Last data update: 2014.03.03

R: Paired variant calling using fisher tests
callVariantsFisher    R Documentation

Paired variant calling using fisher tests

Description

This function implements a simple paired (case versus control) variant calling strategy based on Fisher's exact test.

Usage

callVariantsPairedFisher(data, sampledata, pValCutOff = 0.05, minCoverage = 5, mergeDels = TRUE, mergeAggregator = mean)

Arguments

data

A list with elements Counts (a 4d integer array of size [1:12, 1:2, 1:k, 1:n]), Coverages (a 3d integer array of size [1:2, 1:k, 1:n]) and Reference (a 1d integer vector of size [1:n]), where k is the number of samples and n the number of positions; see Details.

sampledata

A data.frame with k rows (one for each sample) and columns Type, Column and (Group or Patient). The tally file should contain this information as a group attribute, see getSampleData for an example.

pValCutOff

Maximum allowed p-value for the Fisher test on the contingency matrix matrix(c(caseCounts, caseCoverage, controlCounts, controlCoverage), nrow=2).

minCoverage

Minimum coverage required in both samples (case and control) for a call to be made.

mergeDels

Boolean flag specifying whether adjacent deletions should be merged

mergeAggregator

Which function to use for aggregating the values associated with adjacent deletions that are being merged
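
To illustrate what mergeDels and mergeAggregator control, the following is a minimal sketch of merging adjacent single-base deletions in plain R. It is not the package's internal code: the toy data.frame dels, the helper variable runId and the choice of caseCount as the aggregated value are assumptions made for this illustration only.

```r
# Hypothetical single-base deletion calls (toy values, not h5vc output).
dels <- data.frame(
  Start     = c(100, 101, 102, 200),
  End       = c(100, 101, 102, 200),
  caseCount = c(10, 12, 14, 7)
)
dels <- dels[order(dels$Start), ]

# A new run starts whenever a deletion does not directly follow the previous one.
runId <- cumsum(c(TRUE, diff(dels$Start) != 1))

# Same default as in the Usage section; any function that reduces a
# numeric vector to a single value could be used instead.
mergeAggregator <- mean

# Collapse every run into one row spanning the whole deleted stretch.
merged <- do.call(rbind, lapply(split(dels, runId), function(run) {
  data.frame(
    Start     = min(run$Start),
    End       = max(run$End),
    caseCount = mergeAggregator(run$caseCount)
  )
}))
merged
```

In this sketch the three deletions at positions 100 to 102 collapse into a single row whose caseCount is the mean of the three values, while the isolated deletion at position 200 is left unchanged.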

Details

data is a list which has to at least contain the Counts, Coverages and Reference datasets. This list will usually be generated by a call to the h5dapply function in which the tally file, chromosome, datasets and regions within the datasets would be specified. See h5dapply for specifics.
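
As a rough picture of the shape of this list, the sketch below builds an empty stand-in with the dimensions given in the Arguments section. The values of k and n are hypothetical, and in practice the list comes from h5dapply rather than being built by hand.

```r
k <- 3L   # hypothetical number of samples
n <- 10L  # hypothetical number of genomic positions in the block

# Empty placeholder arrays with the documented dimensions.
data <- list(
  Counts    = array(0L, dim = c(12, 2, k, n)),  # 4d integer array
  Coverages = array(0L, dim = c(2, k, n)),      # 3d integer array
  Reference = integer(n)                        # 1d integer vector
)
str(data)
```

The last dimension of Counts and Coverages and the length of Reference all equal n, so each index along that axis refers to one genomic position.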

callVariantsPairedFisher implements a simple pairwise variant calling approach that applies fisher.test to the following contingency matrix:

caseSupport       caseCoverage - caseSupport
controlSupport    controlCoverage - controlSupport

The results are filtered by pValCutOff and minCoverage.
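
The test and the two filters can be sketched for a single position as follows. The counts are taken from the first row of the Results output (caseCount 69, caseCoverage 70, controlCount 25, controlCoverage 49); the variable names and the use of <= and >= for the thresholds are assumptions, not the package's actual code.

```r
caseSupport    <- 69; caseCoverage    <- 70
controlSupport <- 25; controlCoverage <- 49

# 2x2 contingency matrix laid out as in the Details section:
# one row per sample, supporting versus non-supporting reads.
contingency <- matrix(
  c(caseSupport,    caseCoverage    - caseSupport,
    controlSupport, controlCoverage - controlSupport),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("case", "control"), c("support", "noSupport"))
)

test <- fisher.test(contingency)

# Defaults from the Usage section.
pValCutOff  <- 0.05
minCoverage <- 5

# A call is kept only if the test is significant and both samples are covered.
passes <- test$p.value <= pValCutOff &&
  caseCoverage >= minCoverage &&
  controlCoverage >= minCoverage
passes
```

Because nearly all case reads support the variant while only about half of the control reads do, this position passes the default filters.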

Value

The return value is a data.frame with the following columns:

Chrom

The chromosome the potential variant is on

Start

The starting position of the variant

End

The end position of the variant

Sample

The Case sample in which the variant was observed

refAllele

The reference allele

altAllele

The alternate allele

caseCount

Support for the variant in the Case sample

caseCoverage

Coverage of the variant position in the Case sample

controlCount

Support for the variant in the Control sample

controlCoverage

Coverage of the variant position in the Control sample

pValue

The p-value returned by fisher.test
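
Since the result is an ordinary data.frame, it can be filtered further with base R. The sketch below rebuilds three rows from the Results output by hand (only a subset of the columns) and keeps calls that are significant at 0.05 and enriched in the case sample; the stricter filter and the name strict are illustrative assumptions.

```r
# Three rows copied from the Results output, restricted to a few columns.
vars <- data.frame(
  Sample          = c("PT5PrimaryDNA", "PT5RelapseDNA", "PT8PrimaryDNA"),
  Start           = c(29978003, 29978143, 29981429),
  caseCount       = c(69, 2, 2),
  caseCoverage    = c(70, 24, 21),
  controlCount    = c(25, 0, 7),
  controlCoverage = c(49, 54, 14),
  pValue          = c(1.357753e-10, 9.190809e-02, 1.529180e-02)
)

# Allele frequencies: supporting reads divided by coverage.
vars$caseAF    <- vars$caseCount / vars$caseCoverage
vars$controlAF <- vars$controlCount / vars$controlCoverage

# Keep significant calls where the variant is more frequent in the case sample.
strict <- subset(vars, pValue <= 0.05 & caseAF > controlAF)
strict
```

Only the PT5PrimaryDNA call at position 29978003 survives: the second row misses the p-value threshold, and in the third row the variant is more frequent in the control than in the case sample.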

Author(s)

Paul Pyl

Examples

library(h5vc) # loading library
tallyFile <- system.file( "extdata", "example.tally.hfs5", package = "h5vcData" )
sampleData <- getSampleData( tallyFile, "/ExampleStudy/16" )
position <- 29979629
windowsize <- 2000
vars <- h5dapply( # Calling Variants
  filename = tallyFile,
  group = "/ExampleStudy/16",
  blocksize = 1000,
  FUN = callVariantsPairedFisher,
  sampledata = sampleData,
  pValCutOff = 0.1,
  names = c("Coverages", "Counts", "Reference"),
  range = c(position - windowsize, position + windowsize),
  verbose = TRUE
)
vars <- do.call(rbind, vars)
vars

Results


> library(h5vc)
Loading required package: grid
Loading required package: gridExtra
Loading required package: ggplot2
> library(h5vc) # loading library
> tallyFile <- system.file( "extdata", "example.tally.hfs5", package = "h5vcData" )
> sampleData <- getSampleData( tallyFile, "/ExampleStudy/16" )
> position <- 29979629
> windowsize <- 2000
> vars <- h5dapply( # Calling Variants
+   filename = tallyFile,
+   group = "/ExampleStudy/16",
+   blocksize = 1000,
+   FUN = callVariantsPairedFisher,
+   sampledata = sampleData,
+   pValCutOff = 0.1,
+   names = c("Coverages", "Counts", "Reference"),
+   range = c(position - windowsize, position + windowsize),
+   verbose = TRUE
+ )
Guessing chromosome length from 'Counts' dataset, if one of the other dimensions happens to match this length there might be weird behaviour here. In that case, specify the dimensions explicitly.
Processing Block #1: 29977629:29978628
Processing Block #2: 29978629:29979628
Processing Block #3: 29979629:29980628
Processing Block #4: 29980629:29981628
> vars <- do.call(rbind, vars)
> vars
                                  Sample Chrom    Start      End refAllele
29977629:29978628.375      PT5PrimaryDNA    16 29978003 29978003         C
29977629:29978628.515      PT5RelapseDNA    16 29978143 29978143         T
29977629:29978628.3751     PT5RelapseDNA    16 29978003 29978003         C
29977629:29978628.3752  PT8EarlyStageDNA    16 29978003 29978003         C
29978629:29979628.199      PT5PrimaryDNA    16 29978827 29978827         G
29978629:29979628.433      PT5PrimaryDNA    16 29979061 29979061         C
29978629:29979628.1000     PT5PrimaryDNA    16 29979628 29979628         G
29978629:29979628.582      PT5PrimaryDNA    16 29979210 29979210         T
29978629:29979628.437      PT5PrimaryDNA    16 29979065 29979065         A
29978629:29979628.436      PT5PrimaryDNA    16 29979064 29979064         C
29978629:29979628.581      PT5PrimaryDNA    16 29979209 29979209         C
29978629:29979628.592      PT5PrimaryDNA    16 29979220 29979220         C
29978629:29979628.119      PT5RelapseDNA    16 29978747 29978747         C
29978629:29979628.1991     PT5RelapseDNA    16 29978827 29978827         G
29978629:29979628.4331     PT5RelapseDNA    16 29979061 29979061         C
29978629:29979628.10001    PT5RelapseDNA    16 29979628 29979628         G
29978629:29979628.4371     PT5RelapseDNA    16 29979065 29979065         A
29978629:29979628.4361     PT5RelapseDNA    16 29979064 29979064         C
29978629:29979628.1992  PT8EarlyStageDNA    16 29978827 29978827         G
29978629:29979628.4332  PT8EarlyStageDNA    16 29979061 29979061         C
29978629:29979628.4372  PT8EarlyStageDNA    16 29979065 29979065         A
29978629:29979628.4362  PT8EarlyStageDNA    16 29979064 29979064         C
29979629:29980628.878      PT5RelapseDNA    16 29980506 29980506         T
29979629:29980628.884   PT8EarlyStageDNA    16 29980512 29980512         G
29980629:29981628.801      PT8PrimaryDNA    16 29981429 29981429         T
29980629:29981628.807      PT8PrimaryDNA    16 29981435 29981435         A
29980629:29981628.809      PT8PrimaryDNA    16 29981437 29981437         C
29980629:29981628.672      PT5PrimaryDNA    16 29981300 29981300         C
29980629:29981628.942      PT5PrimaryDNA    16 29981570 29981570         G
29980629:29981628.947      PT5PrimaryDNA    16 29981575 29981575         G
29980629:29981628.679      PT5PrimaryDNA    16 29981307 29981307         A
29980629:29981628.685      PT5PrimaryDNA    16 29981313 29981313         A
29980629:29981628.972      PT5PrimaryDNA    16 29981600 29981600         A
29980629:29981628.686      PT5PrimaryDNA    16 29981314 29981314         G
29980629:29981628.6721     PT5RelapseDNA    16 29981300 29981300         C
29980629:29981628.253      PT5RelapseDNA    16 29980881 29980881         T
29980629:29981628.6791     PT5RelapseDNA    16 29981307 29981307         A
29980629:29981628.6851     PT5RelapseDNA    16 29981313 29981313         A
29980629:29981628.277      PT5RelapseDNA    16 29980905 29980905         T
29980629:29981628.6861     PT5RelapseDNA    16 29981314 29981314         G
29980629:29981628.124   PT8EarlyStageDNA    16 29980752 29980752         G
29980629:29981628.832   PT8EarlyStageDNA    16 29981460 29981460         C
29980629:29981628.8091  PT8EarlyStageDNA    16 29981437 29981437         C
                        altAllele caseCoverage controlCoverage caseCount
29977629:29978628.375           T           70              49        69
29977629:29978628.515           A           24              54         2
29977629:29978628.3751          T           25              49        25
29977629:29978628.3752          T           44              26        44
29978629:29979628.199           A           43              45        43
29978629:29979628.433           A           21              30        21
29978629:29979628.1000          A           50              42         0
29978629:29979628.582           C           57              41        11
29978629:29979628.437           G           21              31        21
29978629:29979628.436           T           21              30        21
29978629:29979628.581           T           57              42        10
29978629:29979628.592           T           51              41        10
29978629:29979628.119           A           22              54         2
29978629:29979628.1991          A           14              45        14
29978629:29979628.4331          A           16              30        16
29978629:29979628.10001         A           27              42         0
29978629:29979628.4371          G           16              31        16
29978629:29979628.4361          T           16              30        16
29978629:29979628.1992          A           26              16        26
29978629:29979628.4332          A           26              12        25
29978629:29979628.4372          G           26              12        23
29978629:29979628.4362          T           26              12        23
29979629:29980628.878           C           12              35         2
29979629:29980628.884           C           35              13         0
29980629:29981628.801           G           21              14         2
29980629:29981628.807           G           22              14         1
29980629:29981628.809           T           23              15         0
29980629:29981628.672           A          148              99       111
29980629:29981628.942           A           44              40         0
29980629:29981628.947           A           43              34         0
29980629:29981628.679           C          205             138       165
29980629:29981628.685           C          222             156       182
29980629:29981628.972           G           38              36         0
29980629:29981628.686           T          224             155       179
29980629:29981628.6721          A           63              99        49
29980629:29981628.253           C           29              45         3
29980629:29981628.6791          C           75             138        63
29980629:29981628.6851          C           78             156        68
29980629:29981628.277           G           32              45         3
29980629:29981628.6861          T           78             155        68
29980629:29981628.124           A           60              24         1
29980629:29981628.832           G           30              13         0
29980629:29981628.8091          T           26              15         0
                        controlCount       pValue     caseAF  controlAF
29977629:29978628.375             25 1.357753e-10 0.98571429 0.51020408
29977629:29978628.515              0 9.190809e-02 0.08333333 0.00000000
29977629:29978628.3751            25 4.009040e-06 1.00000000 0.51020408
29977629:29978628.3752            11 1.070876e-08 1.00000000 0.42307692
29978629:29979628.199             23 1.705000e-08 1.00000000 0.51111111
29978629:29979628.433             13 9.747510e-06 1.00000000 0.43333333
29978629:29979628.1000            17 1.806672e-07 0.00000000 0.40476190
29978629:29979628.582              2 6.688831e-02 0.19298246 0.04878049
29978629:29979628.437             13 5.059579e-06 1.00000000 0.41935484
29978629:29979628.436             13 9.747510e-06 1.00000000 0.43333333
29978629:29979628.581              2 6.630190e-02 0.17543860 0.04761905
29978629:29979628.592              2 5.921127e-02 0.19607843 0.04878049
29978629:29979628.119              0 8.105263e-02 0.09090909 0.00000000
29978629:29979628.1991            23 9.051896e-04 1.00000000 0.51111111
29978629:29979628.4331            13 7.749970e-05 1.00000000 0.43333333
29978629:29979628.10001           17 8.390247e-05 0.00000000 0.40476190
29978629:29979628.4371            13 6.681439e-05 1.00000000 0.41935484
29978629:29979628.4361            13 7.749970e-05 1.00000000 0.43333333
29978629:29979628.1992             8 1.090399e-04 1.00000000 0.50000000
29978629:29979628.4332             4 8.030101e-05 0.96153846 0.33333333
29978629:29979628.4372             4 1.130394e-03 0.88461538 0.33333333
29978629:29979628.4362             4 1.130394e-03 0.88461538 0.33333333
29979629:29980628.878              0 6.105458e-02 0.16666667 0.00000000
29979629:29980628.884              2 6.914894e-02 0.00000000 0.15384615
29980629:29981628.801              7 1.529180e-02 0.09523810 0.50000000
29980629:29981628.807              5 2.415402e-02 0.04545455 0.35714286
29980629:29981628.809              3 5.393551e-02 0.00000000 0.20000000
29980629:29981628.672             52 3.483574e-04 0.75000000 0.52525253
29980629:29981628.942              4 4.736458e-02 0.00000000 0.10000000
29980629:29981628.947              3 8.180451e-02 0.00000000 0.08823529
29980629:29981628.679             82 2.764340e-05 0.80487805 0.59420290
29980629:29981628.685             94 3.815264e-06 0.81981982 0.60256410
29980629:29981628.972              4 5.119387e-02 0.00000000 0.11111111
29980629:29981628.686             93 2.826117e-05 0.79910714 0.60000000
29980629:29981628.6721            52 1.501218e-03 0.77777778 0.52525253
29980629:29981628.253              0 5.636801e-02 0.10344828 0.00000000
29980629:29981628.6791            82 2.076025e-04 0.84000000 0.59420290
29980629:29981628.6851            94 2.010017e-05 0.87179487 0.60256410
29980629:29981628.277              0 6.780588e-02 0.09375000 0.00000000
29980629:29981628.6861            93 1.979340e-05 0.87179487 0.60000000
29980629:29981628.124              3 6.844568e-02 0.01666667 0.12500000
29980629:29981628.832              2 8.637874e-02 0.00000000 0.15384615
29980629:29981628.8091             3 4.268293e-02 0.00000000 0.20000000