A list with elements Counts (a 4d
integer array of size [1:12, 1:2, 1:k, 1:n]),
Coverages (a 3d integer array of size [1:2, 1:k, 1:n]),
Reference (a 1d integer vector of size [1:n]) – see Details.
sampledata
A data.frame with k rows (one for each
sample) and columns Type, Column and (Group
or Patient). The tally file should contain this information as
a group attribute, see getSampleData for an example.
pValCutOff
Maximum allowed p-value for the Fisher test on the contingency matrix matrix(c(caseCounts, caseCoverage - caseCounts, controlCounts, controlCoverage - controlCounts), nrow=2).
minCoverage
Minimum coverage required in both samples (case and control) for a call to be made
mergeDels
Boolean flag specifying whether adjacent deletions should be merged
mergeAggregator
Which function to use for aggregating the values associated with adjacent deletions that are being merged
Details
data is a list which has to at least contain the
Counts, Coverages and Reference datasets. This list will usually be
generated by a call to the h5dapply function in which the tally
file, chromosome, datasets and regions within the datasets would be
specified. See h5dapply for specifics.
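As a rough illustration of the layout described above, the following sketch builds a mock data list with the stated dimensions. The values are all zero, the sizes k and n are arbitrary, and the name mock_data is made up for this example; real input would normally come from h5dapply rather than being built by hand.

```r
k <- 3L   # number of samples (arbitrary for this sketch)
n <- 10L  # number of genomic positions in the block (arbitrary)

# Mock list with the three required datasets, filled with zeros
mock_data <- list(
  Counts    = array(0L, dim = c(12L, 2L, k, n)),  # 4d integer array
  Coverages = array(0L, dim = c(2L, k, n)),       # 3d integer array
  Reference = integer(n)                          # 1d integer vector
)

# Check that the required datasets are present and inspect their shapes
required <- c("Counts", "Coverages", "Reference")
has_all <- all(required %in% names(mock_data))
str(mock_data)
```

A check of this kind can be useful before passing a hand-assembled list to the caller, since a missing dataset would otherwise only surface inside the function.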
callVariantsPairedFisher implements a simple pairwise variant
calling approach based on applying fisher.test to the following contingency matrix:
caseSupport       caseCoverage - caseSupport
controlSupport    controlCoverage - controlSupport
The results are filtered by pValCutOff and minCoverage.
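The test described above can be reproduced for a single position with base R. This is a minimal sketch, not the package's internal code: the counts are copied from the PT5RelapseDNA row at position 29978143 in the example output, fisher.test is called with its default two-sided alternative (an assumption), and the filter at the end is a guess at how pValCutOff and minCoverage are applied. The minCoverage value of 5 is hypothetical.

```r
# Counts taken from the example output (PT5RelapseDNA, position 29978143)
caseSupport     <- 2
caseCoverage    <- 24
controlSupport  <- 0
controlCoverage <- 54

# Rows: case / control; columns: supporting / non-supporting reads
contingency <- matrix(
  c(caseSupport,    caseCoverage    - caseSupport,
    controlSupport, controlCoverage - controlSupport),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("case", "control"), c("support", "nonSupport"))
)

pValue <- fisher.test(contingency)$p.value

# Assumed filter: significant test and sufficient coverage in both samples
pValCutOff  <- 0.1
minCoverage <- 5   # hypothetical value chosen for this sketch
keep <- pValue <= pValCutOff && min(caseCoverage, controlCoverage) >= minCoverage
```

Here pValue evaluates to 276/3003, approximately 0.0919, which agrees with the pValue column reported for this row in the example output. Whether the package compares with a strict or non-strict inequality is not stated here.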
Value
The return value is a data.frame with the following columns:
Chrom
The chromosome the potential variant is on
Start
The starting position of the variant
End
The end position of the variant
Sample
The Case sample in which the variant was observed
refAllele
The reference allele
altAllele
The alternate allele
caseCount
Support for the variant in the Case sample
caseCoverage
Coverage of the variant position in the Case sample
controlCount
Support for the variant in the Control sample
controlCoverage
Coverage of the variant position in the Control sample
pValue
The p-value returned by fisher.test
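A short sketch of how the returned columns might be used downstream. The three rows are copied from the example output (only the columns needed here), and the thresholds are arbitrary choices for illustration. The example output also shows caseAF and controlAF columns; their values are consistent with count divided by coverage, which is what this sketch computes.

```r
# Three rows copied from the example output
calls <- data.frame(
  Sample          = c("PT5PrimaryDNA", "PT5RelapseDNA", "PT5PrimaryDNA"),
  Start           = c(29978003, 29978143, 29979628),
  caseCount       = c(69, 2, 0),
  caseCoverage    = c(70, 24, 50),
  controlCount    = c(25, 0, 17),
  controlCoverage = c(49, 54, 42),
  pValue          = c(1.357753e-10, 9.190809e-02, 1.806672e-07),
  stringsAsFactors = FALSE
)

# Allele frequency as support divided by coverage
calls$caseAF    <- calls$caseCount / calls$caseCoverage
calls$controlAF <- calls$controlCount / calls$controlCoverage

# Keep strongly significant calls where the variant is enriched in the case
strict <- subset(calls, pValue < 1e-3 & caseAF > controlAF)
strict[order(strict$pValue), c("Sample", "Start", "caseAF", "controlAF", "pValue")]
```

Because the test is two-sided, a small p-value can also mean the allele is depleted in the case sample (as in the third row), so comparing the two allele frequencies is one way to separate the two situations.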
Author(s)
Paul Pyl
Examples
library(h5vc) # loading library
tallyFile <- system.file( "extdata", "example.tally.hfs5", package = "h5vcData" )
sampleData <- getSampleData( tallyFile, "/ExampleStudy/16" )
position <- 29979629
windowsize <- 2000
vars <- h5dapply( # Calling Variants
  filename = tallyFile,
  group = "/ExampleStudy/16",
  blocksize = 1000,
  FUN = callVariantsPairedFisher,
  sampledata = sampleData,
  pValCutOff = 0.1,
  names = c("Coverages", "Counts", "Reference"),
  range = c(position - windowsize, position + windowsize),
  verbose = TRUE
)
vars <- do.call(rbind, vars)
vars
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)
> library(h5vc)
Loading required package: grid
Loading required package: gridExtra
Loading required package: ggplot2
> ### Name: callVariantsFisher
> ### Title: Paired variant calling using fisher tests
> ### Aliases: callVariantsPairedFisher
>
> ### ** Examples
>
> library(h5vc) # loading library
> tallyFile <- system.file( "extdata", "example.tally.hfs5", package = "h5vcData" )
> sampleData <- getSampleData( tallyFile, "/ExampleStudy/16" )
> position <- 29979629
> windowsize <- 2000
> vars <- h5dapply( # Calling Variants
+ filename = tallyFile,
+ group = "/ExampleStudy/16",
+ blocksize = 1000,
+ FUN = callVariantsPairedFisher,
+ sampledata = sampleData,
+ pValCutOff = 0.1,
+ names = c("Coverages", "Counts", "Reference"),
+ range = c(position - windowsize, position + windowsize),
+ verbose = TRUE
+ )
Guessing chromosome length from 'Counts' dataset, if one of the other dimensions happens to match this length there might be weird behaviour here. In that case, specify the dimensions explicitly.
Processing Block #1: 29977629:29978628
Processing Block #2: 29978629:29979628
Processing Block #3: 29979629:29980628
Processing Block #4: 29980629:29981628
> vars <- do.call(rbind, vars)
> vars
Sample Chrom Start End refAllele
29977629:29978628.375 PT5PrimaryDNA 16 29978003 29978003 C
29977629:29978628.515 PT5RelapseDNA 16 29978143 29978143 T
29977629:29978628.3751 PT5RelapseDNA 16 29978003 29978003 C
29977629:29978628.3752 PT8EarlyStageDNA 16 29978003 29978003 C
29978629:29979628.199 PT5PrimaryDNA 16 29978827 29978827 G
29978629:29979628.433 PT5PrimaryDNA 16 29979061 29979061 C
29978629:29979628.1000 PT5PrimaryDNA 16 29979628 29979628 G
29978629:29979628.582 PT5PrimaryDNA 16 29979210 29979210 T
29978629:29979628.437 PT5PrimaryDNA 16 29979065 29979065 A
29978629:29979628.436 PT5PrimaryDNA 16 29979064 29979064 C
29978629:29979628.581 PT5PrimaryDNA 16 29979209 29979209 C
29978629:29979628.592 PT5PrimaryDNA 16 29979220 29979220 C
29978629:29979628.119 PT5RelapseDNA 16 29978747 29978747 C
29978629:29979628.1991 PT5RelapseDNA 16 29978827 29978827 G
29978629:29979628.4331 PT5RelapseDNA 16 29979061 29979061 C
29978629:29979628.10001 PT5RelapseDNA 16 29979628 29979628 G
29978629:29979628.4371 PT5RelapseDNA 16 29979065 29979065 A
29978629:29979628.4361 PT5RelapseDNA 16 29979064 29979064 C
29978629:29979628.1992 PT8EarlyStageDNA 16 29978827 29978827 G
29978629:29979628.4332 PT8EarlyStageDNA 16 29979061 29979061 C
29978629:29979628.4372 PT8EarlyStageDNA 16 29979065 29979065 A
29978629:29979628.4362 PT8EarlyStageDNA 16 29979064 29979064 C
29979629:29980628.878 PT5RelapseDNA 16 29980506 29980506 T
29979629:29980628.884 PT8EarlyStageDNA 16 29980512 29980512 G
29980629:29981628.801 PT8PrimaryDNA 16 29981429 29981429 T
29980629:29981628.807 PT8PrimaryDNA 16 29981435 29981435 A
29980629:29981628.809 PT8PrimaryDNA 16 29981437 29981437 C
29980629:29981628.672 PT5PrimaryDNA 16 29981300 29981300 C
29980629:29981628.942 PT5PrimaryDNA 16 29981570 29981570 G
29980629:29981628.947 PT5PrimaryDNA 16 29981575 29981575 G
29980629:29981628.679 PT5PrimaryDNA 16 29981307 29981307 A
29980629:29981628.685 PT5PrimaryDNA 16 29981313 29981313 A
29980629:29981628.972 PT5PrimaryDNA 16 29981600 29981600 A
29980629:29981628.686 PT5PrimaryDNA 16 29981314 29981314 G
29980629:29981628.6721 PT5RelapseDNA 16 29981300 29981300 C
29980629:29981628.253 PT5RelapseDNA 16 29980881 29980881 T
29980629:29981628.6791 PT5RelapseDNA 16 29981307 29981307 A
29980629:29981628.6851 PT5RelapseDNA 16 29981313 29981313 A
29980629:29981628.277 PT5RelapseDNA 16 29980905 29980905 T
29980629:29981628.6861 PT5RelapseDNA 16 29981314 29981314 G
29980629:29981628.124 PT8EarlyStageDNA 16 29980752 29980752 G
29980629:29981628.832 PT8EarlyStageDNA 16 29981460 29981460 C
29980629:29981628.8091 PT8EarlyStageDNA 16 29981437 29981437 C
altAllele caseCoverage controlCoverage caseCount
29977629:29978628.375 T 70 49 69
29977629:29978628.515 A 24 54 2
29977629:29978628.3751 T 25 49 25
29977629:29978628.3752 T 44 26 44
29978629:29979628.199 A 43 45 43
29978629:29979628.433 A 21 30 21
29978629:29979628.1000 A 50 42 0
29978629:29979628.582 C 57 41 11
29978629:29979628.437 G 21 31 21
29978629:29979628.436 T 21 30 21
29978629:29979628.581 T 57 42 10
29978629:29979628.592 T 51 41 10
29978629:29979628.119 A 22 54 2
29978629:29979628.1991 A 14 45 14
29978629:29979628.4331 A 16 30 16
29978629:29979628.10001 A 27 42 0
29978629:29979628.4371 G 16 31 16
29978629:29979628.4361 T 16 30 16
29978629:29979628.1992 A 26 16 26
29978629:29979628.4332 A 26 12 25
29978629:29979628.4372 G 26 12 23
29978629:29979628.4362 T 26 12 23
29979629:29980628.878 C 12 35 2
29979629:29980628.884 C 35 13 0
29980629:29981628.801 G 21 14 2
29980629:29981628.807 G 22 14 1
29980629:29981628.809 T 23 15 0
29980629:29981628.672 A 148 99 111
29980629:29981628.942 A 44 40 0
29980629:29981628.947 A 43 34 0
29980629:29981628.679 C 205 138 165
29980629:29981628.685 C 222 156 182
29980629:29981628.972 G 38 36 0
29980629:29981628.686 T 224 155 179
29980629:29981628.6721 A 63 99 49
29980629:29981628.253 C 29 45 3
29980629:29981628.6791 C 75 138 63
29980629:29981628.6851 C 78 156 68
29980629:29981628.277 G 32 45 3
29980629:29981628.6861 T 78 155 68
29980629:29981628.124 A 60 24 1
29980629:29981628.832 G 30 13 0
29980629:29981628.8091 T 26 15 0
controlCount pValue caseAF controlAF
29977629:29978628.375 25 1.357753e-10 0.98571429 0.51020408
29977629:29978628.515 0 9.190809e-02 0.08333333 0.00000000
29977629:29978628.3751 25 4.009040e-06 1.00000000 0.51020408
29977629:29978628.3752 11 1.070876e-08 1.00000000 0.42307692
29978629:29979628.199 23 1.705000e-08 1.00000000 0.51111111
29978629:29979628.433 13 9.747510e-06 1.00000000 0.43333333
29978629:29979628.1000 17 1.806672e-07 0.00000000 0.40476190
29978629:29979628.582 2 6.688831e-02 0.19298246 0.04878049
29978629:29979628.437 13 5.059579e-06 1.00000000 0.41935484
29978629:29979628.436 13 9.747510e-06 1.00000000 0.43333333
29978629:29979628.581 2 6.630190e-02 0.17543860 0.04761905
29978629:29979628.592 2 5.921127e-02 0.19607843 0.04878049
29978629:29979628.119 0 8.105263e-02 0.09090909 0.00000000
29978629:29979628.1991 23 9.051896e-04 1.00000000 0.51111111
29978629:29979628.4331 13 7.749970e-05 1.00000000 0.43333333
29978629:29979628.10001 17 8.390247e-05 0.00000000 0.40476190
29978629:29979628.4371 13 6.681439e-05 1.00000000 0.41935484
29978629:29979628.4361 13 7.749970e-05 1.00000000 0.43333333
29978629:29979628.1992 8 1.090399e-04 1.00000000 0.50000000
29978629:29979628.4332 4 8.030101e-05 0.96153846 0.33333333
29978629:29979628.4372 4 1.130394e-03 0.88461538 0.33333333
29978629:29979628.4362 4 1.130394e-03 0.88461538 0.33333333
29979629:29980628.878 0 6.105458e-02 0.16666667 0.00000000
29979629:29980628.884 2 6.914894e-02 0.00000000 0.15384615
29980629:29981628.801 7 1.529180e-02 0.09523810 0.50000000
29980629:29981628.807 5 2.415402e-02 0.04545455 0.35714286
29980629:29981628.809 3 5.393551e-02 0.00000000 0.20000000
29980629:29981628.672 52 3.483574e-04 0.75000000 0.52525253
29980629:29981628.942 4 4.736458e-02 0.00000000 0.10000000
29980629:29981628.947 3 8.180451e-02 0.00000000 0.08823529
29980629:29981628.679 82 2.764340e-05 0.80487805 0.59420290
29980629:29981628.685 94 3.815264e-06 0.81981982 0.60256410
29980629:29981628.972 4 5.119387e-02 0.00000000 0.11111111
29980629:29981628.686 93 2.826117e-05 0.79910714 0.60000000
29980629:29981628.6721 52 1.501218e-03 0.77777778 0.52525253
29980629:29981628.253 0 5.636801e-02 0.10344828 0.00000000
29980629:29981628.6791 82 2.076025e-04 0.84000000 0.59420290
29980629:29981628.6851 94 2.010017e-05 0.87179487 0.60256410
29980629:29981628.277 0 6.780588e-02 0.09375000 0.00000000
29980629:29981628.6861 93 1.979340e-05 0.87179487 0.60000000
29980629:29981628.124 3 6.844568e-02 0.01666667 0.12500000
29980629:29981628.832 2 8.637874e-02 0.00000000 0.15384615
29980629:29981628.8091 3 4.268293e-02 0.00000000 0.20000000