getAllSub
R Documentation
Identify all substitutions observed across genomic positions exhibiting a
specified minimum coverage
Description
All substitutions observed across genomic positions exhibiting user-defined
minimum coverage are extracted and a count table is returned. This function
supports parallel computing.
Usage
getAllSub(sortedBam, minCov = 20, cores = 1)
Arguments
sortedBam
GRanges object containing aligned reads as returned by
readSortedBam
minCov
An integer defining the minimum coverage required at a genomic
position exhibiting a substitution. Genomic positions with coverage below
minCov are discarded. Default is 20 (see Details).
cores
An integer defining the number of cores to be used for parallel
processing, if available. Default is 1.
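A minimal sketch of a typical call, following the Usage and Arguments above. It assumes that the wavClusteR package is installed and that it ships an example BAM file at extdata/example.bam; both are assumptions, so the code is guarded and does nothing if either is missing. The variable names bamFile, reads and countTable are illustrative only.

```r
# Sketch only: runs the call if wavClusteR and its example data are available.
if (requireNamespace("wavClusteR", quietly = TRUE)) {
  library(wavClusteR)

  # Assumed location of an example BAM file shipped with the package
  bamFile <- system.file("extdata", "example.bam", package = "wavClusteR")

  if (nzchar(bamFile)) {
    # Load aligned reads as a GRanges object
    reads <- readSortedBam(filename = bamFile)

    # Extract all substitutions at positions with coverage of at least 10,
    # using a single core
    countTable <- getAllSub(reads, minCov = 10, cores = 1)
    print(countTable)
  }
}
```

Increasing cores only helps if several cores are actually available on the machine.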
Details
The choice of the minimum coverage influences the variance of the relative
substitution frequency estimates, which in turn affect the mixture model
fit. A conservative value depending on the library size is recommended for a
first analysis. Values smaller than 10 have not been tested and are
therefore not recommended.
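The effect of coverage on the variance can be illustrated with a small calculation. This sketch assumes a simple binomial model for the number of substitutions at a position, which is an illustrative assumption and not necessarily the model used by the package; the value of p is hypothetical.

```r
# Assumption (for illustration only): the substitution count at a position
# with coverage n is binomial with substitution probability p.
p <- 0.3                          # hypothetical true substitution frequency
coverage <- c(10, 20, 50, 100)    # candidate minimum coverage values

# Standard error of the relative substitution frequency estimate count / n
se <- sqrt(p * (1 - p) / coverage)

print(data.frame(coverage = coverage, se = round(se, 3)))
```

Under this assumption the standard error shrinks with the square root of the coverage, so a larger minCov yields more stable estimates at the cost of retaining fewer positions.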
Value
A GRanges object containing a count table, where each range
corresponds to a substitution. The metadata columns hold the following
information:
substitutions
observed substitution, e.g. AT, i.e. A in
the reference sequence and T in the mapped read.
coverage
strand-specific coverage.
count
number of strand-specific substitutions.
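The metadata columns described above can be mimicked with a toy table to show how they relate to each other. The data below are hypothetical, a plain data.frame stands in for the GRanges object actually returned, and the relative substitution frequency is assumed here to be count divided by coverage.

```r
# Hypothetical toy data with the three metadata columns; the real return
# value is a GRanges object, not a data.frame.
toy <- data.frame(
  substitutions = c("TC", "AT", "TC", "GA"),
  coverage      = c(35L, 12L, 80L, 22L),
  count         = c(14L, 3L, 8L, 11L),
  stringsAsFactors = FALSE
)

# Discard positions with coverage less than minCov
minCov <- 20L
kept <- toy[toy$coverage >= minCov, ]

# Assumed definition of the relative substitution frequency
kept$rsf <- kept$count / kept$coverage

# Number of retained positions per substitution type
print(table(kept$substitutions))
```

With an actual result, the same columns would be reached through the metadata accessors of the GRanges object rather than through data.frame indexing.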
Author(s)
Federico Comoglio and Cem Sievers, with contributions from Martin Morgan