Counts the number of reads with a specified minimum mapping quality
from BAM files in genomic ranges specified by a GRanges object. This
is a convenience function for counting, in each sample, the reads in
ranges covering the targeted regions, such as the exons in exome
enrichment experiments. These read counts are used by exomeCopy in
predicting CNVs in samples.
With the default setting (read.width=1), only the read starts
are used for counting purposes (the leftmost position regardless of
the strandedness of the read).
When read.width is set to the actual read width, or when
get.width = TRUE, the function returns the number of overlapping
reads, as returned by countOverlaps in the GenomicRanges package.
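The difference between the two counting modes can be illustrated with a small sketch that calls countOverlaps directly on toy data. The window coordinates, the read start positions and the 100bp read length are made-up values chosen for illustration, and the code is not the internal implementation of countBamInGRanges.

```r
library(GenomicRanges)

## two adjacent 100bp windows on a single sequence (toy coordinates)
windows <- GRanges("chr1", IRanges(start=c(1, 101), end=c(100, 200)))

## leftmost mapped positions of three hypothetical 100bp reads
read.starts <- c(50, 95, 150)

## analogue of read.width=1: each read is represented by its start only
reads.as.starts <- GRanges("chr1", IRanges(start=read.starts, width=1))
start.counts <- countOverlaps(windows, reads.as.starts)

## analogue of read.width=100: each read covers its full 100bp
reads.full <- GRanges("chr1", IRanges(start=read.starts, width=100))
overlap.counts <- countOverlaps(windows, reads.full)
```

In this sketch start.counts assigns every read to exactly one window, whereas overlap.counts counts a read once in each window it touches, so the window totals can exceed the number of reads.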
The function subdivideGRanges can be used first to
subdivide ranges of different size into ranges of nearly equal width.
The BAM file requires an associated index file (see the man page for
indexBam in the Rsamtools package).
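A minimal sketch of creating the index with indexBam is given here. It assumes the index sits next to the BAM file under the name of the BAM file plus ".bai", and it uses the ex1.bam example file shipped with Rsamtools (copied to a temporary directory) as a stand-in for a real sample.

```r
library(Rsamtools)

## example BAM file from Rsamtools, copied so that the index is
## written next to the copy rather than into the package directory
src.bam <- system.file("extdata", "ex1.bam", package="Rsamtools")
tmp.bam <- file.path(tempdir(), "ex1.bam")
file.copy(src.bam, tmp.bam, overwrite=TRUE)

## assumed naming convention for the index: <BAM path>.bai
index.file <- paste0(tmp.bam, ".bai")

## create the index only if it is not already present
if (!file.exists(index.file)) {
  indexBam(tmp.bam)
}
```

The BAM file must be sorted by position before it can be indexed.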
bam.file
The path of the BAM file for the sample to be counted.
granges
An object of type GRanges with the ranges in which to count reads.
min.mapq
The minimum mapping quality to count a read. Defaults to 1. Set to
0 for counting all reads.
read.width
The width of a read, used in counting overlaps of mapped reads with
the genomic ranges. The default is 1, which counts only the read
starts falling in each genomic range. If the length of fixed-width
reads is given, e.g. 100 for 100bp reads, the function returns the
count of all reads overlapping the genomic ranges. However, counting
all overlapping reads introduces dependency between the counts in
adjacent windows.
stranded.start
If true, the function will create reads of length read.width
using the strand to determine the read location. A read with + or *
strand will start at the given start position, and a read with -
strand will end at (start position + CIGAR width - 1).
get.width
If true, the function retrieves the read width from the CIGAR
encoding rather than assigning the value from read.width.
remove.dup
If true, the function will count only one read for each unique
combination of position, strand and read width.
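The placement rule described for stranded.start and the duplicate rule described for remove.dup can be sketched in base R as follows. The helper place.reads and the toy reads are hypothetical, and the order in which countBamInGRanges itself applies the two steps is an assumption of this sketch.

```r
## hypothetical helper: place reads of length read.width by strand
place.reads <- function(pos, cigar.width, strand, read.width) {
  minus <- strand == "-"
  ## "+" and "*" reads begin at the reported start position
  start <- pos
  ## "-" reads end at (start position + CIGAR width - 1),
  ## so their start is found by stepping back read.width from that end
  read.end <- pos + cigar.width - 1
  start[minus] <- read.end[minus] - read.width + 1
  ## start and end together encode position and read width
  data.frame(start=start, end=start + read.width - 1, strand=strand)
}

## four toy reads, each with a CIGAR width of 50
reads <- place.reads(pos=c(100, 100, 100, 300),
                     cigar.width=c(50, 50, 50, 50),
                     strand=c("+", "-", "-", "+"),
                     read.width=1)

## analogue of remove.dup: keep one read per unique combination
## of position, strand and width
unique.reads <- unique(reads)
```

With read.width=1 the two identical minus-strand reads are placed at position 149 and collapse to a single row in unique.reads.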
Value
An integer vector giving the number of reads over the input GRanges.
See Also
Rsamtools, GRanges, subdivideGRanges
Examples
## get subdivided genomic ranges covering targeted region
## using subdivideGRanges()
example(subdivideGRanges)
## BAM file included in the exomeCopy package
bam.file <- system.file("extdata", "mapping.bam", package="exomeCopy")
## create RangedData object to store read counts
rdata <- RangedData(space=seqnames(target.sub),ranges=ranges(target.sub))
## extract read counts from the BAM file in these genomic ranges
rdata[["sample"]] <- countBamInGRanges(bam.file,target.sub)
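As a further sketch, the same call can be repeated with non-default values of the arguments documented above. It assumes the exomeCopy package and its example data are installed; the variable names are illustrative only.

```r
library(exomeCopy)

## creates target.sub, the subdivided genomic ranges
example(subdivideGRanges)
bam.file <- system.file("extdata", "mapping.bam", package="exomeCopy")

## default: read starts only, mapping quality of at least 1
counts.default <- countBamInGRanges(bam.file, target.sub)

## count all reads regardless of mapping quality
counts.all <- countBamInGRanges(bam.file, target.sub, min.mapq=0)

## count overlapping reads using the width from the CIGAR encoding
counts.overlap <- countBamInGRanges(bam.file, target.sub, get.width=TRUE)

## count only one read per position, strand and read width
counts.nodup <- countBamInGRanges(bam.file, target.sub, remove.dup=TRUE)
```

Each result is an integer vector with one entry per range in target.sub.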
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(exomeCopy)
Loading required package: IRanges
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: Rsamtools
Loading required package: Biostrings
Loading required package: XVector
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/exomeCopy/countBamInGRanges.Rd_%03d_medium.png", width=480, height=480)
> ### Name: countBamInGRanges
> ### Title: Count reads from BAM file in genomic ranges
> ### Aliases: countBamInGRanges
>
> ### ** Examples
>
> ## get subdivided genomic ranges covering targeted region
> ## using subdivideGRanges()
> example(subdivideGRanges)
sbdvGR> ## read in target region BED file
sbdvGR> target.file <- system.file("extdata", "targets.bed", package="exomeCopy")
sbdvGR> target.df <- read.delim(target.file, header=FALSE,
sbdvGR+ col.names=c("seqname","start","end"))
sbdvGR> ## create GRanges object with 5 ranges over 2 sequences
sbdvGR> target <- GRanges(seqname=target.df$seqname,
sbdvGR+ IRanges(start=target.df$start,end=target.df$end))
sbdvGR> ## subdivide into 7 smaller genomic ranges
sbdvGR> target.sub <- subdivideGRanges(target)
>
> ## BAM file included in Rsamtools package
> bam.file <- system.file("extdata", "mapping.bam", package="exomeCopy")
>
> ## create RangedData object to store read counts
> rdata <- RangedData(space=seqnames(target.sub),ranges=ranges(target.sub))
>
> ## extract read counts from the BAM file in these genomic ranges
> rdata[["sample"]] <- countBamInGRanges(bam.file,target.sub)
>
> dev.off()
null device
1
>