Use ScanBamParam() to create a parameter object influencing
what fields and which records are imported from a (binary) BAM file.
Use of which requires that a BAM index file
(<filename>.bai) exists.
Usage
# Constructor
ScanBamParam(flag = scanBamFlag(), simpleCigar = FALSE,
reverseComplement = FALSE, tag = character(0), tagFilter = list(),
what = character(0), which, mapqFilter=NA_integer_)
# Constructor helpers
scanBamFlag(isPaired = NA, isProperPair = NA, isUnmappedQuery = NA,
hasUnmappedMate = NA, isMinusStrand = NA, isMateMinusStrand = NA,
isFirstMateRead = NA, isSecondMateRead = NA, isNotPrimaryRead = NA,
isSecondaryAlignment = NA, isNotPassingQualityControls = NA,
isDuplicate = NA)
scanBamWhat()
# Accessors
bamFlag(object, asInteger=FALSE)
bamFlag(object) <- value
bamReverseComplement(object)
bamReverseComplement(object) <- value
bamSimpleCigar(object)
bamSimpleCigar(object) <- value
bamTag(object)
bamTag(object) <- value
bamTagFilter(object)
bamTagFilter(object) <- value
bamWhat(object)
bamWhat(object) <- value
bamWhich(object)
bamWhich(object) <- value
bamMapqFilter(object)
bamMapqFilter(object) <- value
## S4 method for signature 'ScanBamParam'
show(object)
# Flag utils
bamFlagAsBitMatrix(flag, bitnames=FLAG_BITNAMES)
bamFlagAND(flag1, flag2)
bamFlagTest(flag, value)
Arguments
flag
For ScanBamParam, an integer(2) vector used to filter
reads based on their 'flag' entry. This is most easily created with the
scanBamFlag() helper function.
For bamFlagAsBitMatrix, bamFlagTest an integer vector
where each element represents a 'flag' entry.
simpleCigar
A logical(1) vector which, when TRUE, returns only
those reads for which the cigar (run-length encoded representation
of the alignment) is missing or contains only matches / mismatches
('M').
reverseComplement
A logical(1) vectors. BAM files store reads
mapping to the minus strand as though they are on the plus
strand. Rsamtools obeys this convention by default
(reverseComplement=FALSE), but when this value is set to TRUE
returns the sequence and quality scores of reads mapped to the minus
strand in the reverse complement (sequence) and reverse (quality) of
the read as stored in the BAM file. This might be useful if wishing
to recover read and quality scores as represented in fastq files,
but is NOT appropriate for variant calling or other alignment-based
operations.
tag
A character vector naming tags to be extracted. A tag is an
optional field, with arbitrary information, stored with each
record. Tags are identified by two-letter codes, so all elements of
tag must have exactly 2 characters.
tagFilter
A named list of atomic vectors. The name of
each list element is the tag name (two-letter code), and the
corresponding atomic vector is the set of acceptable values for the
tag. Only reads with specified tags are included. NULLs,
NAs, and empty strings are not allowed in the atomic
vectors.
what
A character vector naming the fields to return
scanBamWhat() returns a vector of available fields.
Fields are described on the scanBam help page.
mapqFilter
A non-negative integer(1) specifying the minimum
mapping quality to include. BAM records with mapping qualities less
than mapqFilter are discarded.
which
A GRanges, RangesList,
or any object that can be coerced to a RangesList, or
missing object, from which a IRangesList instance will be
constructed. Names of the IRangesList correspond to reference
sequences, and ranges to the regions on that reference sequence for
which matches are desired. Because data types are coerced to
IRangesList, which does not include strand
information (use the flag argument instead). Only records
with a read overlapping the specified ranges are returned. All
ranges must have ends less than or equal to 536870912.
isPaired
A logical(1) indicating whether unpaired (FALSE),
paired (TRUE), or any (NA) read should be returned.
isProperPair
A logical(1) indicating whether improperly paired
(FALSE), properly paired (TRUE), or any (NA) read should be
returned. A properly paired read is defined by the alignment
algorithm and might, e.g., represent reads aligning to identical
reference sequences and with a specified distance.
isUnmappedQuery
A logical(1) indicating whether unmapped
(TRUE), mapped (FALSE), or any (NA) read should be returned.
hasUnmappedMate
A logical(1) indicating whether reads with
mapped (FALSE), unmapped (TRUE), or any (NA) mate should be
returned.
isMinusStrand
A logical(1) indicating whether reads aligned to
the plus (FALSE), minus (TRUE), or any (NA) strand should be
returned.
isMateMinusStrand
A logical(1) indicating whether mate reads
aligned to the plus (FALSE), minus (TRUE), or any (NA) strand should
be returned.
isFirstMateRead
A logical(1) indicating whether the first mate
read should be returned (TRUE) or not (FALSE), or whether mate read
number should be ignored (NA).
isSecondMateRead
A logical(1) indicating whether the second
mate read should be returned (TRUE) or not (FALSE), or whether mate
read number should be ignored (NA).
isNotPrimaryRead
Deprecated; use isSecondaryAlignment.
isSecondaryAlignment
A logical(1) indicating whether alignments
that are secondary (TRUE), are not (FALSE) or whose secondary status
does not matter (NA) should be returned. A non-primary alignment
(“secondary alignment” in the SAM specification) might result
when a read aligns to multiple locations. One alignment is
designated as primary and has this flag set to FALSE; the remainder,
for which this flag is TRUE, are designated by the aligner as
secondary.
isNotPassingQualityControls
A logical(1) indicating whether
reads passing quality controls (FALSE), reads not passing quality
controls (TRUE), or any (NA) read should be returned.
isDuplicate
A logical(1) indicating that un-duplicated (FALSE),
duplicated (TRUE), or any (NA) reads should be
returned. 'Duplicated' reads may represent PCR or optical
duplicates.
object
An instance of class ScanBamParam.
value
An instance of the corresponding slot, to be assigned to
object or, for bamFlagTest, a character(1) name
of the flag to test, e.g., “isUnmappedQuery”, from the
arguments to scanBamFlag.
asInteger
logical(1) indicating whether ‘flag’ should be
returned as an encoded integer vector (TRUE) or
human-readable form (FALSE).
bitnames
Names of the flag bits to extract. Will be the colnames
of the returned matrix.
flag1, flag2
Integer vectors containing ‘flag’ entries.
Objects from the Class
Objects are created by calls of the form ScanBamParam().
Slots
flag
Object of class integer encoding flags
to be kept when they have their '0' (keep0) or '1'
(keep1) bit set.
simpleCigar
Object of class logical
indicating, when TRUE, that only 'simple' cigars (empty or 'M') are
returned.
reverseComplement
Object of class logical
indicating, when TRUE, that reads on the minus strand are to be
reverse complemented (sequence) and reversed (quality).
tag
Object of class character indicating what
tags are to be returned.
tagFilter
Object of class list (named)
indicating tags to filter by, and the set of acceptable values for
each tag.
what
Object of class character indicating
what fields are to be returned.
which
Object of class RangesList indicating
which reference sequence and coordinate reads must overlap.
mapqFilter
Object of class integer indicating
the minimum mapping quality required for input, or NA to indicate
no filtering.
Functions and methods
See 'Usage' for details on invocation.
Constructor:
ScanBamParam:
Returns a ScanBamParam object. The
which argument to the constructor can be one of several
different types, as documented above.
Accessors:
bamTag, bamTag<-
Returns or sets a character vector of
tags to be extracted.
bamTagFilter, bamTagFilter<-
Returns or sets a named
list of tags to filter by, and the set of their acceptable
values.
bamWhat, bamWhat<-
Returns or sets a character vector
of fields to be extracted.
bamWhich, bamWhich<-
Returns or sets a RangesList of
bounds on reads to be extracted. A length 0 RangesList
represents all reads.
bamFlag, bamFlag<-
Returns or sets an integer(2)
representation of reads flagged to be kept or excluded.
bamSimpleCigar, bamSimpleCigar<-
Returns or sets a
logical(1) vector indicating whether reads without indels
or clipping be kept.
bamReverseComplement, bamReverseComplement<-
Returns or sets
a logical(1) vector indicating whether reads on the minus
strand will be returned with sequence reverse complemented and
quality reversed.
Methods:
show
Compactly display the object.
Author(s)
Martin Morgan
See Also
scanBam
Examples
## defaults
p0 <- ScanBamParam()
## subset of reads based on genomic coordinates
which <- RangesList(seq1=IRanges(1000, 2000),
seq2=IRanges(c(100, 1000), c(1000, 2000)))
p1 <- ScanBamParam(what=scanBamWhat(), which=which)
## subset of reads based on 'flag' value
p2 <- ScanBamParam(what=scanBamWhat(),
flag=scanBamFlag(isMinusStrand=FALSE))
## subset of fields
p3 <- ScanBamParam(what=c("rname", "strand", "pos", "qwidth"))
## use
fl <- system.file("extdata", "ex1.bam", package="Rsamtools",
mustWork=TRUE)
res <- scanBam(fl, param=p2)[[1]]
lapply(res, head)
## tags; NM: edit distance; H1: 1-difference hits
p4 <- ScanBamParam(tag=c("NM", "H1"), what="flag")
bam4 <- scanBam(fl, param=p4)
str(bam4[[1]][["tag"]])
## tagFilter
p5 <- ScanBamParam(tag=c("NM", "H1"), tagFilter=list(NM=c(2, 3, 4)))
bam5 <- scanBam(fl, param=p5)
table(bam5[[1]][["tag"]][["NM"]])
## flag utils
flag <- scanBamFlag(isUnmappedQuery=FALSE, isMinusStrand=TRUE)
p6 <- ScanBamParam(what="flag")
bam6 <- scanBam(fl, param=p6)
flag6 <- bam6[[1]][["flag"]]
head(bamFlagAsBitMatrix(flag6[1:9]))
colSums(bamFlagAsBitMatrix(flag6))
flag
bamFlagAsBitMatrix(flag)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(Rsamtools)
Loading required package: GenomeInfoDb
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomicRanges
Loading required package: Biostrings
Loading required package: XVector
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/Rsamtools/ScanBamParam-class.Rd_%03d_medium.png", width=480, height=480)
> ### Name: ScanBamParam
> ### Title: Parameters for scanning BAM files
> ### Aliases: ScanBamParam-class ScanBamParam ScanBamParam,missing-method
> ### ScanBamParam,ANY-method ScanBamParam,RangesList-method
> ### ScanBamParam,GRanges-method scanBamWhat scanBamFlag bamFlag<- bamFlag
> ### bamReverseComplement<- bamReverseComplement bamSimpleCigar<-
> ### bamSimpleCigar bamTag<- bamTag bamTagFilter bamTagFilter<- bamWhat<-
> ### bamWhat bamWhich<- bamWhich<-,ScanBamParam,GRanges-method
> ### bamWhich<-,ScanBamParam,RangesList-method
> ### bamWhich<-,ScanBamParam,ANY-method bamWhich bamMapqFilter
> ### bamMapqFilter<- show,ScanBamParam-method FLAG_BITNAMES
> ### bamFlagAsBitMatrix bamFlagAND bamFlagTest isPaired isProperPair
> ### isUnmappedQuery hasUnmappedMate isMinusStrand isMateMinusStrand
> ### isFirstMateRead isSecondMateRead isNotPrimaryRead
> ### isSecondaryAlignment isNotPassingQualityControls isDuplicate
> ### Keywords: classes
>
> ### ** Examples
>
> ## defaults
> p0 <- ScanBamParam()
>
> ## subset of reads based on genomic coordinates
> which <- RangesList(seq1=IRanges(1000, 2000),
+ seq2=IRanges(c(100, 1000), c(1000, 2000)))
> p1 <- ScanBamParam(what=scanBamWhat(), which=which)
>
> ## subset of reads based on 'flag' value
> p2 <- ScanBamParam(what=scanBamWhat(),
+ flag=scanBamFlag(isMinusStrand=FALSE))
>
> ## subset of fields
> p3 <- ScanBamParam(what=c("rname", "strand", "pos", "qwidth"))
>
> ## use
> fl <- system.file("extdata", "ex1.bam", package="Rsamtools",
+ mustWork=TRUE)
> res <- scanBam(fl, param=p2)[[1]]
> lapply(res, head)
$qname
[1] "B7_591:4:96:693:509" "EAS54_65:7:152:368:113" "EAS51_64:8:5:734:57"
[4] "B7_591:1:289:587:906" "EAS56_59:8:38:671:758" "EAS56_61:6:18:467:281"
$flag
[1] 73 73 137 137 137 73
$rname
[1] seq1 seq1 seq1 seq1 seq1 seq1
Levels: seq1 seq2
$strand
[1] + + + + + +
Levels: + - *
$pos
[1] 1 3 5 6 9 13
$qwidth
[1] 36 35 35 36 35 35
$mapq
[1] 99 99 99 63 99 99
$cigar
[1] "36M" "35M" "35M" "36M" "35M" "35M"
$mrnm
[1] <NA> <NA> <NA> <NA> <NA> <NA>
Levels: seq1 seq2
$mpos
[1] NA NA NA NA NA NA
$isize
[1] NA NA NA NA NA NA
$seq
A DNAStringSet instance of length 6
width seq
[1] 36 CACTAGTGGCTCATTGTAAATGTGTGGTTTAACTCG
[2] 35 CTAGTGGCTCATTGTAAATGTGTGGTTTAACTCGT
[3] 35 AGTGGCTCATTGTAAATGTGTGGTTTAACTCGTCC
[4] 36 GTGGCTCATTGTAATTTTTTGTTTTAACTCTTCTCT
[5] 35 GCTCATTGTAAATGTGTGGTTTAACTCGTCCATGG
[6] 35 ATTGTAAATGTGTGGTTTAACTCGTCCCTGGCCCA
$qual
A PhredQuality instance of length 6
width seq
[1] 36 <<<<<<<<<<<<<<<;<<<<<<<<<5<<<<<;:<;7
[2] 35 <<<<<<<<<<0<<<<655<<7<<<:9<<3/:<6):
[3] 35 <<<<<<<<<<<7;71<<;<;;<7;<<3;);3*8/5
[4] 36 (-&----,----)-)-),'--)---',+-,),''*,
[5] 35 <<<<<<<<<<<<<<<;<;7<<<<<<<<7<<;:<5%
[6] 35 <<<<<<<<;<<<8<<<<<;8:;6/686&;(16666
>
> ## tags; NM: edit distance; H1: 1-difference hits
> p4 <- ScanBamParam(tag=c("NM", "H1"), what="flag")
> bam4 <- scanBam(fl, param=p4)
> str(bam4[[1]][["tag"]])
List of 2
$ NM: int [1:3307] 0 0 0 5 0 1 0 1 0 1 ...
$ H1: int [1:3307] 0 0 0 0 0 1 0 1 0 0 ...
>
> ## tagFilter
> p5 <- ScanBamParam(tag=c("NM", "H1"), tagFilter=list(NM=c(2, 3, 4)))
> bam5 <- scanBam(fl, param=p5)
> table(bam5[[1]][["tag"]][["NM"]])
2 3 4
104 45 22
>
> ## flag utils
> flag <- scanBamFlag(isUnmappedQuery=FALSE, isMinusStrand=TRUE)
>
> p6 <- ScanBamParam(what="flag")
> bam6 <- scanBam(fl, param=p6)
> flag6 <- bam6[[1]][["flag"]]
> head(bamFlagAsBitMatrix(flag6[1:9]))
isPaired isProperPair isUnmappedQuery hasUnmappedMate isMinusStrand
[1,] 1 0 0 1 0
[2,] 1 0 0 1 0
[3,] 1 0 0 1 0
[4,] 1 0 0 1 0
[5,] 1 0 0 1 0
[6,] 1 0 0 1 0
isMateMinusStrand isFirstMateRead isSecondMateRead isSecondaryAlignment
[1,] 0 1 0 0
[2,] 0 1 0 0
[3,] 0 0 1 0
[4,] 0 0 1 0
[5,] 0 0 1 0
[6,] 0 1 0 0
isNotPassingQualityControls isDuplicate
[1,] 0 0
[2,] 0 0
[3,] 0 0
[4,] 0 0
[5,] 0 0
[6,] 0 0
> colSums(bamFlagAsBitMatrix(flag6))
isPaired isProperPair
3307 3144
isUnmappedQuery hasUnmappedMate
36 127
isMinusStrand isMateMinusStrand
1641 1606
isFirstMateRead isSecondMateRead
1654 1653
isSecondaryAlignment isNotPassingQualityControls
0 0
isDuplicate
0
> flag
keep0 keep1
2031 2043
> bamFlagAsBitMatrix(flag)
isPaired isProperPair isUnmappedQuery hasUnmappedMate isMinusStrand
keep0 1 1 1 1 0
keep1 1 1 0 1 1
isMateMinusStrand isFirstMateRead isSecondMateRead isSecondaryAlignment
keep0 1 1 1 1
keep1 1 1 1 1
isNotPassingQualityControls isDuplicate
keep0 1 1
keep1 1 1
>
>
>
>
>
> dev.off()
null device
1
>