The function produces a base-pair resolution matrix of scores for the given
equal-width windows of interest. The returned matrix can be used to
draw meta-profiles or heatmaps of read coverage or wig track-like data.
The windows argument can be predefined regions around transcription start sites
or other regions of interest that have equal lengths.
The function removes all windows that fall off the Rle object, that is, windows
that have a start coordinate < 1 or an end coordinate > length(Rle).
The function takes the intersection of names in the Rle and GRanges objects.
On Windows OS the function will give an error if the target is a file in .bigWig format.
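The window-filtering rule described above (windows with start < 1 or end > length(Rle) are dropped) can be illustrated with a minimal base R sketch. The chromosome length and the `windows` data frame below are made-up stand-ins for an Rle and a GRanges object. This is an illustration of the rule, not the package's internal code.

```r
# Toy stand-in for the length of one chromosome's coverage vector
# (the real function works on Rle objects).
chr_length <- 100L

# Hypothetical windows, all of width 10, given as start/end coordinates.
windows <- data.frame(
  start = c(-4L,  1L, 50L,  91L,  95L),
  end   = c( 5L, 10L, 59L, 100L, 104L)
)

# Keep only windows that lie fully inside the vector:
# start >= 1 and end <= length of the vector.
inside <- windows$start >= 1L & windows$end <= chr_length
kept <- windows[inside, ]
kept
```

Because dropped windows do not appear in the result, the returned matrix can have fewer rows than the number of windows supplied.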
target
RleList, GRanges, a BAM file or a BigWig file
to be overlapped with ranges in windows
windows
GRanges object that contains the windows of interest.
These could be promoters, CpG islands, exons or introns.
However, all windows must have the same width.
strand.aware
If TRUE (default: FALSE), the strands of the
windows will be taken into account in the resulting
ScoreMatrix.
If the strand of a window is -, the values of the bins
for that window will be reversed.
weight.col
If the target is a GRanges object, a numeric column
in its metadata can be used as weights. This is particularly
useful when genomic regions have scores other than their
coverage values, such as percent methylation, conservation
scores, GC content, etc.
is.noCovNA
(Default: FALSE)
If TRUE, and if 'target' is a GRanges object with 'weight.col'
provided, the bases that are uncovered will be preserved as
NA in the returned object. This is useful for situations where
you cannot have coverage all over the genome, such as CpG
methylation values.
type
if target is a character vector of file paths, then type designates
the type of the corresponding files (bam or bigWig)
rpm
boolean indicating whether to normalize the coverage to reads
per million. FALSE by default. See library.size.
unique
boolean which tells the function to remove duplicated reads
based on chr, start, end and strand
extend
numeric which tells the function to extend the reads to width=extend
param
ScanBamParam object
bam.paired.end
boolean indicating whether the given BAM file contains
paired-end reads (default: FALSE).
Paired reads will be treated as fragments.
library.size
numeric indicating the total number of mapped reads in a BAM file
(rpm has to be set to TRUE).
If it is not given (default: NULL), the library size
is calculated using the Rsamtools package functions:
sum(countBam(BamFile(target))$records).
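The relationship between rpm and library.size can be sketched in base R, assuming the usual reads-per-million definition (raw coverage multiplied by 1e6 and divided by the number of mapped reads). The coverage matrix and the library size below are invented for illustration, and the exact arithmetic used inside the package is not confirmed by this page.

```r
# Toy base-pair coverage for two windows of width 5
# (rows = windows, columns = bases).
cov_mat <- matrix(c(0, 2, 4, 2, 0,
                    1, 1, 3, 1, 1),
                  nrow = 2, byrow = TRUE)

# Assumed total number of mapped reads. In ScoreMatrix this would come from
# library.size, or be counted from the BAM file when library.size is NULL.
library_size <- 2e6

# Scale raw coverage to reads per million.
rpm_mat <- cov_mat * 1e6 / library_size
rpm_mat
```

Passing library.size explicitly lets every sample be scaled by a value you control, rather than by a count taken from each BAM file.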
Value
returns a ScoreMatrix object
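With strand.aware=TRUE, rows of the returned matrix that belong to windows on the - strand are reversed, so that all rows read in the same orientation. The following is a minimal base R sketch of that reversal on a plain matrix. The `scores` and `strands` objects are made up, and this is not the package's internal code.

```r
# Toy score matrix: one row per window, one column per base (width 5).
scores <- matrix(1:15, nrow = 3, byrow = TRUE)

# Hypothetical strands of the three windows.
strands <- c("+", "-", "+")

# Reverse the column order of rows whose window lies on the minus strand.
minus <- strands == "-"
oriented <- scores
oriented[minus, ] <- scores[minus, ncol(scores):1, drop = FALSE]
oriented
```

After the reversal, column 1 corresponds to the 5' end of every window regardless of strand, which is what makes averaging across rows meaningful for meta-profiles.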
Note
We assume that a paired-end BAM file contains reads with unique ids, and we remove
both mates of reads if they are repeated. Because ScoreMatrix
uses the GenomicAlignments::readGAlignmentPairs function to read paired-end BAM files,
reads are duplicated when the mates of one pair map into two different windows.
Strands of reads in a paired-end BAM file are inferred from the strand of the
first alignment in the pair. This is the default setting in the
GenomicAlignments::readGAlignmentPairs function (see the strandMode argument).
This mode should be used when the paired-end data was generated using
one of the following stranded protocols:
Directional Illumina (Ligation), Standard SOLiD.
See Also
ScoreMatrixBin
Examples
# When target is GRanges
data(cage)
data(promoters)
scores1=ScoreMatrix(target=cage,windows=promoters,strand.aware=TRUE,
weight.col="tpm")
# When target is RleList
library(GenomicRanges)
covs = coverage(cage)
scores2 = ScoreMatrix(target=covs,windows=promoters,strand.aware=TRUE)
scores2
# When target is a bam file
bam.file = system.file('unitTests/test.bam', package='genomation')
windows = GRanges(rep(c(1,2),each=2), IRanges(rep(c(1,2), times=2), width=5))
scores3 = ScoreMatrix(target=bam.file,windows=windows, type='bam')
scores3
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
> library(genomation)
Loading required package: grid
> ### Name: ScoreMatrix
> ### Title: Get base-pair score for bases in each window
> ### Aliases: ScoreMatrix ScoreMatrix,GRanges,GRanges-method
> ### ScoreMatrix,RleList,GRanges-method
> ### ScoreMatrix,character,GRanges-method
>
> ### ** Examples
>
> # When target is GRanges
> data(cage)
> data(promoters)
> scores1=ScoreMatrix(target=cage,windows=promoters,strand.aware=TRUE,
+ weight.col="tpm")
>
>
> # When target is RleList
> library(GenomicRanges)
> covs = coverage(cage)
> scores2 = ScoreMatrix(target=covs,windows=promoters,strand.aware=TRUE)
> scores2
scoreMatrix with dims: 1055 2001
>
> # When target is a bam file
> bam.file = system.file('unitTests/test.bam', package='genomation')
> windows = GRanges(rep(c(1,2),each=2), IRanges(rep(c(1,2), times=2), width=5))
> scores3 = ScoreMatrix(target=bam.file,windows=windows, type='bam')
> scores3
scoreMatrix with dims: 4 5