Plots the density of motif occurrences in an ordered set of sequences of the
same length as a two-dimensional map centered at a common reference
position. The motif is specified by a position weight matrix (PWM) that contains
the estimated probability of base b at position i; only motif hits scoring above
the specified threshold are counted and plotted.
regionsSeq
A DNAStringSet object. The set of sequences of the same length
for which the motif occurrence density should be visualised.
motifPWM
A numeric matrix representing the Position Weight Matrix (PWM), such as
the one returned by the PWM function. Can contain either probabilities
or log2 probability ratios of base b at position i.
minScore
The minimum score for counting a motif hit. Can be given as a character
string containing a percentage (e.g. "85%") of the
PWM score or a single number specifying the score threshold. If a percentage
is given, it is converted to a score value taking into account both
minimal and maximal possible PWM scores as follows:
minPWMscore + percThreshold/100 * (maxPWMscore - minPWMscore)
This differs from the formula used by the matchPWM function
from the Biostrings package, which takes into account only the
maximal possible PWM score and treats the given percentage as a
percentage of that maximal score:
percThreshold/100 * maxPWMscore
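As a minimal sketch of the two conversions above, the following base R
snippet computes both thresholds for a small made-up PWM. It assumes that the
minimal and maximal possible PWM scores are the sums of the per-position
minima and maxima; the matrix values and all variable names are illustrative
and are not taken from the package.

```r
# Toy 4 x 3 PWM with made-up probabilities (rows = bases, columns = positions)
pwm <- matrix(c(0.70, 0.10, 0.10, 0.10,
                0.10, 0.60, 0.20, 0.10,
                0.25, 0.25, 0.25, 0.25),
              nrow = 4,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# Assumption: the extreme scores are reached by picking the lowest or the
# highest entry at every position
minPWMscore <- sum(apply(pwm, 2, min))
maxPWMscore <- sum(apply(pwm, 2, max))

percThreshold <- 85

# Conversion described for minScore: percentage of the min-to-max range
thresholdRange <- minPWMscore +
    percThreshold / 100 * (maxPWMscore - minPWMscore)

# Conversion described for matchPWM: percentage of the maximal score only
thresholdMax <- percThreshold / 100 * maxPWMscore

print(c(range = thresholdRange, max = thresholdMax))
```

Whenever the minimal possible score is positive, as in this toy matrix, the
range-based threshold is the stricter of the two.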
seqOrder
Integer vector specifying the order of the provided input sequences.
Must have the same length as the number of sequences in
regionsSeq. Input sequences will be sorted according to this index
in ascending order from the top to the bottom of the plot, i.e.
the sequence labeled with the lowest number will appear at the top of
the plot. The default value keeps the sequences in the order in which
they appear in the input regionsSeq object.
flankUp, flankDown
The number of base-pairs upstream and downstream of the reference
position in the provided sequences, respectively.
flankUp + flankDown must sum up to the length of the sequences.
If no values are provided both flankUp and flankDown are
set to be half of the length of the input sequences, i.e. the
reference position is assumed to be in the middle of the sequences.
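The constraint and the default described above can be sketched in base R as
follows. The helper resolveFlanks is hypothetical and not part of seqPattern;
in particular, raising an error when only one of the two values is supplied is
an assumption of this sketch, not documented behaviour.

```r
# Hypothetical helper (not part of seqPattern): resolve the flank sizes for
# sequences of length seqLength following the rules described above
resolveFlanks <- function(seqLength, flankUp = NULL, flankDown = NULL) {
    if (is.null(flankUp) && is.null(flankDown)) {
        # Default: the reference position is in the middle of the sequences
        flankUp <- seqLength / 2
        flankDown <- seqLength / 2
    }
    # Assumption of this sketch: both values must be known at this point
    if (is.null(flankUp) || is.null(flankDown) ||
        flankUp + flankDown != seqLength) {
        stop("flankUp + flankDown must equal the sequence length")
    }
    c(flankUp = flankUp, flankDown = flankDown)
}

explicitFlanks <- resolveFlanks(1000, flankUp = 400, flankDown = 600)
defaultFlanks <- resolveFlanks(1000)
```

With 1000 bp sequences, flankUp = 400 and flankDown = 600 pass the check,
whereas flankUp = 300 and flankDown = 600 would be rejected because they do
not add up to the sequence length.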
nBin
Numeric vector with two values containing the number of equally spaced
points in each direction over which the density is to be estimated. The
first value specifies number of bins along x-axis, i.e. along the
nucleotides in the sequence, and the second value specifies the number
of bins along y-axis, i.e. across ordered input sequences. The
values are passed on to the gridsize argument of the
bkde2D function to compute a 2D binned kernel density
estimate. If nBin is not specified it will default to
c(n, m), where n is the number of input sequences and
m is the length of sequences.
bandWidth
Numeric vector of length 2, containing the bandwidth to be used in each
coordinate direction. The first value specifies the bandwidth along the
x-axis, i.e. along the nucleotides in the sequence, and the
second value specifies the bandwidth along y-axis, i.e. across
ordered input sequences. The values are passed on to the
bandwidth argument of the bkde2D function to
compute a 2D binned kernel density estimate and are used as standard
deviation of the bivariate Gaussian kernel. If bandWidth is not
specified it will default to c(3,3).
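To illustrate how nBin and bandWidth map onto the gridsize and bandwidth
arguments, here is a minimal sketch that calls bkde2D from the KernSmooth
package directly on made-up hit coordinates. The data, the dimensions and the
variable names are assumptions chosen for the example; this is not the
package's internal code.

```r
library(KernSmooth)

set.seed(1)
nSeq <- 50     # number of input sequences (made-up)
seqLen <- 200  # length of each sequence (made-up)

# Made-up motif hits: first column = position along the sequence (x-axis),
# second column = index of the sequence in the ordered set (y-axis)
hits <- cbind(position = sample(seqLen, 300, replace = TRUE),
              sequence = sample(nSeq, 300, replace = TRUE))

nBin <- c(200L, 50L)  # bins along x and along y
bandWidth <- c(3, 3)  # kernel standard deviation along x and along y

est <- bkde2D(x = hits, bandwidth = bandWidth, gridsize = nBin)

# est$fhat holds the density estimate: one row per x bin, one column per y bin
dim(est$fhat)
```

The returned list also contains the grid coordinates in est$x1 and est$x2,
which can be passed together with est$fhat to a plotting function such as
image.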
color
Character specifying the color palette for the density plot. One of the
following color palettes can be specified: "blue", "brown",
"cyan", "gold", "gray", "green", "pink", "purple", "red". Please refer
to the vignette for the appearance of these palettes.
transf
The function mapping the density scale to the color scale. See Details.
xTicks
Character vector of labels to be placed at the tick-marks on x-axis.
The default NULL value produces five tick-marks: one at the
reference point and two equally spaced tick-marks both upstream and
downstream of the reference point.
xTicksAt
Numeric vector of positions of the tick-marks on the x-axis. The values
can range from 1 (the position of the first base-pair in the sequence)
to input sequence length. The default NULL value produces five
tick-marks: one at the reference point and two equally spaced tick-marks
both upstream and downstream of the reference point.
xLabel
The label for the x-axis. The default is no label, i.e. empty
string.
yTicks
Character vector of labels to be placed at the tick-marks on y-axis.
The default NULL value produces no tick-marks and labels.
yTicksAt
Numeric vector of positions of the tick-marks on the y-axis. The values
can range from 1 (the position of the last sequence at the bottom of the
plot) to the number of input sequences (the position of the first sequence
at the top of the plot). The default NULL value produces no
tick-marks.
yLabel
The label for the y-axis. The default is no label, i.e. empty
string.
cexAxis
The magnification to be used for axis annotation.
plotScale
Logical, should the scale bar be plotted in the lower left corner of the
plot.
scaleLength
The length of the scale bar to be plotted. Used only when
plotScale = TRUE. If no value is provided, it defaults to one
fifth of the input sequence length.
scaleWidth
The width of the line for the scale bar. Used only when
plotScale = TRUE.
addReferenceLine
Logical, should the vertical dashed line be drawn at the reference
point.
plotColorLegend
Logical, should the color legend for the pattern density be plotted. If
TRUE a separate .png file named outFile."ColorLegend.png"
will be created, showing mapping of pattern density values to colours.
outFile
Character vector specifying the base name of the output plot file. The
final name of the plot file for each pattern will be
outFile."pattern.jpg".
plotWidth, plotHeight
Width and height of the density plot in pixels.
Value
The function produces a PNG file in the working directory, visualising
the density of motif occurrences above the specified threshold in the set
of ordered input sequences.
Author(s)
Vanja Haberle
References
Haberle et al. (2014) Two independent transcription initiation codes
overlap on vertebrate core promoters, Nature 507:381-385.
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
> library(seqPattern)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/seqPattern/plotMotifDensityMap.Rd_%03d_medium.png", width=480, height=480)
> ### Name: plotMotifDensityMap
> ### Title: Plotting density maps of motif occurrence
> ### Aliases: plotMotifDensityMap
> ### plotMotifDensityMap,DNAStringSet,matrix-method
>
> ### ** Examples
>
> library(GenomicRanges)
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
> load(system.file("data", "zebrafishPromoters.RData", package="seqPattern"))
> promoterWidth <- elementMetadata(zebrafishPromoters)$interquantileWidth
>
> load(system.file("data", "TBPpwm.RData", package="seqPattern"))
>
> plotMotifDensityMap(regionsSeq = zebrafishPromoters, motifPWM = TBPpwm,
+ minScore = "85%", seqOrder = order(promoterWidth),
+ flankUp = 400, flankDown = 600, color = "red")
Getting motif occurrence matrix...
Calculating density...
->motif
Plotting...
->motif
There were 12 warnings (use warnings() to see them)
>
> dev.off()
null device
1
>