Last data update: 2014.03.03

R: Plotting density maps of sequence pattern occurrence

plotPatternDensityMap

R Documentation

Plotting density maps of sequence pattern occurrence

Description

Plots density of sequence pattern occurrences in an ordered set of sequences of the same length in the form of a two dimensional map centered at a common reference position. Multiple sequence patterns can be processed at once and one plot per pattern will be created with the same color scale across all plots, allowing visual density comparison across different patterns.
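The idea of a pattern density map can be illustrated with a small sketch. This is not the package's implementation: it assumes toy random sequences, uses base R `gregexpr` to locate a fixed pattern, and calls `bkde2D` from the KernSmooth package (the function named under `nBin` and `bandWidth`) directly. The names `seqs`, `hits` and `dens` are hypothetical.

```r
library(KernSmooth)  # provides bkde2D

set.seed(1)
# Toy input (hypothetical data): 50 random sequences, each 200 bp long
seqs <- replicate(50, paste(sample(c("A", "C", "G", "T"), 200, replace = TRUE),
                            collapse = ""))

# One row per occurrence of "TA": x = start position, y = sequence index
hits <- do.call(rbind, lapply(seq_along(seqs), function(i) {
    pos <- gregexpr("TA", seqs[i], fixed = TRUE)[[1]]
    if (pos[1] == -1) return(NULL)  # no occurrence in this sequence
    cbind(x = as.integer(pos), y = i)
}))

# 2D binned kernel density estimate: first dimension runs along the
# nucleotides (x-axis), second across the sequences (y-axis)
dens <- bkde2D(hits, bandwidth = c(3, 3), gridsize = c(200L, 50L),
               range.x = list(c(1, 200), c(1, 50)))
dim(dens$fhat)
```

The resulting matrix `dens$fhat` has one row per x-axis bin and one column per y-axis bin, and could be drawn with `image(dens$x1, dens$x2, dens$fhat)`.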

Usage

plotPatternDensityMap(regionsSeq, patterns, seqOrder = c(1:length(regionsSeq)),
    flankUp = NULL, flankDown = NULL, nBin = NULL, bandWidth = NULL, 
    color = "blue", transf = NULL, xTicks = NULL, xTicksAt = NULL, xLabel = "", 
    yTicks = NULL, yTicksAt = NULL, yLabel = "", cexAxis = 8, plotScale = TRUE,
    scaleLength = NULL, scaleWidth = 15, addPatternLabel = TRUE, cexLabel = 8,
    labelCol = "black", addReferenceLine = TRUE, plotColorLegend = TRUE,
    outFile = "PatternDensityMap", plotWidth = 2000, plotHeight = 2000,
    useMulticore = FALSE, nrCores = NULL)

Arguments

`regionsSeq`	A `DNAStringSet` object. Set of sequences of the same length for which the pattern occurrence density should be visualised.
`patterns`	Character vector specifying one or more DNA sequence patterns (oligonucleotides). IUPAC ambiguity codes can be used and will match any letter in the subject that is associated with the code.
`seqOrder`	Integer vector specifying the order of the provided input sequences. Must have the same length as the number of sequences in `regionsSeq`. Input sequences will be sorted according to this index in ascending order from the top to the bottom of the plot, i.e. the sequence labeled with the lowest number will appear at the top of the plot. The default value keeps the sequences in the order in which they appear in the input `regionsSeq` object.
`flankUp, flankDown`	The number of base-pairs upstream and downstream of the reference position in the provided sequences, respectively. `flankUp + flankDown` must equal the length of the sequences. If no values are provided, both `flankUp` and `flankDown` are set to half of the length of the input sequences, i.e. the reference position is assumed to be in the middle of the sequences.
`nBin`	Numeric vector with two values containing the number of equally spaced points in each direction over which the density is to be estimated. The first value specifies number of bins along x-axis, i.e. along the nucleotides in the sequence, and the second value specifies the number of bins along y-axis, i.e. across ordered input sequences. The values are passed on to the `gridsize` argument of the `bkde2D` function to compute a 2D binned kernel density estimate. If `nBin` is not specified it will default to `c(n, m)`, where `n` is the number of input sequences and `m` is the length of sequences.
`bandWidth`	Numeric vector of length 2, containing the bandwidth to be used in each coordinate direction. The first value specifies the bandwidth along the x-axis, i.e. along the nucleotides in the sequence, and the second value specifies the bandwidth along y-axis, i.e. across ordered input sequences. The values are passed on to the `bandwidth` argument of the `bkde2D` function to compute a 2D binned kernel density estimate and are used as standard deviation of the bivariate Gaussian kernel. If `bandWidth` is not specified it will default to `c(3,3)`.
`color`	Character specifying the color palette for the density plot. One of the following color palettes can be specified: `"blue", "brown", "cyan", "gold", "gray", "green", "pink", "purple", "red"`. Please refer to the vignette for the appearance of these palettes.
`transf`	The function mapping the density scale to the color scale. See Details.
`xTicks`	Character vector of labels to be placed at the tick-marks on x-axis. The default `NULL` value produces five tick-marks: one at the reference point and two equally spaced tick-marks both upstream and downstream of the reference point.
`xTicksAt`	Numeric vector of positions of the tick-marks on the x-axis. The values can range from 1 (the position of the first base-pair in the sequence) to input sequence length. The default `NULL` value produces five tick-marks: one at the reference point and two equally spaced tick-marks both upstream and downstream of the reference point.
`xLabel`	The label for the x-axis. The default is no label, i.e. empty string.
`yTicks`	Character vector of labels to be placed at the tick-marks on y-axis. The default `NULL` value produces no tick-marks and labels.
`yTicksAt`	Numeric vector of positions of the tick-marks on the y-axis. The values can range from 1 (the position of the last sequence at the bottom of the plot) to the number of input sequences (the position of the first sequence at the top of the plot). The default `NULL` value produces no tick-marks.
`yLabel`	The label for the y-axis. The default is no label, i.e. empty string.
`cexAxis`	The magnification to be used for axis annotation.
`plotScale`	Logical, should the scale bar be plotted in the lower left corner of the plot.
`scaleLength`	The length of the scale bar to be plotted. Used only when `plotScale = TRUE`. If no value is provided, it defaults to one fifth of the input sequence length.
`scaleWidth`	The width of the line for the scale bar. Used only when `plotScale = TRUE`.
`addPatternLabel`	Logical, should the pattern label be written in the upper left corner of the plot.
`cexLabel`	The magnification to be used for pattern label.
`labelCol`	The color to be used for pattern label and scale bar.
`addReferenceLine`	Logical, should the vertical dashed line be drawn at the reference point.
`plotColorLegend`	Logical, should the color legend for the pattern density be plotted. If `TRUE` a separate .png file named `outFile`."ColorLegend.png" will be created, showing mapping of pattern density values to colours.
`outFile`	Character vector specifying the base name of the output plot file. The final name of the plot file for each pattern will be `outFile`."pattern.png".
`plotWidth, plotHeight`	Width and height of the density plot(s) in pixels.
`useMulticore`	Logical, should multicore be used. `useMulticore = TRUE` is supported only on Unix-like platforms.
`nrCores`	Number of cores to use when `useMulticore = TRUE`. Default value `NULL` uses all detected cores.
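The constraint on `flankUp` and `flankDown` described above can be sketched as a small check. `resolveFlanks` is a hypothetical helper, not a function of the package; treating a single missing flank as an error is an assumption, since the documentation only covers the case where neither value is given.

```r
# Hypothetical helper (not part of seqPattern): resolve and validate flanks
resolveFlanks <- function(seqLen, flankUp = NULL, flankDown = NULL) {
    if (is.null(flankUp) && is.null(flankDown)) {
        # Documented default: reference position in the middle of the sequences
        flankUp <- flankDown <- seqLen / 2
    }
    if (is.null(flankUp) || is.null(flankDown)) {
        # Assumption: supplying only one flank is treated as an error here
        stop("provide both flankUp and flankDown, or neither")
    }
    if (flankUp + flankDown != seqLen) {
        stop("flankUp + flankDown must equal the sequence length")
    }
    c(flankUp = flankUp, flankDown = flankDown)
}

resolveFlanks(1000)            # reference position assumed in the middle
resolveFlanks(1000, 400, 600)  # values used in the Examples section
```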

Value

The function produces PNG files in the working directory, visualising the density of pattern occurrence in the set of ordered input sequences. One file (plot) is created per specified pattern.
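Because the plots are written to files rather than returned, it can be handy to list what was produced after a call. The sketch below assumes only that the output names start with the `outFile` base name and end in `.png`; it does not rely on the exact naming scheme, and `pngFiles` is a hypothetical variable name.

```r
outFile <- "PatternDensityMap"  # default base name from the Usage section

# List PNG files in the working directory whose names start with the base name
pngFiles <- list.files(pattern = paste0("^", outFile, ".*\\.png$"))
print(pngFiles)
```

If no plots have been created yet, `pngFiles` is simply an empty character vector.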

Author(s)

Vanja Haberle

References

Haberle et al. (2014) Two independent transcription initiation codes overlap on vertebrate core promoters, Nature 507:381-385.

Examples

library(GenomicRanges)
load(system.file("data", "zebrafishPromoters.RData", package="seqPattern"))

promoterWidth <- elementMetadata(zebrafishPromoters)$interquantileWidth

# dinucleotide patterns
plotPatternDensityMap(regionsSeq = zebrafishPromoters, patterns = c("TA", "GC"),
            seqOrder = order(promoterWidth), flankUp = 400, flankDown = 600, 
            color = "blue")

# motif consensus sequence
plotPatternDensityMap(regionsSeq = zebrafishPromoters, patterns = "TATAWAWR",
            seqOrder = order(promoterWidth), flankUp = 400, flankDown = 600,
            color = "cyan")
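The consensus pattern `"TATAWAWR"` in the second example contains IUPAC ambiguity codes (`W` stands for A or T, `R` for A or G). As an illustration of what such a pattern matches, the sketch below expands IUPAC codes into a base R regular expression. This is not how the package performs matching; `iupac` and `iupacToRegex` are hypothetical names introduced here.

```r
# Standard IUPAC nucleotide codes expanded to regular expression classes
iupac <- c(A = "A", C = "C", G = "G", T = "T",
           R = "[AG]", Y = "[CT]", S = "[CG]", W = "[AT]",
           K = "[GT]", M = "[AC]",
           B = "[CGT]", D = "[AGT]", H = "[ACT]", V = "[ACG]",
           N = "[ACGT]")

# Hypothetical helper: translate an IUPAC pattern into a regular expression
iupacToRegex <- function(pattern) {
    codes <- strsplit(pattern, "")[[1]]
    stopifnot(all(codes %in% names(iupac)))
    paste(iupac[codes], collapse = "")
}

rx <- iupacToRegex("TATAWAWR")
rx                                              # "TATA[AT]A[AT][AG]"
grepl(rx, c("GGTATAAATGCC", "GGTATACATGCC"))    # TRUE FALSE
```

The first test sequence contains `TATAAATG`, which satisfies every position of the consensus; the second has a C at the first `W` position and therefore does not match.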

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(seqPattern)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/seqPattern/plotPatternDensityMap.Rd_%03d_medium.png", width=480, height=480)
> ### Name: plotPatternDensityMap
> ### Title: Plotting density maps of sequence pattern occurrence
> ### Aliases: plotPatternDensityMap
> ###   plotPatternDensityMap,DNAStringSet-method
> 
> ### ** Examples
> 
> library(GenomicRanges)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
> load(system.file("data", "zebrafishPromoters.RData", package="seqPattern"))
> 
> promoterWidth <- elementMetadata(zebrafishPromoters)$interquantileWidth
> 
> # dinucleotide patterns
> plotPatternDensityMap(regionsSeq = zebrafishPromoters, patterns = c("TA", "GC"),
+             seqOrder = order(promoterWidth), flankUp = 400, flankDown = 600, 
+             color = "blue")

Getting oligonucleotide occurrence matrix...

Calculating density...
->TA
->GC

Plotting...
->TA
->GC
> 
> # motif consensus sequence
> plotPatternDensityMap(regionsSeq = zebrafishPromoters, patterns = "TATAWAWR",
+             seqOrder = order(promoterWidth), flankUp = 400, flankDown = 600,
+             color = "cyan")

Getting oligonucleotide occurrence matrix...

Calculating density...
->TATAWAWR

Plotting...
->TATAWAWR
> dev.off()
null device 
          1 
>