Named list with list elements containing data frames
representing samples. Data frame rows should represent read pileups observed
in sequencing data. Data frame column names must include "end" and "cov"
corresponding to the base end position and coverage of a pileup respectively.
Data within data frames must be on the same chromosome as the region of
interest, see details!
txdb
Object of class TxDb giving transcription meta data for a genome
assembly. See Bioconductor annotation packages.
gr
Object of class GRanges specifying the region of interest and
corresponding to a single gene. See Bioconductor package GRanges.
genome
Object of class BSgenome specifying the genome sequence of
interest. See Bioconductor annotation packages.
reduce
Boolean specifying whether to collapse gene isoforms within the
region of interest into one representative transcript. Experimental use with
caution!
gene_colour
Character string specifying the colour of the gene to be
plotted in the gene track.
gene_name
Character string specifying the name of the gene or region
of interest.
gene_plotLayer
Valid ggplot2 layer to be added to the gene sub-plot.
label_bgFill
Character string specifying the desired background colour
of the track labels.
label_txtFill
Character string specifying the desired text colour of
the rack labels.
label_borderFill
Character string specifying the desired border colour
of the track labels.
label_txtSize
Integer specifying the size of the text within the track
labels.
lab2plot_ratio
Numeric vector of length 2 specifying the ratio of
track labels to plot space.
cov_colour
Character string specifying the colour of the data in the
coverage plots.
cov_plotType
Character string specifying one of "line",
"bar" or "point". Changes the ggplot2 geom which constructs the data display.
cov_plotLayer
Valid ggplot2 layer to be added to the coverage
sub-plots.
base
Numeric vector of log bases to transform the data
corresponding to the elements supplied to the variable transform See details.
transform
Character vector specifying what objects to log transform,
accepts "Intron", "CDS", and "UTR" See details.
gene_labelTranscript
Boolean specifying whether to plot the transcript
names in the gene plot.
gene_labelTranscriptSize
Integer specifying the size of the transcript
name text in the gene plot.
gene_isoformSel
Character vector specifying the names
(from the txdb object) of isoforms within the region of interest to display.
out
Character vector specifying the object to output, one of
"data", "grob", or "plot", defaults to "plot" (see returns).
subsample
Boolean value specifying whether to reduce the provided
coverage data to a subset of approximately 1000 points. Used to generate
sparse plots that use less disk space and are faster to render.
Details
genCov is a function designed construct a series of tracks based on
a TxDb object giving transcript features, and coverage data supplied to
parameter 'x'. The function will look at a region of interest specified by
the argument supplied to gr and plot transcript features and the
corresponding coverage information. The argument supplied to 'genome' enables
gc content within genomic features to be calculated and displayed. The
argument supplied to x must contain data on the same chromosome as the region
of interest specified in the parameter 'gr'!
Typically, introns of a transcript are much larger than exons, while exons are
sometimes of greater interest. To address this, genCov will by default
scale the x-axis to expand track information according to region type: coding
sequence (CDS), untranslated region (UTR), or intron / intergenic (Intron).
The amount by which each region is scaled is controlled by the 'base' and
'transform' arguments. 'transform' specifies which regions to scale, and 'base'
corresponds to the log base transform to apply to those regions. To keep one or
more region types from being scaled, omit the corresponding entries from the
'base' and 'transform' vectors.
Value
One of the following, a list of dataframes containing data to be
plotted, a grob object, or a plot.
Examples
# Load transcript meta data
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
# Load BSgenome
library(BSgenome.Hsapiens.UCSC.hg19)
genome <- BSgenome.Hsapiens.UCSC.hg19
# Define a region of interest
gr <- GRanges(seqnames=c("chr10"),
ranges=IRanges(start=c(89622195), end=c(89729532)), strand=strand(c("+")))
# Create Data for input
start <- c(89622194:89729524)
end <- c(89622195:89729525)
chr <- 10
cov <- c(rnorm(100000, mean=40), rnorm(7331, mean=10))
cov_input_A <- as.data.frame(cbind(chr, start, end, cov))
start <- c(89622194:89729524)
end <- c(89622195:89729525)
chr <- 10
cov <- c(rnorm(50000, mean=40), rnorm(7331, mean=10), rnorm(50000, mean=40))
cov_input_A <- as.data.frame(cbind(chr, start, end, cov))
# Define the data as a list
data <- list("Sample A"=cov_input_A)
# Call genCov
genCov(data, txdb, gr, genome, gene_labelTranscriptSize=3)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(GenVisR)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/GenVisR/genCov.Rd_%03d_medium.png", width=480, height=480)
> ### Name: genCov
> ### Title: Construct a region of interest coverage plot
> ### Aliases: genCov
>
> ### ** Examples
>
> # Load transcript meta data
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
Loading required package: GenomicFeatures
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
>
> # Load BSgenome
> library(BSgenome.Hsapiens.UCSC.hg19)
Loading required package: BSgenome
Loading required package: Biostrings
Loading required package: XVector
Loading required package: rtracklayer
> genome <- BSgenome.Hsapiens.UCSC.hg19
>
> # Define a region of interest
> gr <- GRanges(seqnames=c("chr10"),
+ ranges=IRanges(start=c(89622195), end=c(89729532)), strand=strand(c("+")))
>
> # Create Data for input
> start <- c(89622194:89729524)
> end <- c(89622195:89729525)
> chr <- 10
> cov <- c(rnorm(100000, mean=40), rnorm(7331, mean=10))
> cov_input_A <- as.data.frame(cbind(chr, start, end, cov))
>
> start <- c(89622194:89729524)
> end <- c(89622195:89729525)
> chr <- 10
> cov <- c(rnorm(50000, mean=40), rnorm(7331, mean=10), rnorm(50000, mean=40))
> cov_input_A <- as.data.frame(cbind(chr, start, end, cov))
>
> # Define the data as a list
> data <- list("Sample A"=cov_input_A)
>
> # Call genCov
> genCov(data, txdb, gr, genome, gene_labelTranscriptSize=3)
Obtaining CDS Coordinates
'select()' returned 1:many mapping between keys and columns
Obtaining UTR Coordinates
'select()' returned 1:many mapping between keys and columns
Calculating transform
Mapping coverage data onto transformed gene-space
NULL
>
>
>
>
>
> dev.off()
null device
1
>