The GRanges class is a container for the genomic locations and
their associated annotations.
Details
GRanges is a vector of genomic locations and associated
annotations. Each element in the vector is comprised of a sequence name,
an interval, a strand, and optional metadata columns (e.g. score, GC
content, etc.). This information is stored in four components:
seqnames
a 'factor' Rle object
containing the sequence names.
ranges
an IRanges object containing
the ranges.
strand
a 'factor' Rle object containing
the strand information.
mcols
a DataFrame object
containing the metadata columns. Columns cannot be named
"seqnames", "ranges", "strand",
"seqlevels", "seqlengths", "isCircular",
"start", "end", "width", or "element".
seqinfo
a Seqinfo object containing information
about the set of genomic sequences present in the GRanges object.
Constructor
GRanges(seqnames=NULL, ranges=NULL, strand=NULL,
..., seqlengths=NULL, seqinfo=NULL):
Creates a GRanges object.
seqnames
NULL, or an Rle object, character vector,
or factor containing the sequence names.
ranges
NULL, or an IRanges object containing the
ranges.
strand
NULL, or an Rle object, character vector,
or factor containing the strand information.
...
Optional metadata columns. These columns cannot be named
"start", "end", "width", or
"element".
seqlengths
NULL, or an integer vector named with levels(seqnames)
and containing the lengths (or NA) for each level in
levels(seqnames).
seqinfo
NULL, or a Seqinfo object containing
allowed sequence names, lengths (or NA), and circularity flag,
for each level in levels(seqnames).
If ranges is not supplied and/or NULL then the constructor
proceeds in 2 steps:
An initial GRanges object is created with
as(seqnames, "GRanges").
Then this GRanges object is updated according to whatever
non-NULL remaining arguments were passed to the call to
GRanges().
As a consequence of this behavior, GRanges(x) is equivalent to
as(x, "GRanges").
Coercion
In the code snippets below, x is a GRanges object.
as(from, "GRanges"): Creates a GRanges object from a character
vector, a factor, or a RangedData, or RangesList object.
When from is a character vector (or a factor), each element
in it must represent a genomic range in format chr1:2501-2800
(unstranded range) or chr1:2501-2800:+ (stranded range).
.. is also supported as a separator between the start and end
positions. Strand can be +, -, *, or missing.
The names on from are propagated to the returned GRanges object.
See as.character() and as.factor() below for the
reverse transformations.
Coercing a data.frame or DataFrame into a GRanges object is also
supported. See makeGRangesFromDataFrame for the details.
as(from, "RangedData"):
Creates a RangedData object from a GRanges
object. The strand and metadata columns become columns
in the result. The seqlengths(from), isCircular(from),
and genome(from) vectors are stored in the metadata columns
of ranges(rd).
as(from, "RangesList"):
Creates a RangesList object from a GRanges
object. The strand and metadata columns become inner
metadata columns (i.e. metadata columns on the ranges).
The seqlengths(from), isCircular(from), and
genome(from) vectors become the metadata columns.
as.character(x, ignore.strand=FALSE):
Turn GRanges object x into a character vector where each
range in x is represented by a string in format
chr1:2501-2800:+. If ignore.strand is TRUE or if
all the ranges in x are unstranded (i.e. their strand
is set to *), then all the strings in the output are in
format chr1:2501-2800.
The names on x are propagated to the returned character vector.
Its metadata (metadata(x)) and metadata columns (mcols(x))
are ignored.
See as(from, "GRanges") above for the reverse transformation.
See as(from, "GRanges") above for the reverse transformation.
Note that table(x) is supported on a GRanges object. It is
equivalent to, but much faster than, table(as.factor(x)).
as.data.frame(x, row.names = NULL, optional = FALSE, ...):
Creates a data.frame with columns seqnames (factor),
start (integer), end (integer), width (integer),
strand (factor), as well as the additional metadata columns
stored in mcols(x). Pass an explicit
stringsAsFactors=TRUE/FALSE argument via ... to
override the default conversions for the metadata columns in
mcols(x).
In the code snippets below, x is a Seqinfo
object.
as(x, "GRanges"), as(x, "GenomicRanges"),
as(x, "RangesList"): Turns Seqinfo object
x (with no NA lengths) into a GRanges or RangesList.
Accessors
In the following code snippets, x is a GRanges object.
length(x):
Get the number of elements.
seqnames(x), seqnames(x) <- value:
Get or set the sequence names.
value can be an Rle object, a character vector,
or a factor.
ranges(x), ranges(x) <- value:
Get or set the ranges. value can be a Ranges object.
names(x), names(x) <- value:
Get or set the names of the elements.
strand(x), strand(x) <- value:
Get or set the strand. value can be an Rle object, character
vector, or factor.
mcols(x, use.names=FALSE), mcols(x) <- value:
Get or set the metadata columns.
If use.names=TRUE and the metadata columns are not NULL,
then the names of x are propagated as the row names of the
returned DataFrame object.
When setting the metadata columns, the supplied value must be NULL
or a data.frame-like object (i.e. DataTable or data.frame)
object holding element-wise metadata.
elementMetadata(x), elementMetadata(x) <- value,
values(x), values(x) <- value:
Alternatives to mcols functions. Their use is discouraged.
seqinfo(x), seqinfo(x) <- value:
Get or set the information about the underlying sequences.
value must be a Seqinfo object.
seqlevels(x), seqlevels(x, force=FALSE) <- value:
Get or set the sequence levels.
seqlevels(x) is equivalent to seqlevels(seqinfo(x))
or to levels(seqnames(x)), those 2 expressions being
guaranteed to return identical character vectors on a GRanges object.
value must be a character vector with no NAs.
See ?seqlevels for more information.
seqlengths(x), seqlengths(x) <- value:
Get or set the sequence lengths.
seqlengths(x) is equivalent to seqlengths(seqinfo(x)).
value can be a named non-negative integer or numeric vector
eventually with NAs.
isCircular(x), isCircular(x) <- value:
Get or set the circularity flags.
isCircular(x) is equivalent to isCircular(seqinfo(x)).
value must be a named logical vector eventually with NAs.
genome(x), genome(x) <- value:
Get or set the genome identifier or assembly name for each sequence.
genome(x) is equivalent to genome(seqinfo(x)).
value must be a named character vector eventually with NAs.
seqlevelsStyle(x), seqlevelsStyle(x) <- value:
Get or set the seqname style for x.
See the seqlevelsStyle generic getter and setter
in the GenomeInfoDb package for more information.
score(x), score(x) <- value: Get or set the “score”
column from the element metadata.
granges(x, use.mcols=FALSE): Gets a GRanges with
only the range information from x, unless use.mcols
is TRUE, in which case the metadata columns are also
returned. Those columns will include any "extra column slots" if
x is a specialized GenomicRanges derivative.
Ranges methods
In the following code snippets, x is a GRanges object.
start(x), start(x) <- value:
Get or set start(ranges(x)).
end(x), end(x) <- value:
Get or set end(ranges(x)).
width(x), width(x) <- value:
Get or set width(ranges(x)).
Splitting and Combining
In the code snippets below, x is a GRanges object.
append(x, values, after = length(x)):
Inserts the values into x at the position given by
after, where x and values are of the same
class.
c(x, ...):
Combines x and the GRanges objects in ... together.
Any object in ... must belong to the same class as x,
or to one of its subclasses, or must be NULL.
The result is an object of the same class as x.
c(x, ..., ignore.mcols=FALSE)
If the GRanges objects have metadata columns (represented as one
DataFrame per object), each such DataFrame must have the
same columns in order to combine successfully. In order to circumvent
this restraint, you can pass in an ignore.mcols=TRUE argument
which will combine all the objects into one and drop all of their
metadata columns.
split(x, f, drop=FALSE):
Splits x according to f to create a
GRangesList object.
If f is a list-like object then drop is ignored
and f is treated as if it was
rep(seq_len(length(f)), sapply(f, length)),
so the returned object has the same shape as f (it also
receives the names of f).
Otherwise, if f is not a list-like object, empty list
elements are removed from the returned object if drop is
TRUE.
Subsetting
In the code snippets below, x is a GRanges object.
x[i, j], x[i, j] <- value:
Get or set elements i with optional metadata columns
mcols(x)[,j], where i can be missing; an NA-free
logical, numeric, or character vector; or a 'logical' Rle object.
x[i, j] <- value:
Replaces elements i and optional metadata columns j
with value.
head(x, n = 6L):
If n is non-negative, returns the first n elements of the
GRanges object.
If n is negative, returns all but the last abs(n) elements
of the GRanges object.
rep(x, times, length.out, each):
Repeats the values in x through one of the following conventions:
times
Vector giving the number of times to repeat each
element if of length length(x), or to repeat the whole vector
if of length 1.
length.out
Non-negative integer. The desired length of
the output vector.
each
Non-negative integer. Each element of x is
repeated each times.
subset(x, subset):
Returns a new object of the same class as x made of the subset
using logical vector subset, where missing values are taken as
FALSE.
tail(x, n = 6L):
If n is non-negative, returns the last n elements of the
GRanges object.
If n is negative, returns all but the first abs(n) elements
of the GRanges object.
window(x, start = NA, end = NA, width = NA, frequency = NULL, delta = NULL, ...):
Extracts the subsequence window from the GRanges object using:
start, end, width
The start, end, or width
of the window. Two of the three are required.
frequency, delta
Optional arguments that specify
the sampling frequency and increment within the window.
In general, this is more efficient than using "[" operator.
window(x, start = NA, end = NA, width = NA, keepLength = TRUE) <- value:
Replaces the subsequence window specified on the left (i.e. the subsequence
in x specified by start, end and width)
by value.
value must either be of class class(x), belong to a subclass
of class(x), be coercible to class(x), or be NULL.
If keepLength is TRUE, the elements of value are
repeated to create a GRanges object with the same number of elements
as the width of the subsequence window it is replacing.
If keepLength is FALSE, this replacement method can modify
the length of x, depending on how the length of the left
subsequence window compares to the length of value.
x$name, x$name <- value:
Shortcuts for mcols(x)$name and mcols(x)$name <- value,
respectively. Provided as a convenience, for GRanges objects *only*,
and as the result of strong popular demand.
Note that those methods are not consistent with the other $
and $<- methods in the IRanges/GenomicRanges infrastructure,
and might confuse some users by making them believe that a GRanges
object can be manipulated as a data.frame-like object.
Therefore we recommend using them only interactively, and we discourage
their use in scripts or packages. For the latter, use
mcols(x)$name and mcols(x)$name <- value, instead
of x$name and x$name <- value, respectively.
Note that a GRanges object can be used to as a subscript to subset a
list-like object that has names on it. In that case, the names on the
list-like object are interpreted as sequence names.
In the code snippets below, x is a list or List object with
names on it, and the subscript gr is a GRanges object with all its
seqnames being valid x names.
x[gr]:
Return an object of the same class as x and parallel
to gr. More precisely, it's conceptually doing:
show(x):
By default the show method displays 5 head and 5 tail
elements. This can be changed by setting the global options
showHeadLines and showTailLines. If the object
length is less than (or equal to) the sum of these 2 options
plus 1, then the full object is displayed.
Note that these options also affect the display of
GAlignments and
GAlignmentPairs objects (defined in
the GenomicAlignments package), as well as other objects
defined in the IRanges and Biostrings packages (e.g.
IRanges and DNAStringSet objects).
Author(s)
P. Aboyoun and H. Pag<c3><83><c2><a8>s
See Also
makeGRangesFromDataFrame for making a GRanges object
from a data.frame or DataFrame object.
seqinfo for accessing/modifying
information about the underlying sequences of a GRanges object.
The GPos class, a memory-efficient container for storing
genomic positions, that is, genomic ranges of width 1.
GenomicRanges-comparison for comparing and ordering genomic
ranges.
findOverlaps-methods for finding/counting
overlapping genomic ranges.
intra-range-methods and
inter-range-methods for intra range and
inter range transformations of a GRanges object.
coverage-methods for computing the coverage
of a GRanges object.
setops-methods for set operations on GRanges
objects.
nearest-methods for finding the nearest
genomic range neighbor.
absoluteRanges for transforming genomic ranges into
absolute ranges (i.e. into ranges on the sequence obtained
by virtually concatenating all the sequences in a genome).
tileGenome for putting tiles on a genome.
genomicvars for manipulating genomic variables.
GRangesList objects.
Ranges objects in the IRanges package.
Vector, Rle, and
DataFrame objects in the S4Vectors package.
Examples
## ---------------------------------------------------------------------
## CONSTRUCTION
## ---------------------------------------------------------------------
## Specifying the bare minimum i.e. seqnames and ranges only. The
## GRanges object will have no names, no strand information, and no
## metadata columns:
gr0 <- GRanges(Rle(c("chr2", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
IRanges(1:10, width=10:1))
gr0
## Specifying names, strand, metadata columns. They can be set on an
## existing object:
names(gr0) <- head(letters, 10)
strand(gr0) <- Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2))
mcols(gr0)$score <- 1:10
mcols(gr0)$GC <- seq(1, 0, length=10)
gr0
## ... or specified at construction time:
gr <- GRanges(Rle(c("chr2", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
IRanges(1:10, width=10:1, names=head(letters, 10)),
Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)),
score=1:10, GC=seq(1, 0, length=10))
stopifnot(identical(gr0, gr))
## Specifying the seqinfo. It can be set on an existing object:
seqinfo <- Seqinfo(paste0("chr", 1:3), c(1000, 2000, 1500), NA, "mock1")
seqinfo(gr0) <- merge(seqinfo(gr0), seqinfo)
seqlevels(gr0) <- seqlevels(seqinfo)
## ... or specified at construction time:
gr <- GRanges(Rle(c("chr2", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
IRanges(1:10, width=10:1, names=head(letters, 10)),
Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)),
score=1:10, GC=seq(1, 0, length=10),
seqinfo=seqinfo)
stopifnot(identical(gr0, gr))
## ---------------------------------------------------------------------
## COERCION
## ---------------------------------------------------------------------
## From GRanges:
as.character(gr)
as.factor(gr)
as.data.frame(gr)
## From character to GRanges:
x1 <- "chr2:56-125"
as(x1, "GRanges")
as(rep(x1, 4), "GRanges")
x2 <- c(A=x1, B="chr1:25-30:-")
as(x2, "GRanges")
## From data.frame to GRanges:
df <- data.frame(chrom="chr2", start=11:15, end=20:24)
gr3 <- as(df, "GRanges")
## Alternatively, coercion to GRanges can be done by just calling the
## GRanges() constructor on the object to coerce:
gr1 <- GRanges(x1) # same as as(x1, "GRanges")
gr2 <- GRanges(x2) # same as as(x2, "GRanges")
gr3 <- GRanges(df) # same as as(df, "GRanges")
## Sanity checks:
stopifnot(identical(as(x1, "GRanges"), gr1))
stopifnot(identical(as(x2, "GRanges"), gr2))
stopifnot(identical(as(df, "GRanges"), gr3))
## ---------------------------------------------------------------------
## SUMMARIZING ELEMENTS
## ---------------------------------------------------------------------
table(seqnames(gr))
table(strand(gr))
sum(width(gr))
table(gr)
summary(mcols(gr)[,"score"])
## The number of lines displayed in the 'show' method are controlled
## with two global options:
longGR <- sample(gr, 25, replace=TRUE)
longGR
options(showHeadLines=7)
options(showTailLines=2)
longGR
## Revert to default values
options(showHeadLines=NULL)
options(showTailLines=NULL)
## ---------------------------------------------------------------------
## INVERTING THE STRAND
## ---------------------------------------------------------------------
invertStrand(gr)
## ---------------------------------------------------------------------
## RENAMING THE UNDERLYING SEQUENCES
## ---------------------------------------------------------------------
seqlevels(gr)
seqlevels(gr) <- sub("chr", "Chrom", seqlevels(gr))
gr
seqlevels(gr) <- sub("Chrom", "chr", seqlevels(gr)) # revert
## ---------------------------------------------------------------------
## COMBINING OBJECTS
## ---------------------------------------------------------------------
gr2 <- GRanges(seqnames=Rle(c('chr1', 'chr2', 'chr3'), c(3, 3, 4)),
IRanges(1:10, width=5),
strand='-',
score=101:110, GC=runif(10),
seqinfo=seqinfo)
gr3 <- GRanges(seqnames=Rle(c('chr1', 'chr2', 'chr3'), c(3, 4, 3)),
IRanges(101:110, width=10),
strand='-',
score=21:30,
seqinfo=seqinfo)
some.gr <- c(gr, gr2)
## all.gr <- c(gr, gr2, gr3) ## (This would fail)
all.gr <- c(gr, gr2, gr3, ignore.mcols=TRUE)
## ---------------------------------------------------------------------
## USING A GRANGES OBJECT AS A SUBSCRIPT TO SUBSET ANOTHER OBJECT
## ---------------------------------------------------------------------
## Subsetting *by* a GRanges subscript is supported only if the object
## to subset is a named list-like object:
x <- RleList(chr1=101:120, chr2=2:-8, chr3=31:40)
x[gr]
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(GenomicRanges)
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/GenomicRanges/GRanges-class.Rd_%03d_medium.png", width=480, height=480)
> ### Name: GRanges-class
> ### Title: GRanges objects
> ### Aliases: class:GenomicRanges GenomicRanges-class GenomicRanges
> ### class:GRanges GRanges-class GRanges GenomicRangesORGRangesList-class
> ### GenomicRangesORmissing-class GRanges updateObject,GRanges-method
> ### as.character,GenomicRanges-method as.factor,GenomicRanges-method
> ### as.data.frame,GenomicRanges-method
> ### coerce,GenomicRanges,RangesList-method
> ### coerce,GenomicRanges,RangedData-method
> ### coerce,Seqinfo,GenomicRanges-method coerce,Seqinfo,RangesList-method
> ### granges,GenomicRanges-method coerce,GenomicRanges,GRanges-method
> ### coerce,character,GRanges-method coerce,character,GenomicRanges-method
> ### coerce,factor,GRanges-method coerce,factor,GenomicRanges-method
> ### coerce,RangesList,GRanges-method coerce,RangedData,GRanges-method
> ### coerce,Seqinfo,GRanges-method seqnames,GRanges-method
> ### seqnames<-,GenomicRanges-method ranges,GRanges-method
> ### ranges<-,GenomicRanges-method strand,GRanges-method
> ### strand<-,GenomicRanges,ANY-method names,GenomicRanges-method
> ### names<-,GenomicRanges-method seqinfo,GRanges-method
> ### seqinfo<-,GenomicRanges-method score,GenomicRanges-method
> ### score<-,GenomicRanges-method start,GenomicRanges-method
> ### start<-,GenomicRanges-method end,GenomicRanges-method
> ### end<-,GenomicRanges-method width,GenomicRanges-method
> ### width<-,GenomicRanges-method length,GenomicRanges-method
> ### [<-,GRanges-method [,GenomicRanges,ANY-method
> ### [<-,GenomicRanges,ANY,ANY,ANY-method [,List,GenomicRanges-method
> ### [,list,GenomicRanges-method c,GenomicRanges-method
> ### window,GenomicRanges-method $,GenomicRanges-method
> ### $<-,GenomicRanges-method show,GenomicRanges-method
>
> ### ** Examples
>
> ## ---------------------------------------------------------------------
> ## CONSTRUCTION
> ## ---------------------------------------------------------------------
> ## Specifying the bare minimum i.e. seqnames and ranges only. The
> ## GRanges object will have no names, no strand information, and no
> ## metadata columns:
> gr0 <- GRanges(Rle(c("chr2", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
+ IRanges(1:10, width=10:1))
> gr0
GRanges object with 10 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr2 [ 1, 10] *
[2] chr2 [ 2, 10] *
[3] chr2 [ 3, 10] *
[4] chr2 [ 4, 10] *
[5] chr1 [ 5, 10] *
[6] chr1 [ 6, 10] *
[7] chr3 [ 7, 10] *
[8] chr3 [ 8, 10] *
[9] chr3 [ 9, 10] *
[10] chr3 [10, 10] *
-------
seqinfo: 3 sequences from an unspecified genome; no seqlengths
>
> ## Specifying names, strand, metadata columns. They can be set on an
> ## existing object:
> names(gr0) <- head(letters, 10)
> strand(gr0) <- Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2))
> mcols(gr0)$score <- 1:10
> mcols(gr0)$GC <- seq(1, 0, length=10)
> gr0
GRanges object with 10 ranges and 2 metadata columns:
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
a chr2 [ 1, 10] - | 1 1
b chr2 [ 2, 10] + | 2 0.888888888888889
c chr2 [ 3, 10] + | 3 0.777777777777778
d chr2 [ 4, 10] * | 4 0.666666666666667
e chr1 [ 5, 10] * | 5 0.555555555555556
f chr1 [ 6, 10] + | 6 0.444444444444444
g chr3 [ 7, 10] + | 7 0.333333333333333
h chr3 [ 8, 10] + | 8 0.222222222222222
i chr3 [ 9, 10] - | 9 0.111111111111111
j chr3 [10, 10] - | 10 0
-------
seqinfo: 3 sequences from an unspecified genome; no seqlengths
>
> ## ... or specified at construction time:
> gr <- GRanges(Rle(c("chr2", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
+ IRanges(1:10, width=10:1, names=head(letters, 10)),
+ Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)),
+ score=1:10, GC=seq(1, 0, length=10))
> stopifnot(identical(gr0, gr))
>
> ## Specifying the seqinfo. It can be set on an existing object:
> seqinfo <- Seqinfo(paste0("chr", 1:3), c(1000, 2000, 1500), NA, "mock1")
> seqinfo(gr0) <- merge(seqinfo(gr0), seqinfo)
> seqlevels(gr0) <- seqlevels(seqinfo)
>
> ## ... or specified at construction time:
> gr <- GRanges(Rle(c("chr2", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
+ IRanges(1:10, width=10:1, names=head(letters, 10)),
+ Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)),
+ score=1:10, GC=seq(1, 0, length=10),
+ seqinfo=seqinfo)
> stopifnot(identical(gr0, gr))
>
> ## ---------------------------------------------------------------------
> ## COERCION
> ## ---------------------------------------------------------------------
> ## From GRanges:
> as.character(gr)
a b c d e
"chr2:1-10:-" "chr2:2-10:+" "chr2:3-10:+" "chr2:4-10:*" "chr1:5-10:*"
f g h i j
"chr1:6-10:+" "chr3:7-10:+" "chr3:8-10:+" "chr3:9-10:-" "chr3:10-10:-"
> as.factor(gr)
a b c d e f
chr2:1-10:- chr2:2-10:+ chr2:3-10:+ chr2:4-10:* chr1:5-10:* chr1:6-10:+
g h i j
chr3:7-10:+ chr3:8-10:+ chr3:9-10:- chr3:10-10:-
10 Levels: chr1:6-10:+ chr1:5-10:* chr2:2-10:+ chr2:3-10:+ ... chr3:10-10:-
> as.data.frame(gr)
seqnames start end width strand score GC
a chr2 1 10 10 - 1 1.0000000
b chr2 2 10 9 + 2 0.8888889
c chr2 3 10 8 + 3 0.7777778
d chr2 4 10 7 * 4 0.6666667
e chr1 5 10 6 * 5 0.5555556
f chr1 6 10 5 + 6 0.4444444
g chr3 7 10 4 + 7 0.3333333
h chr3 8 10 3 + 8 0.2222222
i chr3 9 10 2 - 9 0.1111111
j chr3 10 10 1 - 10 0.0000000
>
> ## From character to GRanges:
> x1 <- "chr2:56-125"
> as(x1, "GRanges")
GRanges object with 1 range and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr2 [56, 125] *
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> as(rep(x1, 4), "GRanges")
GRanges object with 4 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr2 [56, 125] *
[2] chr2 [56, 125] *
[3] chr2 [56, 125] *
[4] chr2 [56, 125] *
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> x2 <- c(A=x1, B="chr1:25-30:-")
> as(x2, "GRanges")
GRanges object with 2 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
A chr2 [56, 125] *
B chr1 [25, 30] -
-------
seqinfo: 2 sequences from an unspecified genome; no seqlengths
>
> ## From data.frame to GRanges:
> df <- data.frame(chrom="chr2", start=11:15, end=20:24)
> gr3 <- as(df, "GRanges")
>
> ## Alternatively, coercion to GRanges can be done by just calling the
> ## GRanges() constructor on the object to coerce:
> gr1 <- GRanges(x1) # same as as(x1, "GRanges")
> gr2 <- GRanges(x2) # same as as(x2, "GRanges")
> gr3 <- GRanges(df) # same as as(df, "GRanges")
>
> ## Sanity checks:
> stopifnot(identical(as(x1, "GRanges"), gr1))
> stopifnot(identical(as(x2, "GRanges"), gr2))
> stopifnot(identical(as(df, "GRanges"), gr3))
>
> ## ---------------------------------------------------------------------
> ## SUMMARIZING ELEMENTS
> ## ---------------------------------------------------------------------
> table(seqnames(gr))
chr1 chr2 chr3
2 4 4
> table(strand(gr))
+ - *
5 3 2
> sum(width(gr))
[1] 55
> table(gr)
gr
chr1:6-10:+ chr1:5-10:* chr2:2-10:+ chr2:3-10:+ chr2:1-10:- chr2:4-10:*
1 1 1 1 1 1
chr3:7-10:+ chr3:8-10:+ chr3:9-10:- chr3:10-10:-
1 1 1 1
> summary(mcols(gr)[,"score"])
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.00 3.25 5.50 5.50 7.75 10.00
>
> ## The number of lines displayed in the 'show' method are controlled
> ## with two global options:
> longGR <- sample(gr, 25, replace=TRUE)
> longGR
GRanges object with 25 ranges and 2 metadata columns:
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
h chr3 [ 8, 10] + | 8 0.222222222222222
d chr2 [ 4, 10] * | 4 0.666666666666667
j chr3 [10, 10] - | 10 0
d chr2 [ 4, 10] * | 4 0.666666666666667
j chr3 [10, 10] - | 10 0
. ... ... ... . ... ...
f chr1 [ 6, 10] + | 6 0.444444444444444
e chr1 [ 5, 10] * | 5 0.555555555555556
e chr1 [ 5, 10] * | 5 0.555555555555556
i chr3 [ 9, 10] - | 9 0.111111111111111
j chr3 [10, 10] - | 10 0
-------
seqinfo: 3 sequences from mock1 genome
> options(showHeadLines=7)
> options(showTailLines=2)
> longGR
GRanges object with 25 ranges and 2 metadata columns:
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
h chr3 [ 8, 10] + | 8 0.222222222222222
d chr2 [ 4, 10] * | 4 0.666666666666667
j chr3 [10, 10] - | 10 0
d chr2 [ 4, 10] * | 4 0.666666666666667
j chr3 [10, 10] - | 10 0
f chr1 [ 6, 10] + | 6 0.444444444444444
h chr3 [ 8, 10] + | 8 0.222222222222222
. ... ... ... . ... ...
i chr3 [ 9, 10] - | 9 0.111111111111111
j chr3 [10, 10] - | 10 0
-------
seqinfo: 3 sequences from mock1 genome
>
> ## Revert to default values
> options(showHeadLines=NULL)
> options(showTailLines=NULL)
>
> ## ---------------------------------------------------------------------
> ## INVERTING THE STRAND
> ## ---------------------------------------------------------------------
> invertStrand(gr)
GRanges object with 10 ranges and 2 metadata columns:
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
a chr2 [ 1, 10] + | 1 1
b chr2 [ 2, 10] - | 2 0.888888888888889
c chr2 [ 3, 10] - | 3 0.777777777777778
d chr2 [ 4, 10] * | 4 0.666666666666667
e chr1 [ 5, 10] * | 5 0.555555555555556
f chr1 [ 6, 10] - | 6 0.444444444444444
g chr3 [ 7, 10] - | 7 0.333333333333333
h chr3 [ 8, 10] - | 8 0.222222222222222
i chr3 [ 9, 10] + | 9 0.111111111111111
j chr3 [10, 10] + | 10 0
-------
seqinfo: 3 sequences from mock1 genome
>
> ## ---------------------------------------------------------------------
> ## RENAMING THE UNDERLYING SEQUENCES
> ## ---------------------------------------------------------------------
> seqlevels(gr)
[1] "chr1" "chr2" "chr3"
> seqlevels(gr) <- sub("chr", "Chrom", seqlevels(gr))
> gr
GRanges object with 10 ranges and 2 metadata columns:
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
a Chrom2 [ 1, 10] - | 1 1
b Chrom2 [ 2, 10] + | 2 0.888888888888889
c Chrom2 [ 3, 10] + | 3 0.777777777777778
d Chrom2 [ 4, 10] * | 4 0.666666666666667
e Chrom1 [ 5, 10] * | 5 0.555555555555556
f Chrom1 [ 6, 10] + | 6 0.444444444444444
g Chrom3 [ 7, 10] + | 7 0.333333333333333
h Chrom3 [ 8, 10] + | 8 0.222222222222222
i Chrom3 [ 9, 10] - | 9 0.111111111111111
j Chrom3 [10, 10] - | 10 0
-------
seqinfo: 3 sequences from mock1 genome
> seqlevels(gr) <- sub("Chrom", "chr", seqlevels(gr)) # revert
>
> ## ---------------------------------------------------------------------
> ## COMBINING OBJECTS
> ## ---------------------------------------------------------------------
> gr2 <- GRanges(seqnames=Rle(c('chr1', 'chr2', 'chr3'), c(3, 3, 4)),
+ IRanges(1:10, width=5),
+ strand='-',
+ score=101:110, GC=runif(10),
+ seqinfo=seqinfo)
> gr3 <- GRanges(seqnames=Rle(c('chr1', 'chr2', 'chr3'), c(3, 4, 3)),
+ IRanges(101:110, width=10),
+ strand='-',
+ score=21:30,
+ seqinfo=seqinfo)
> some.gr <- c(gr, gr2)
>
> ## all.gr <- c(gr, gr2, gr3) ## (This would fail)
> all.gr <- c(gr, gr2, gr3, ignore.mcols=TRUE)
>
> ## ---------------------------------------------------------------------
> ## USING A GRANGES OBJECT AS A SUBSCRIPT TO SUBSET ANOTHER OBJECT
> ## ---------------------------------------------------------------------
> ## Subsetting *by* a GRanges subscript is supported only if the object
> ## to subset is a named list-like object:
> x <- RleList(chr1=101:120, chr2=2:-8, chr3=31:40)
> x[gr]
RleList of length 10
$chr2
integer-Rle of length 10 with 10 runs
Lengths: 1 1 1 1 1 1 1 1 1 1
Values : 2 1 0 -1 -2 -3 -4 -5 -6 -7
$chr2
integer-Rle of length 9 with 9 runs
Lengths: 1 1 1 1 1 1 1 1 1
Values : 1 0 -1 -2 -3 -4 -5 -6 -7
$chr2
integer-Rle of length 8 with 8 runs
Lengths: 1 1 1 1 1 1 1 1
Values : 0 -1 -2 -3 -4 -5 -6 -7
$chr2
integer-Rle of length 7 with 7 runs
Lengths: 1 1 1 1 1 1 1
Values : -1 -2 -3 -4 -5 -6 -7
$chr1
integer-Rle of length 6 with 6 runs
Lengths: 1 1 1 1 1 1
Values : 105 106 107 108 109 110
...
<5 more elements>
>
>
>
>
>
> dev.off()
null device
1
>