A set of generic functions for getting/setting/modifying the sequence
information stored in an object.
Usage
seqinfo(x)
seqinfo(x, new2old=NULL, force=FALSE) <- value
seqnames(x)
seqnames(x) <- value
seqlevels(x)
seqlevels(x, force=FALSE) <- value
sortSeqlevels(x, X.is.sexchrom=NA)
seqlevelsInUse(x)
seqlevels0(x)
seqlengths(x)
seqlengths(x) <- value
isCircular(x)
isCircular(x) <- value
genome(x)
genome(x) <- value
Arguments
x
The object from/on which to get/set the sequence information.
new2old
The new2old argument allows the user to rename, drop, add and/or
reorder the "sequence levels" in x.
new2old can be NULL or an integer vector with one element
per row in the Seqinfo object value (i.e. new2old and
value must have the same length) describing how the "new" sequence
levels should be mapped to the "old" sequence levels, that is, how the
rows in value should be mapped to the rows in seqinfo(x).
The values in new2old must be >= 1 and <= length(seqinfo(x)).
NAs are allowed and indicate sequence levels that are being added.
Old sequence levels that are not represented in new2old will be
dropped, but this will fail if those levels are in use (e.g. if x
is a GRanges object with ranges defined on those
sequence levels) unless force=TRUE is used (see below).
If new2old=NULL, then sequence levels can only be added to the
existing ones, that is, value must have at least as many rows
as seqinfo(x) (i.e. length(value) >= length(seqinfo(x)))
and also seqlevels(value)[seq_len(length(seqlevels(x)))] must be
identical to seqlevels(x).
force
Force dropping sequence levels currently in use. This is achieved by
dropping the elements in x where those levels are used (hence
typically reducing the length of x).
Note that if x is a list-like object (e.g.
GRangesList,
GAlignmentPairs, or
GAlignmentsList), then any list element in
x where at least one of the sequence levels to drop is used is
fully dropped. In other words, the seqlevels setter always
keeps or drops full list elements and never tries to change their
content. This guarantees that the geometry of the list elements is
preserved, which is a desirable property when they represent compound
features (e.g. exons grouped by transcript or paired-end reads).
See below for an example.
value
Typically a Seqinfo object for the seqinfo setter.
Either a named or unnamed character vector for the seqlevels
setter.
A vector containing the sequence information to store for the other
setters.
X.is.sexchrom
A logical indicating whether X refers to the sex chromosome
or to the chromosome with Roman numeral X. If NA, sortSeqlevels
does its best to "guess".
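A minimal sketch of the new2old mapping described above, assuming the GenomicRanges package is installed. The object gr and the level names "I", "II" and "MT" are made up for illustration. The first two "new" levels are mapped to the old ones in swapped order, and the NA marks a level that is being added:

```r
library(GenomicRanges)  # attaches GenomeInfoDb as a dependency

## Toy object with two sequence levels, both in use.
gr <- GRanges(c("chr1", "chr2", "chr1"), IRanges(1:3, 10))
seqlevels(gr)

## Replacement Seqinfo with 3 rows, so new2old must have length 3.
new_si <- Seqinfo(c("II", "I", "MT"))

## Row 1 of new_si ("II") takes over old level 2 ("chr2"),
## row 2 ("I") takes over old level 1 ("chr1"),
## row 3 ("MT") is a newly added level (NA).
seqinfo(gr, new2old=c(2L, 1L, NA)) <- new_si

seqlevels(gr)
seqnames(gr)
```

Because both old levels are represented in new2old, nothing is dropped here and force is not needed.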
Details
The Seqinfo class plays a central role for the functions described
in this man page because:
All these functions (except seqinfo, seqlevelsInUse,
and seqlevels0) work on a Seqinfo object.
For classes that implement it, the seqinfo getter should
return a Seqinfo object.
Default seqlevels, seqlengths, isCircular,
and genome getters and setters are provided.
By default, seqlevels(x) does seqlevels(seqinfo(x)),
seqlengths(x) does seqlengths(seqinfo(x)),
isCircular(x) does isCircular(seqinfo(x)),
and genome(x) does genome(seqinfo(x)).
So any class with a seqinfo getter will have all the above
getters work out-of-the-box. If, in addition, the class defines
a seqinfo setter, then all the corresponding setters will
also work out-of-the-box.
Examples of containers that have a seqinfo getter and setter:
the GRanges, GRangesList,
and SummarizedExperiment classes in the
GenomicRanges package;
the GAlignments,
GAlignmentPairs,
and GAlignmentsList classes in the
GenomicAlignments package;
the TxDb class in the
GenomicFeatures package;
the BSgenome class in the
BSgenome package; etc.
The GenomicRanges package defines seqinfo and
seqinfo<- methods for these low-level data types:
List, RangesList and RangedData. Those
objects do not have the means to formally store sequence
information. Thus, the wrappers simply store the Seqinfo
object within metadata(x). Initially, the metadata
is empty, so the getter makes some effort to generate a reasonable
default Seqinfo: the names of a List are
taken as the seqnames, and the universe of a
RangesList or RangedData is taken as the
genome.
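The delegation of the default getters and setters to seqinfo, as described in this section, can be checked on a small object. This is only a sketch, assuming the GenomicRanges package is installed. The sequence name, length and genome label are toy values:

```r
library(GenomicRanges)  # attaches GenomeInfoDb as a dependency

## Toy Seqinfo with one sequence and made-up values.
si <- Seqinfo("chr1", seqlengths=1000, isCircular=FALSE, genome="toy")
gr <- GRanges("chr1", IRanges(1, 10), seqinfo=si)

## Each default getter returns what the same getter returns on seqinfo(gr).
same <- c(
    seqlevels  = identical(seqlevels(gr),  seqlevels(seqinfo(gr))),
    seqlengths = identical(seqlengths(gr), seqlengths(seqinfo(gr))),
    isCircular = identical(isCircular(gr), isCircular(seqinfo(gr))),
    genome     = identical(genome(gr),     genome(seqinfo(gr)))
)
same

## GRanges also has a seqinfo setter, so the default setters work too:
## the change made through genome<- is visible in seqinfo(gr).
genome(gr) <- "toy2"
genome(seqinfo(gr))
```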
Note
The full list of methods defined for a given generic can
be seen with e.g. showMethods("seqinfo") or
showMethods("seqnames") (for the getters),
and showMethods("seqinfo<-") or showMethods("seqnames<-")
(for the setters aka replacement methods).
Please be aware that this shows only methods defined in packages
that are currently attached.
Author(s)
H. Pages
See Also
The seqlevelsStyle generic getter and setter.
Seqinfo objects.
GRanges, GRangesList,
and SummarizedExperiment objects in the
GenomicRanges package.
GAlignments,
GAlignmentPairs,
and GAlignmentsList objects in the
GenomicAlignments package.
TxDb objects in the
GenomicFeatures package.
BSgenome objects in the BSgenome package.
seqlevels-wrappers for convenience wrappers to the
seqlevels getter and setter.
rankSeqlevels, on which sortSeqlevels is
based.
Examples
## ---------------------------------------------------------------------
## A. MODIFY THE SEQLEVELS OF AN OBJECT
## ---------------------------------------------------------------------
## Overlap and matching operations between objects require matching
## seqlevels. Often the seqlevels in one must be modified to match
## the other. The seqlevels() function can rename, drop, add and reorder
## seqlevels of an object. Examples below are shown on TxDb
## and GRanges but the approach is the same for all objects that have
## a 'Seqinfo' class.
library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)
txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene
seqlevels(txdb)
## Rename:
seqlevels(txdb) <- sub("chr", "", seqlevels(txdb))
seqlevels(txdb)
seqlevels(txdb) <- paste0("CH", seqlevels(txdb))
seqlevels(txdb)
seqlevels(txdb)[seqlevels(txdb) == "CHM"] <- "M"
seqlevels(txdb)
gr <- GRanges(rep(c("chr2", "chr3", "chrM"), 2), IRanges(1:6, 10))
## Add:
seqlevels(gr) <- c("chr1", seqlevels(gr), "chr4")
seqlevels(gr)
seqlevelsInUse(gr)
## Reorder:
seqlevels(gr) <- rev(seqlevels(gr))
seqlevels(gr)
## Drop all unused seqlevels:
seqlevels(gr) <- seqlevelsInUse(gr)
## Drop some seqlevels in use:
seqlevels(gr, force=TRUE) <- setdiff(seqlevels(gr), "chr3")
gr
## Rename/Add/Reorder:
seqlevels(gr) <- c("chr1", chr2="chr2", chrM="Mitochondrion")
seqlevels(gr)
## ---------------------------------------------------------------------
## B. DROP SEQLEVELS OF A LIST-LIKE OBJECT
## ---------------------------------------------------------------------
grl0 <- GRangesList(GRanges("chr2", IRanges(3:2, 5)),
GRanges("chr5", IRanges(11, 18)),
GRanges(c("chr4", "chr2"), IRanges(7:6, 15)))
grl0
grl1 <- grl0
seqlevels(grl1, force=TRUE) <- c("chr2", "chr5")
grl1 # grl0[[3]] was fully dropped!
## If what is desired is to drop the first range in grl0[[3]] only, or,
## more generally speaking, to drop the ranges within each list element
## that are located on one of the seqlevels to drop, then do:
grl2 <- grl0[seqnames(grl0) %in% c("chr2", "chr5")]
grl2
## Note that the above subsetting doesn't drop any seqlevel:
seqlevels(grl2)
## To drop them (no need to use 'force=TRUE' anymore):
seqlevels(grl2) <- c("chr2", "chr5")
seqlevels(grl2)
## ---------------------------------------------------------------------
## C. SORT SEQLEVELS IN "NATURAL" ORDER
## ---------------------------------------------------------------------
sortSeqlevels(c("11", "Y", "1", "10", "9", "M", "2"))
seqlevels <- c("chrXI", "chrY", "chrI", "chrX", "chrIX", "chrM", "chrII")
sortSeqlevels(seqlevels)
sortSeqlevels(seqlevels, X.is.sexchrom=TRUE)
sortSeqlevels(seqlevels, X.is.sexchrom=FALSE)
seqlevels <- c("chr2RHet", "chr4", "chrUextra", "chrYHet",
"chrM", "chrXHet", "chr2LHet", "chrU",
"chr3L", "chr3R", "chr2R", "chrX")
sortSeqlevels(seqlevels)
gr <- GRanges()
seqlevels(gr) <- seqlevels
sortSeqlevels(gr)
## ---------------------------------------------------------------------
## D. SUBSET OBJECTS BY SEQLEVELS
## ---------------------------------------------------------------------
tx <- transcripts(txdb)
seqlevels(tx)
## Drop 'M', keep all others.
seqlevels(tx, force=TRUE) <- seqlevels(tx)[seqlevels(tx) != "M"]
seqlevels(tx)
## Drop all except 'CH3L' and 'CH3R'.
seqlevels(tx, force=TRUE) <- c("CH3L", "CH3R")
seqlevels(tx)
## ---------------------------------------------------------------------
## E. RESTORE ORIGINAL SEQLEVELS OF A TxDb OBJECT
## ---------------------------------------------------------------------
## Applicable to TxDb objects only.
## Not run:
seqlevels(txdb) <- seqlevels0(txdb)
seqlevels(txdb)
## End(Not run)
## ---------------------------------------------------------------------
## F. FINDING METHODS
## ---------------------------------------------------------------------
showMethods("seqinfo")
showMethods("seqinfo<-")
showMethods("seqnames")
showMethods("seqnames<-")
showMethods("seqlevels")
showMethods("seqlevels<-")
if (interactive()) {
library(GenomicRanges)
?`GRanges-class`
}
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(GenomeInfoDb)
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/GenomeInfoDb/seqinfo.Rd_%03d_medium.png", width=480, height=480)
> ### Name: seqinfo
> ### Title: Accessing/modifying sequence information
> ### Aliases: seqinfo seqinfo<- seqnames seqnames<- seqlevels
> ### seqlevels,ANY-method seqlevels<- seqlevels<-,ANY-method sortSeqlevels
> ### sortSeqlevels,character-method sortSeqlevels,ANY-method
> ### seqlevelsInUse seqlevelsInUse,Vector-method
> ### seqlevelsInUse,CompressedList-method seqlevels0 seqlengths
> ### seqlengths,ANY-method seqlengths<- seqlengths<-,ANY-method isCircular
> ### isCircular,ANY-method isCircular<- isCircular<-,ANY-method genome
> ### genome,ANY-method genome<- genome<-,ANY-method
> ### Keywords: methods
>
> ### ** Examples
>
> ## ---------------------------------------------------------------------
> ## A. MODIFY THE SEQLEVELS OF AN OBJECT
> ## ---------------------------------------------------------------------
> ## Overlap and matching operations between objects require matching
> ## seqlevels. Often the seqlevels in one must be modified to match
> ## the other. The seqlevels() function can rename, drop, add and reorder
> ## seqlevels of an object. Examples below are shown on TxDb
> ## and GRanges but the approach is the same for all objects that have
> ## a 'Seqinfo' class.
>
> library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)
Loading required package: GenomicFeatures
Loading required package: GenomicRanges
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
> txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene
> seqlevels(txdb)
[1] "chr2L" "chr2R" "chr3L" "chr3R" "chr4" "chrX"
[7] "chrU" "chrM" "chr2LHet" "chr2RHet" "chr3LHet" "chr3RHet"
[13] "chrXHet" "chrYHet" "chrUextra"
>
> ## Rename:
> seqlevels(txdb) <- sub("chr", "", seqlevels(txdb))
> seqlevels(txdb)
[1] "2L" "2R" "3L" "3R" "4" "X" "U" "M"
[9] "2LHet" "2RHet" "3LHet" "3RHet" "XHet" "YHet" "Uextra"
>
> seqlevels(txdb) <- paste0("CH", seqlevels(txdb))
> seqlevels(txdb)
[1] "CH2L" "CH2R" "CH3L" "CH3R" "CH4" "CHX"
[7] "CHU" "CHM" "CH2LHet" "CH2RHet" "CH3LHet" "CH3RHet"
[13] "CHXHet" "CHYHet" "CHUextra"
>
> seqlevels(txdb)[seqlevels(txdb) == "CHM"] <- "M"
> seqlevels(txdb)
[1] "CH2L" "CH2R" "CH3L" "CH3R" "CH4" "CHX"
[7] "CHU" "M" "CH2LHet" "CH2RHet" "CH3LHet" "CH3RHet"
[13] "CHXHet" "CHYHet" "CHUextra"
>
> gr <- GRanges(rep(c("chr2", "chr3", "chrM"), 2), IRanges(1:6, 10))
>
> ## Add:
> seqlevels(gr) <- c("chr1", seqlevels(gr), "chr4")
> seqlevels(gr)
[1] "chr1" "chr2" "chr3" "chrM" "chr4"
> seqlevelsInUse(gr)
[1] "chr2" "chr3" "chrM"
>
> ## Reorder:
> seqlevels(gr) <- rev(seqlevels(gr))
> seqlevels(gr)
[1] "chr4" "chrM" "chr3" "chr2" "chr1"
>
> ## Drop all unused seqlevels:
> seqlevels(gr) <- seqlevelsInUse(gr)
>
> ## Drop some seqlevels in use:
> seqlevels(gr, force=TRUE) <- setdiff(seqlevels(gr), "chr3")
> gr
GRanges object with 4 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr2 [1, 10] *
[2] chrM [3, 10] *
[3] chr2 [4, 10] *
[4] chrM [6, 10] *
-------
seqinfo: 2 sequences from an unspecified genome; no seqlengths
>
> ## Rename/Add/Reorder:
> seqlevels(gr) <- c("chr1", chr2="chr2", chrM="Mitochondrion")
> seqlevels(gr)
[1] "chr1" "chr2" "Mitochondrion"
>
> ## ---------------------------------------------------------------------
> ## B. DROP SEQLEVELS OF A LIST-LIKE OBJECT
> ## ---------------------------------------------------------------------
>
> grl0 <- GRangesList(GRanges("chr2", IRanges(3:2, 5)),
+ GRanges("chr5", IRanges(11, 18)),
+ GRanges(c("chr4", "chr2"), IRanges(7:6, 15)))
> grl0
GRangesList object of length 3:
[[1]]
GRanges object with 2 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr2 [3, 5] *
[2] chr2 [2, 5] *
[[2]]
GRanges object with 1 range and 0 metadata columns:
seqnames ranges strand
[1] chr5 [11, 18] *
[[3]]
GRanges object with 2 ranges and 0 metadata columns:
seqnames ranges strand
[1] chr4 [7, 15] *
[2] chr2 [6, 15] *
-------
seqinfo: 3 sequences from an unspecified genome; no seqlengths
>
> grl1 <- grl0
> seqlevels(grl1, force=TRUE) <- c("chr2", "chr5")
> grl1 # grl0[[3]] was fully dropped!
GRangesList object of length 2:
[[1]]
GRanges object with 2 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr2 [3, 5] *
[2] chr2 [2, 5] *
[[2]]
GRanges object with 1 range and 0 metadata columns:
seqnames ranges strand
[1] chr5 [11, 18] *
-------
seqinfo: 2 sequences from an unspecified genome; no seqlengths
>
> ## If what is desired is to drop the first range in grl0[[3]] only, or,
> ## more generally speaking, to drop the ranges within each list element
> ## that are located on one of the seqlevels to drop, then do:
> grl2 <- grl0[seqnames(grl0) %in% c("chr2", "chr5")]
> grl2
GRangesList object of length 3:
[[1]]
GRanges object with 2 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr2 [3, 5] *
[2] chr2 [2, 5] *
[[2]]
GRanges object with 1 range and 0 metadata columns:
seqnames ranges strand
[1] chr5 [11, 18] *
[[3]]
GRanges object with 1 range and 0 metadata columns:
seqnames ranges strand
[1] chr2 [6, 15] *
-------
seqinfo: 3 sequences from an unspecified genome; no seqlengths
>
> ## Note that the above subsetting doesn't drop any seqlevel:
> seqlevels(grl2)
[1] "chr2" "chr5" "chr4"
>
> ## To drop them (no need to use 'force=TRUE' anymore):
> seqlevels(grl2) <- c("chr2", "chr5")
> seqlevels(grl2)
[1] "chr2" "chr5"
>
> ## ---------------------------------------------------------------------
> ## C. SORT SEQLEVELS IN "NATURAL" ORDER
> ## ---------------------------------------------------------------------
>
> sortSeqlevels(c("11", "Y", "1", "10", "9", "M", "2"))
[1] "1" "2" "9" "10" "11" "Y" "M"
>
> seqlevels <- c("chrXI", "chrY", "chrI", "chrX", "chrIX", "chrM", "chrII")
> sortSeqlevels(seqlevels)
[1] "chrI" "chrII" "chrIX" "chrXI" "chrX" "chrY" "chrM"
> sortSeqlevels(seqlevels, X.is.sexchrom=TRUE)
[1] "chrI" "chrII" "chrIX" "chrXI" "chrX" "chrY" "chrM"
> sortSeqlevels(seqlevels, X.is.sexchrom=FALSE)
[1] "chrI" "chrII" "chrIX" "chrX" "chrXI" "chrY" "chrM"
>
> seqlevels <- c("chr2RHet", "chr4", "chrUextra", "chrYHet",
+ "chrM", "chrXHet", "chr2LHet", "chrU",
+ "chr3L", "chr3R", "chr2R", "chrX")
> sortSeqlevels(seqlevels)
[1] "chr2R" "chr3L" "chr3R" "chr4" "chrX" "chrU"
[7] "chrM" "chr2LHet" "chr2RHet" "chrXHet" "chrYHet" "chrUextra"
>
> gr <- GRanges()
> seqlevels(gr) <- seqlevels
> sortSeqlevels(gr)
GRanges object with 0 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
-------
seqinfo: 12 sequences from an unspecified genome; no seqlengths
>
> ## ---------------------------------------------------------------------
> ## D. SUBSET OBJECTS BY SEQLEVELS
> ## ---------------------------------------------------------------------
>
> tx <- transcripts(txdb)
> seqlevels(tx)
[1] "CH2L" "CH2R" "CH3L" "CH3R" "CH4" "CHX"
[7] "CHU" "M" "CH2LHet" "CH2RHet" "CH3LHet" "CH3RHet"
[13] "CHXHet" "CHYHet" "CHUextra"
>
> ## Drop 'M', keep all others.
> seqlevels(tx, force=TRUE) <- seqlevels(tx)[seqlevels(tx) != "M"]
> seqlevels(tx)
[1] "CH2L" "CH2R" "CH3L" "CH3R" "CH4" "CHX"
[7] "CHU" "CH2LHet" "CH2RHet" "CH3LHet" "CH3RHet" "CHXHet"
[13] "CHYHet" "CHUextra"
>
> ## Drop all except 'ch3L' and 'ch3R'.
> seqlevels(tx, force=TRUE) <- c("ch3L", "ch3R")
> seqlevels(tx)
[1] "ch3L" "ch3R"
>
> ## ---------------------------------------------------------------------
> ## E. RESTORE ORIGINAL SEQLEVELS OF A TxDb OBJECT
> ## ---------------------------------------------------------------------
>
> ## Applicable to TxDb objects only.
> ## Not run:
> ##D seqlevels(txdb) <- seqlevels0(txdb)
> ##D seqlevels(txdb)
> ## End(Not run)
>
> ## ---------------------------------------------------------------------
> ## F. FINDING METHODS
> ## ---------------------------------------------------------------------
>
> showMethods("seqinfo")
Function: seqinfo (package GenomeInfoDb)
x="BamFile"
x="BamFileList"
x="BigWigFile"
x="DNAStringSet"
x="DelegatingGenomicRanges"
x="FaFile"
x="GAlignmentPairs"
x="GAlignments"
x="GAlignmentsList"
x="GNCList"
x="GPos"
x="GRanges"
x="GRangesList"
x="GenomeDescription"
x="List"
x="QuickloadGenome"
x="RangedData"
x="RangedSummarizedExperiment"
x="RangesList"
x="TwoBitFile"
x="TxDb"
x="UCSCSession"
> showMethods("seqinfo<-")
Function: seqinfo<- (package GenomeInfoDb)
x="GAlignmentPairs"
x="GAlignments"
x="GAlignmentsList"
x="GPos"
x="GRanges"
(inherited from: x="GenomicRanges")
x="GRangesList"
x="GenomicRanges"
x="List"
x="QuickloadGenome"
x="RangedData"
x="RangedSummarizedExperiment"
>
> showMethods("seqnames")
Function: seqnames (package GenomeInfoDb)
x="DelegatingGenomicRanges"
x="GAlignmentPairs"
x="GAlignments"
x="GAlignmentsList"
x="GNCList"
x="GPos"
x="GRanges"
x="GRangesList"
x="GenomeDescription"
x="RangedData"
x="RangedSummarizedExperiment"
x="Seqinfo"
x="UCSCSession"
> showMethods("seqnames<-")
Function: seqnames<- (package GenomeInfoDb)
x="GAlignments"
x="GAlignmentsList"
x="GRangesList"
x="GenomicRanges"
x="Seqinfo"
>
> showMethods("seqlevels")
Function: seqlevels (package GenomeInfoDb)
x="ANY"
x="GRanges"
(inherited from: x="ANY")
x="GRangesList"
(inherited from: x="ANY")
x="Seqinfo"
x="TxDb"
(inherited from: x="ANY")
> showMethods("seqlevels<-")
Function: seqlevels<- (package GenomeInfoDb)
x="ANY"
x="GRanges"
(inherited from: x="ANY")
x="GRangesList"
(inherited from: x="ANY")
x="Seqinfo"
x="TxDb"
>
> #if (interactive()) {
> library(GenomicRanges)
> ?`GRanges-class`
GRanges-class package:GenomicRanges R Documentation
GRanges objects
Description:
The GRanges class is a container for the genomic locations and
their associated annotations.
Details:
GRanges is a vector of genomic locations and associated
annotations. Each element in the vector is comprised of a sequence
name, an interval, a strand, and optional metadata columns (e.g.
score, GC content, etc.). This information is stored in four
components:
'seqnames' a 'factor' Rle object containing the sequence names.
'ranges' an IRanges object containing the ranges.
'strand' a 'factor' Rle object containing the strand information.
'mcols' a DataFrame object containing the metadata columns.
Columns cannot be named '"seqnames"', '"ranges"', '"strand"',
'"seqlevels"', '"seqlengths"', '"isCircular"', '"start"',
'"end"', '"width"', or '"element"'.
'seqinfo' a Seqinfo object containing information about the set of
genomic sequences present in the GRanges object.
Constructor:
'GRanges(seqnames=NULL, ranges=NULL, strand=NULL, ..., seqlengths=NULL,
seqinfo=NULL)': Creates a GRanges object.
'seqnames' 'NULL', or an Rle object, character vector, or
factor containing the sequence names.
'ranges' 'NULL', or an IRanges object containing the ranges.
'strand' 'NULL', or an Rle object, character vector, or
factor containing the strand information.
'...' Optional metadata columns. These columns cannot be
named '"start"', '"end"', '"width"', or '"element"'.
'seqlengths' 'NULL', or an integer vector named with
'levels(seqnames)' and containing the lengths (or NA) for
each level in 'levels(seqnames)'.
'seqinfo' 'NULL', or a Seqinfo object containing allowed
sequence names, lengths (or NA), and circularity flag,
for each level in 'levels(seqnames)'.
If 'ranges' is not supplied and/or NULL then the constructor
proceeds in 2 steps:
1. An initial GRanges object is created with 'as(seqnames,
"GRanges")'.
2. Then this GRanges object is updated according to whatever
non-NULL remaining arguments were passed to the call to
'GRanges()'.
As a consequence of this behavior, 'GRanges(x)' is equivalent
to 'as(x, "GRanges")'.
Coercion:
In the code snippets below, 'x' is a GRanges object.
'as(from, "GRanges")': Creates a GRanges object from a character
vector, a factor, or a RangedData, or RangesList object.
When 'from' is a character vector (or a factor), each element
in it must represent a genomic range in format
'chr1:2501-2800' (unstranded range) or 'chr1:2501-2800:+'
(stranded range). '..' is also supported as a separator
between the start and end positions. Strand can be '+', '-',
'*', or missing. The names on 'from' are propagated to the
returned GRanges object. See 'as.character()' and
'as.factor()' below for the reverse transformations.
Coercing a data.frame or DataFrame into a GRanges object is
also supported. See 'makeGRangesFromDataFrame' for the
details.
'as(from, "RangedData")': Creates a RangedData object from a GRanges
object. The 'strand' and metadata columns become columns in
the result. The 'seqlengths(from)', 'isCircular(from)', and
'genome(from)' vectors are stored in the metadata columns of
'ranges(rd)'.
'as(from, "RangesList")': Creates a RangesList object from a GRanges
object. The 'strand' and metadata columns become _inner_
metadata columns (i.e. metadata columns on the ranges). The
'seqlengths(from)', 'isCircular(from)', and 'genome(from)'
vectors become the metadata columns.
'as.character(x, ignore.strand=FALSE)': Turn GRanges object 'x' into a
character vector where each range in 'x' is represented by a
string in format 'chr1:2501-2800:+'. If 'ignore.strand' is
TRUE or if _all_ the ranges in 'x' are unstranded (i.e. their
strand is set to '*'), then all the strings in the output are
in format 'chr1:2501-2800'.
The names on 'x' are propagated to the returned character
vector. Its metadata ('metadata(x)') and metadata columns
('mcols(x)') are ignored.
See 'as(from, "GRanges")' above for the reverse
transformation.
'as.factor(x)': Equivalent to
factor(as.character(x), levels=as.character(sort(unique(x))))
See 'as(from, "GRanges")' above for the reverse
transformation.
Note that 'table(x)' is supported on a GRanges object. It is
equivalent to, but much faster than, 'table(as.factor(x))'.
'as.data.frame(x, row.names = NULL, optional = FALSE, ...)': Creates a
data.frame with columns 'seqnames' (factor), 'start'
(integer), 'end' (integer), 'width' (integer), 'strand'
(factor), as well as the additional metadata columns stored
in 'mcols(x)'. Pass an explicit 'stringsAsFactors=TRUE/FALSE'
argument via '...' to override the default conversions for
the metadata columns in 'mcols(x)'.
In the code snippets below, 'x' is a Seqinfo object.
'as(x, "GRanges")', 'as(x, "GenomicRanges")', 'as(x, "RangesList")':
Turns Seqinfo object 'x' (with no 'NA' lengths) into a
GRanges or RangesList.
Accessors:
In the following code snippets, 'x' is a GRanges object.
'length(x)': Get the number of elements.
'seqnames(x)', 'seqnames(x) <- value': Get or set the sequence names.
'value' can be an Rle object, a character vector, or a
factor.
'ranges(x)', 'ranges(x) <- value': Get or set the ranges. 'value' can
be a Ranges object.
'names(x)', 'names(x) <- value': Get or set the names of the elements.
'strand(x)', 'strand(x) <- value': Get or set the strand. 'value' can
be an Rle object, character vector, or factor.
'mcols(x, use.names=FALSE)', 'mcols(x) <- value': Get or set the
metadata columns. If 'use.names=TRUE' and the metadata
columns are not 'NULL', then the names of 'x' are propagated
as the row names of the returned DataFrame object. When
setting the metadata columns, the supplied value must be
'NULL' or a data.frame-like object (i.e. DataTable or
data.frame) holding element-wise metadata.
'elementMetadata(x)', 'elementMetadata(x) <- value', 'values(x)',
'values(x) <- value': Alternatives to 'mcols' functions.
Their use is discouraged.
'seqinfo(x)', 'seqinfo(x) <- value': Get or set the information about
the underlying sequences. 'value' must be a Seqinfo object.
'seqlevels(x)', 'seqlevels(x, force=FALSE) <- value': Get or set the
sequence levels. 'seqlevels(x)' is equivalent to
'seqlevels(seqinfo(x))' or to 'levels(seqnames(x))', those 2
expressions being guaranteed to return identical character
vectors on a GRanges object. 'value' must be a character
vector with no NAs. See '?seqlevels' for more information.
'seqlengths(x)', 'seqlengths(x) <- value': Get or set the sequence
lengths. 'seqlengths(x)' is equivalent to
'seqlengths(seqinfo(x))'. 'value' can be a named
non-negative integer or numeric vector, possibly with NAs.
'isCircular(x)', 'isCircular(x) <- value': Get or set the circularity
flags. 'isCircular(x)' is equivalent to
'isCircular(seqinfo(x))'. 'value' must be a named logical
vector, possibly with NAs.
'genome(x)', 'genome(x) <- value': Get or set the genome identifier or
assembly name for each sequence. 'genome(x)' is equivalent
to 'genome(seqinfo(x))'. 'value' must be a named character
vector, possibly with NAs.
'seqlevelsStyle(x)', 'seqlevelsStyle(x) <- value': Get or set the
seqname style for 'x'. See the seqlevelsStyle generic getter
and setter in the 'GenomeInfoDb' package for more
information.
'score(x), score(x) <- value': Get or set the "score" column from the
element metadata.
'granges(x, use.mcols=FALSE)': Gets a 'GRanges' with only the range
information from 'x', unless 'use.mcols' is 'TRUE', in which
case the metadata columns are also returned. Those columns
will include any "extra column slots" if 'x' is a specialized
'GenomicRanges' derivative.
Ranges methods:
In the following code snippets, 'x' is a GRanges object.
'start(x)', 'start(x) <- value': Get or set 'start(ranges(x))'.
'end(x)', 'end(x) <- value': Get or set 'end(ranges(x))'.
'width(x)', 'width(x) <- value': Get or set 'width(ranges(x))'.
Splitting and Combining:
In the code snippets below, 'x' is a GRanges object.
'append(x, values, after = length(x))': Inserts the 'values' into 'x'
at the position given by 'after', where 'x' and 'values' are
of the same class.
'c(x, ...)': Combines 'x' and the GRanges objects in '...' together.
Any object in '...' must belong to the same class as 'x', or
to one of its subclasses, or must be 'NULL'. The result is
an object of the same class as 'x'.
'c(x, ..., ignore.mcols=FALSE)' If the 'GRanges' objects have metadata
columns (represented as one DataFrame per object), each such
DataFrame must have the same columns in order to combine
successfully. In order to circumvent this restraint, you can
pass in an 'ignore.mcols=TRUE' argument which will combine
all the objects into one and drop all of their metadata
columns.
'split(x, f, drop=FALSE)': Splits 'x' according to 'f' to create a
GRangesList object. If 'f' is a list-like object then 'drop'
is ignored and 'f' is treated as if it was
'rep(seq_len(length(f)), sapply(f, length))', so the returned
object has the same shape as 'f' (it also receives the names
of 'f'). Otherwise, if 'f' is not a list-like object, empty
list elements are removed from the returned object if 'drop'
is 'TRUE'.
Subsetting:
In the code snippets below, 'x' is a GRanges object.
'x[i, j]', 'x[i, j] <- value': Get or set elements 'i' with optional
metadata columns 'mcols(x)[,j]', where 'i' can be missing; an
NA-free logical, numeric, or character vector; or a 'logical'
Rle object.
'x[i, j] <- value': Replaces elements 'i' and optional metadata columns
'j' with 'value'.
'head(x, n = 6L)': If 'n' is non-negative, returns the first n elements
of the GRanges object. If 'n' is negative, returns all but
the last 'abs(n)' elements of the GRanges object.
'rep(x, times, length.out, each)': Repeats the values in 'x' through
one of the following conventions:
'times' Vector giving the number of times to repeat each
element if of length 'length(x)', or to repeat the whole
vector if of length 1.
'length.out' Non-negative integer. The desired length of the
output vector.
'each' Non-negative integer. Each element of 'x' is repeated
'each' times.
'subset(x, subset)': Returns a new object of the same class as 'x' made
of the subset using logical vector 'subset', where missing
values are taken as 'FALSE'.
'tail(x, n = 6L)': If 'n' is non-negative, returns the last n elements
of the GRanges object. If 'n' is negative, returns all but
the first 'abs(n)' elements of the GRanges object.
'window(x, start = NA, end = NA, width = NA, frequency = NULL, delta =
NULL, ...)': Extracts the subsequence window from the GRanges
object using:
'start', 'end', 'width' The start, end, or width of the
window. Two of the three are required.
'frequency', 'delta' Optional arguments that specify the
sampling frequency and increment within the window.
In general, this is more efficient than using '"["' operator.
'window(x, start = NA, end = NA, width = NA, keepLength = TRUE) <-
value': Replaces the subsequence window specified on the left
(i.e. the subsequence in 'x' specified by 'start', 'end' and
'width') by 'value'. 'value' must either be of class
'class(x)', belong to a subclass of 'class(x)', be coercible
to 'class(x)', or be 'NULL'. If 'keepLength' is 'TRUE', the
elements of 'value' are repeated to create a GRanges object
with the same number of elements as the width of the
subsequence window it is replacing. If 'keepLength' is
'FALSE', this replacement method can modify the length of
'x', depending on how the length of the left subsequence
window compares to the length of 'value'.
'x$name', 'x$name <- value': Shortcuts for 'mcols(x)$name' and
'mcols(x)$name <- value', respectively. Provided as a
convenience, for GRanges objects *only*, and as the result of
strong popular demand. Note that those methods are not
consistent with the other '$' and '$<-' methods in the
IRanges/GenomicRanges infrastructure, and might confuse some
users by making them believe that a GRanges object can be
manipulated as a data.frame-like object. Therefore we
recommend using them only interactively, and we discourage
their use in scripts or packages. For the latter, use
'mcols(x)$name' and 'mcols(x)$name <- value', instead of
'x$name' and 'x$name <- value', respectively.
Note that a GRanges object can be used as a subscript to subset
a list-like object that has names on it. In that case, the names
on the list-like object are interpreted as sequence names. In the
code snippets below, 'x' is a list or List object with names on
it, and the subscript 'gr' is a GRanges object with all its
seqnames being valid 'x' names.
'x[gr]': Return an object of the same class as 'x' and _parallel_ to
'gr'. More precisely, it's conceptually doing:
lapply(gr, function(gr1) x[[seqnames(gr1)]][ranges(gr1)])
Other methods:
'show(x)': By default the 'show' method displays 5 head and 5 tail
elements. This can be changed by setting the global options
'showHeadLines' and 'showTailLines'. If the object length is
less than (or equal to) the sum of these 2 options plus 1,
then the full object is displayed. Note that these options
also affect the display of GAlignments and GAlignmentPairs
objects (defined in the 'GenomicAlignments' package), as well
as other objects defined in the 'IRanges' and 'Biostrings'
packages (e.g. IRanges and DNAStringSet objects).
Author(s):
P. Aboyoun and H. Pagès
See Also:
* 'makeGRangesFromDataFrame' for making a GRanges object from a
data.frame or DataFrame object.
* 'seqinfo' for accessing/modifying information about the
underlying sequences of a GRanges object.
* The GPos class, a memory-efficient container for storing
genomic _positions_, that is, genomic ranges of width 1.
* GenomicRanges-comparison for comparing and ordering genomic
ranges.
* findOverlaps-methods for finding/counting overlapping genomic
ranges.
* intra-range-methods and inter-range-methods for intra range
and inter range transformations of a GRanges object.
* coverage-methods for computing the coverage of a GRanges
object.
* setops-methods for set operations on GRanges objects.
* nearest-methods for finding the nearest genomic range
neighbor.
* 'absoluteRanges' for transforming genomic ranges into
_absolute_ ranges (i.e. into ranges on the sequence obtained
by virtually concatenating all the sequences in a genome).
* 'tileGenome' for putting tiles on a genome.
* genomicvars for manipulating genomic variables.
* GRangesList objects.
* Ranges objects in the 'IRanges' package.
* Vector, Rle, and DataFrame objects in the 'S4Vectors'
package.
Examples:
## ---------------------------------------------------------------------
## CONSTRUCTION
## ---------------------------------------------------------------------
## Specifying the bare minimum i.e. seqnames and ranges only. The
## GRanges object will have no names, no strand information, and no
## metadata columns:
gr0 <- GRanges(Rle(c("chr2", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
IRanges(1:10, width=10:1))
gr0
## Specifying names, strand, metadata columns. They can be set on an
## exi