R Graphical Manual

Last data update: 2014.03.03

seqinfo

R Documentation

Accessing/modifying sequence information

Description

A set of generic functions for getting/setting/modifying the sequence information stored in an object.

Usage

seqinfo(x)
seqinfo(x, new2old=NULL, force=FALSE) <- value

seqnames(x)
seqnames(x) <- value

seqlevels(x)
seqlevels(x, force=FALSE) <- value
sortSeqlevels(x, X.is.sexchrom=NA)
seqlevelsInUse(x)
seqlevels0(x)

seqlengths(x)
seqlengths(x) <- value

isCircular(x)
isCircular(x) <- value

genome(x)
genome(x) <- value

Arguments

`x`	The object from/on which to get/set the sequence information.
`new2old`	The `new2old` argument allows the user to rename, drop, add and/or reorder the "sequence levels" in `x`. `new2old` can be `NULL` or an integer vector with one element per row in Seqinfo object `value` (i.e. `new2old` and `value` must have the same length) describing how the "new" sequence levels should be mapped to the "old" sequence levels, that is, how the rows in `value` should be mapped to the rows in `seqinfo(x)`. The values in `new2old` must be >= 1 and <= `length(seqinfo(x))`. `NA`s are allowed and indicate sequence levels that are being added. Old sequence levels that are not represented in `new2old` will be dropped, but this will fail if those levels are in use (e.g. if `x` is a GRanges object with ranges defined on those sequence levels) unless `force=TRUE` is used (see below). If `new2old=NULL`, then sequence levels can only be added to the existing ones, that is, `value` must have at least as many rows as `seqinfo(x)` (i.e. `length(value) >= length(seqinfo(x))`) and also `seqlevels(value)[seq_len(length(seqlevels(x)))]` must be identical to `seqlevels(x)`.
`force`	Force dropping sequence levels currently in use. This is achieved by dropping the elements in `x` where those levels are used (hence typically reducing the length of `x`). Note that if `x` is a list-like object (e.g. GRangesList, GAlignmentPairs, or GAlignmentsList), then any list element in `x` where at least one of the sequence levels to drop is used is fully dropped. In other words, the `seqlevels` setter always keeps or drops full list elements and never tries to change their content. This guarantees that the geometry of the list elements is preserved, which is a desirable property when they represent compound features (e.g. exons grouped by transcript or paired-end reads). See below for an example.
`value`	Typically a Seqinfo object for the `seqinfo` setter. Either a named or unnamed character vector for the `seqlevels` setter. A vector containing the sequence information to store for the other setters.
`X.is.sexchrom`	A logical indicating whether X refers to the sex chromosome or to the chromosome with Roman numeral X. If `NA`, `sortSeqlevels` does its best to "guess".
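
As a rough illustration of the `new2old` mapping described above, the sketch below renames and reorders the two sequence levels of a small GRanges object and adds a third one, all in a single call to the `seqinfo` setter. The objects `gr` and `new_si` and the genome label "toy" are made up for this example, and it assumes the GenomicRanges package is installed.

```r
library(GenomicRanges)  # also attaches GenomeInfoDb and IRanges

# Hypothetical toy object with two sequence levels, both in use.
gr <- GRanges(c("chr1", "chr2", "chr1"), IRanges(1:3, width=5))
seqlevels(gr)                     # "chr1" "chr2"

# Hypothetical replacement Seqinfo: 3 rows, so 'new2old' needs 3 elements.
new_si <- Seqinfo(seqnames=c("2", "1", "MT"),
                  seqlengths=c(200, 100, NA),
                  isCircular=c(FALSE, FALSE, TRUE),
                  genome="toy")

# Row 1 ("2") maps to old level 2 ("chr2"), row 2 ("1") maps to old
# level 1 ("chr1"), and row 3 ("MT") is being added, hence the NA.
seqinfo(gr, new2old=c(2L, 1L, NA)) <- new_si

seqlevels(gr)                     # "2" "1" "MT"
as.character(seqnames(gr))        # "1" "2" "1"
```

No old level is dropped here, so `force` is not needed; the ranges keep their positions and only the names of their sequences change.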

Details

The Seqinfo class plays a central role for the functions described in this man page because:

All these functions (except seqinfo, seqlevelsInUse, and seqlevels0) work on a Seqinfo object.
For classes that implement it, the seqinfo getter should return a Seqinfo object.
Default seqlevels, seqlengths, isCircular, and genome getters and setters are provided. By default, seqlevels(x) does seqlevels(seqinfo(x)), seqlengths(x) does seqlengths(seqinfo(x)), isCircular(x) does isCircular(seqinfo(x)), and genome(x) does genome(seqinfo(x)). So any class with a seqinfo getter will have all the above getters work out-of-the-box. If, in addition, the class defines a seqinfo setter, then all the corresponding setters will also work out-of-the-box.
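
A minimal sketch of the default getters and setters described above, assuming the GenomicRanges package is installed. The GRanges object `gr` and the genome labels "toy" and "toy2" are invented for illustration. It only checks that each getter on `gr` agrees with the same getter applied to `seqinfo(gr)`, and that the setters update the underlying Seqinfo object.

```r
library(GenomicRanges)  # also attaches GenomeInfoDb and IRanges

# Hypothetical one-range object carrying its own Seqinfo.
gr <- GRanges("chr1", IRanges(1, 10),
              seqinfo=Seqinfo("chr1", seqlengths=1000,
                              isCircular=FALSE, genome="toy"))

# Getters on 'gr' agree with the getters on its Seqinfo object.
stopifnot(identical(seqlevels(gr),  seqlevels(seqinfo(gr))),
          identical(seqlengths(gr), seqlengths(seqinfo(gr))),
          identical(isCircular(gr), isCircular(seqinfo(gr))),
          identical(genome(gr),     genome(seqinfo(gr))))

# Setters on 'gr' are reflected in seqinfo(gr).
seqlengths(gr) <- c(chr1=2000)
isCircular(gr) <- c(chr1=TRUE)
genome(gr) <- "toy2"
seqinfo(gr)
```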

Examples of containers that have a seqinfo getter and setter: the GRanges, GRangesList, and SummarizedExperiment classes in the GenomicRanges package; the GAlignments, GAlignmentPairs, and GAlignmentsList classes in the GenomicAlignments package; the TxDb class in the GenomicFeatures package; the BSgenome class in the BSgenome package; etc...

The GenomicRanges package defines seqinfo and seqinfo<- methods for these low-level data types: List, RangesList and RangedData. Those objects do not have the means to formally store sequence information. Thus, the wrappers simply store the Seqinfo object within metadata(x). Initially, the metadata is empty, so there is some effort to generate a reasonable default Seqinfo. The names of any List are taken as the seqnames, and the universe of RangesList or RangedData is taken as the genome.

Note

The full list of methods defined for a given generic can be seen with e.g. showMethods("seqinfo") or showMethods("seqnames") (for the getters), and showMethods("seqinfo<-") or showMethods("seqnames<-") (for the setters aka replacement methods). Please be aware that this shows only methods defined in packages that are currently attached.

Author(s)

H. Pagès

Examples

## ---------------------------------------------------------------------
## A. MODIFY THE SEQLEVELS OF AN OBJECT
## ---------------------------------------------------------------------
## Overlap and matching operations between objects require matching
## seqlevels. Often the seqlevels in one must be modified to match 
## the other. The seqlevels() function can rename, drop, add and reorder 
## seqlevels of an object. Examples below are shown on TxDb 
## and GRanges but the approach is the same for all objects that have
## a 'Seqinfo' class.

library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)
txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene
seqlevels(txdb)

## Rename:
seqlevels(txdb) <- sub("chr", "", seqlevels(txdb))
seqlevels(txdb)

seqlevels(txdb) <- paste0("CH", seqlevels(txdb))
seqlevels(txdb)

seqlevels(txdb)[seqlevels(txdb) == "CHM"] <- "M"
seqlevels(txdb)

gr <- GRanges(rep(c("chr2", "chr3", "chrM"), 2), IRanges(1:6, 10))

## Add:
seqlevels(gr) <- c("chr1", seqlevels(gr), "chr4")
seqlevels(gr)
seqlevelsInUse(gr)

## Reorder:
seqlevels(gr) <- rev(seqlevels(gr))
seqlevels(gr)

## Drop all unused seqlevels:
seqlevels(gr) <- seqlevelsInUse(gr)

## Drop some seqlevels in use:
seqlevels(gr, force=TRUE) <- setdiff(seqlevels(gr), "chr3")
gr

## Rename/Add/Reorder:
seqlevels(gr) <- c("chr1", chr2="chr2", chrM="Mitochondrion")
seqlevels(gr)

## ---------------------------------------------------------------------
## B. DROP SEQLEVELS OF A LIST-LIKE OBJECT
## ---------------------------------------------------------------------

grl0 <- GRangesList(GRanges("chr2", IRanges(3:2, 5)),
                    GRanges("chr5", IRanges(11, 18)),
                    GRanges(c("chr4", "chr2"), IRanges(7:6, 15)))
grl0

grl1 <- grl0
seqlevels(grl1, force=TRUE) <- c("chr2", "chr5")
grl1  # grl0[[3]] was fully dropped!

## If what is desired is to drop the first range in grl0[[3]] only, or,
## more generally speaking, to drop the ranges within each list element
## that are located on one of the seqlevels to drop, then do:
grl2 <- grl0[seqnames(grl0) %in% c("chr2", "chr5")]
grl2

## Note that the above subsetting doesn't drop any seqlevel:
seqlevels(grl2)

## To drop them (no need to use 'force=TRUE' anymore):
seqlevels(grl2) <- c("chr2", "chr5")
seqlevels(grl2)

## ---------------------------------------------------------------------
## C. SORT SEQLEVELS IN "NATURAL" ORDER
## ---------------------------------------------------------------------

sortSeqlevels(c("11", "Y", "1", "10", "9", "M", "2"))

seqlevels <- c("chrXI", "chrY", "chrI", "chrX", "chrIX", "chrM", "chrII")
sortSeqlevels(seqlevels)
sortSeqlevels(seqlevels, X.is.sexchrom=TRUE)
sortSeqlevels(seqlevels, X.is.sexchrom=FALSE)

seqlevels <- c("chr2RHet", "chr4", "chrUextra", "chrYHet",
               "chrM", "chrXHet", "chr2LHet", "chrU",
               "chr3L", "chr3R", "chr2R", "chrX")
sortSeqlevels(seqlevels)

gr <- GRanges()
seqlevels(gr) <- seqlevels
sortSeqlevels(gr)

## ---------------------------------------------------------------------
## D. SUBSET OBJECTS BY SEQLEVELS
## ---------------------------------------------------------------------

tx <- transcripts(txdb)
seqlevels(tx)

## Drop 'M', keep all others.
seqlevels(tx, force=TRUE) <- seqlevels(tx)[seqlevels(tx) != "M"]
seqlevels(tx)

## Drop all except 'CH3L' and 'CH3R'.
seqlevels(tx, force=TRUE) <- c("CH3L", "CH3R")
seqlevels(tx)

## ---------------------------------------------------------------------
## E. RESTORE ORIGINAL SEQLEVELS OF A TxDb OBJECT
## ---------------------------------------------------------------------

## Applicable to TxDb objects only.
## Not run: 
seqlevels(txdb) <- seqlevels0(txdb)
seqlevels(txdb)

## End(Not run)

## ---------------------------------------------------------------------
## F. FINDING METHODS
## ---------------------------------------------------------------------

showMethods("seqinfo")
showMethods("seqinfo<-")

showMethods("seqnames")
showMethods("seqnames<-")

showMethods("seqlevels")
showMethods("seqlevels<-")

if (interactive()) {
  library(GenomicRanges)
  ?`GRanges-class`
}

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(GenomeInfoDb)
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
> ### Name: seqinfo
> ### Title: Accessing/modifying sequence information
> ### Aliases: seqinfo seqinfo<- seqnames seqnames<- seqlevels
> ###   seqlevels,ANY-method seqlevels<- seqlevels<-,ANY-method sortSeqlevels
> ###   sortSeqlevels,character-method sortSeqlevels,ANY-method
> ###   seqlevelsInUse seqlevelsInUse,Vector-method
> ###   seqlevelsInUse,CompressedList-method seqlevels0 seqlengths
> ###   seqlengths,ANY-method seqlengths<- seqlengths<-,ANY-method isCircular
> ###   isCircular,ANY-method isCircular<- isCircular<-,ANY-method genome
> ###   genome,ANY-method genome<- genome<-,ANY-method
> ### Keywords: methods
> 
> ### ** Examples
> 
> ## ---------------------------------------------------------------------
> ## A. MODIFY THE SEQLEVELS OF AN OBJECT
> ## ---------------------------------------------------------------------
> ## Overlap and matching operations between objects require matching
> ## seqlevels. Often the seqlevels in one must be modified to match 
> ## the other. The seqlevels() function can rename, drop, add and reorder 
> ## seqlevels of an object. Examples below are shown on TxDb 
> ## and GRanges but the approach is the same for all objects that have
> ## a 'Seqinfo' class.
> 
> library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)
Loading required package: GenomicFeatures
Loading required package: GenomicRanges
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

> txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene
> seqlevels(txdb)
 [1] "chr2L"     "chr2R"     "chr3L"     "chr3R"     "chr4"      "chrX"     
 [7] "chrU"      "chrM"      "chr2LHet"  "chr2RHet"  "chr3LHet"  "chr3RHet" 
[13] "chrXHet"   "chrYHet"   "chrUextra"
> 
> ## Rename:
> seqlevels(txdb) <- sub("chr", "", seqlevels(txdb))
> seqlevels(txdb)
 [1] "2L"     "2R"     "3L"     "3R"     "4"      "X"      "U"      "M"     
 [9] "2LHet"  "2RHet"  "3LHet"  "3RHet"  "XHet"   "YHet"   "Uextra"
> 
> seqlevels(txdb) <- paste0("CH", seqlevels(txdb))
> seqlevels(txdb)
 [1] "CH2L"     "CH2R"     "CH3L"     "CH3R"     "CH4"      "CHX"     
 [7] "CHU"      "CHM"      "CH2LHet"  "CH2RHet"  "CH3LHet"  "CH3RHet" 
[13] "CHXHet"   "CHYHet"   "CHUextra"
> 
> seqlevels(txdb)[seqlevels(txdb) == "CHM"] <- "M"
> seqlevels(txdb)
 [1] "CH2L"     "CH2R"     "CH3L"     "CH3R"     "CH4"      "CHX"     
 [7] "CHU"      "M"        "CH2LHet"  "CH2RHet"  "CH3LHet"  "CH3RHet" 
[13] "CHXHet"   "CHYHet"   "CHUextra"
> 
> gr <- GRanges(rep(c("chr2", "chr3", "chrM"), 2), IRanges(1:6, 10))
> 
> ## Add:
> seqlevels(gr) <- c("chr1", seqlevels(gr), "chr4")
> seqlevels(gr)
[1] "chr1" "chr2" "chr3" "chrM" "chr4"
> seqlevelsInUse(gr)
[1] "chr2" "chr3" "chrM"
> 
> ## Reorder:
> seqlevels(gr) <- rev(seqlevels(gr))
> seqlevels(gr)
[1] "chr4" "chrM" "chr3" "chr2" "chr1"
> 
> ## Drop all unused seqlevels:
> seqlevels(gr) <- seqlevelsInUse(gr)
> 
> ## Drop some seqlevels in use:
> seqlevels(gr, force=TRUE) <- setdiff(seqlevels(gr), "chr3")
> gr
GRanges object with 4 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr2   [1, 10]      *
  [2]     chrM   [3, 10]      *
  [3]     chr2   [4, 10]      *
  [4]     chrM   [6, 10]      *
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths
> 
> ## Rename/Add/Reorder:
> seqlevels(gr) <- c("chr1", chr2="chr2", chrM="Mitochondrion")
> seqlevels(gr)
[1] "chr1"          "chr2"          "Mitochondrion"
> 
> ## ---------------------------------------------------------------------
> ## B. DROP SEQLEVELS OF A LIST-LIKE OBJECT
> ## ---------------------------------------------------------------------
> 
> grl0 <- GRangesList(GRanges("chr2", IRanges(3:2, 5)),
+                     GRanges("chr5", IRanges(11, 18)),
+                     GRanges(c("chr4", "chr2"), IRanges(7:6, 15)))
> grl0
GRangesList object of length 3:
[[1]] 
GRanges object with 2 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr2    [3, 5]      *
  [2]     chr2    [2, 5]      *

[[2]] 
GRanges object with 1 range and 0 metadata columns:
      seqnames   ranges strand
  [1]     chr5 [11, 18]      *

[[3]] 
GRanges object with 2 ranges and 0 metadata columns:
      seqnames  ranges strand
  [1]     chr4 [7, 15]      *
  [2]     chr2 [6, 15]      *

-------
seqinfo: 3 sequences from an unspecified genome; no seqlengths
> 
> grl1 <- grl0
> seqlevels(grl1, force=TRUE) <- c("chr2", "chr5")
> grl1  # grl0[[3]] was fully dropped!
GRangesList object of length 2:
[[1]] 
GRanges object with 2 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr2    [3, 5]      *
  [2]     chr2    [2, 5]      *

[[2]] 
GRanges object with 1 range and 0 metadata columns:
      seqnames   ranges strand
  [1]     chr5 [11, 18]      *

-------
seqinfo: 2 sequences from an unspecified genome; no seqlengths
> 
> ## If what is desired is to drop the first range in grl0[[3]] only, or,
> ## more generally speaking, to drop the ranges within each list element
> ## that are located on one of the seqlevels to drop, then do:
> grl2 <- grl0[seqnames(grl0) %in% c("chr2", "chr5")]
> grl2
GRangesList object of length 3:
[[1]] 
GRanges object with 2 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr2    [3, 5]      *
  [2]     chr2    [2, 5]      *

[[2]] 
GRanges object with 1 range and 0 metadata columns:
      seqnames   ranges strand
  [1]     chr5 [11, 18]      *

[[3]] 
GRanges object with 1 range and 0 metadata columns:
      seqnames  ranges strand
  [1]     chr2 [6, 15]      *

-------
seqinfo: 3 sequences from an unspecified genome; no seqlengths
> 
> ## Note that the above subsetting doesn't drop any seqlevel:
> seqlevels(grl2)
[1] "chr2" "chr5" "chr4"
> 
> ## To drop them (no need to use 'force=TRUE' anymore):
> seqlevels(grl2) <- c("chr2", "chr5")
> seqlevels(grl2)
[1] "chr2" "chr5"
> 
> ## ---------------------------------------------------------------------
> ## C. SORT SEQLEVELS IN "NATURAL" ORDER
> ## ---------------------------------------------------------------------
> 
> sortSeqlevels(c("11", "Y", "1", "10", "9", "M", "2"))
[1] "1"  "2"  "9"  "10" "11" "Y"  "M" 
> 
> seqlevels <- c("chrXI", "chrY", "chrI", "chrX", "chrIX", "chrM", "chrII")
> sortSeqlevels(seqlevels)
[1] "chrI"  "chrII" "chrIX" "chrXI" "chrX"  "chrY"  "chrM" 
> sortSeqlevels(seqlevels, X.is.sexchrom=TRUE)
[1] "chrI"  "chrII" "chrIX" "chrXI" "chrX"  "chrY"  "chrM" 
> sortSeqlevels(seqlevels, X.is.sexchrom=FALSE)
[1] "chrI"  "chrII" "chrIX" "chrX"  "chrXI" "chrY"  "chrM" 
> 
> seqlevels <- c("chr2RHet", "chr4", "chrUextra", "chrYHet",
+                "chrM", "chrXHet", "chr2LHet", "chrU",
+                "chr3L", "chr3R", "chr2R", "chrX")
> sortSeqlevels(seqlevels)
 [1] "chr2R"     "chr3L"     "chr3R"     "chr4"      "chrX"      "chrU"     
 [7] "chrM"      "chr2LHet"  "chr2RHet"  "chrXHet"   "chrYHet"   "chrUextra"
> 
> gr <- GRanges()
> seqlevels(gr) <- seqlevels
> sortSeqlevels(gr)
GRanges object with 0 ranges and 0 metadata columns:
   seqnames    ranges strand
      <Rle> <IRanges>  <Rle>
  -------
  seqinfo: 12 sequences from an unspecified genome; no seqlengths
> 
> ## ---------------------------------------------------------------------
> ## D. SUBSET OBJECTS BY SEQLEVELS
> ## ---------------------------------------------------------------------
> 
> tx <- transcripts(txdb)
> seqlevels(tx)
 [1] "CH2L"     "CH2R"     "CH3L"     "CH3R"     "CH4"      "CHX"     
 [7] "CHU"      "M"        "CH2LHet"  "CH2RHet"  "CH3LHet"  "CH3RHet" 
[13] "CHXHet"   "CHYHet"   "CHUextra"
> 
> ## Drop 'M', keep all others.
> seqlevels(tx, force=TRUE) <- seqlevels(tx)[seqlevels(tx) != "M"]
> seqlevels(tx)
 [1] "CH2L"     "CH2R"     "CH3L"     "CH3R"     "CH4"      "CHX"     
 [7] "CHU"      "CH2LHet"  "CH2RHet"  "CH3LHet"  "CH3RHet"  "CHXHet"  
[13] "CHYHet"   "CHUextra"
> 
> ## Drop all except 'CH3L' and 'CH3R'.
> seqlevels(tx, force=TRUE) <- c("CH3L", "CH3R")
> seqlevels(tx)
[1] "CH3L" "CH3R"
> 
> ## ---------------------------------------------------------------------
> ## E. RESTORE ORIGINAL SEQLEVELS OF A TxDb OBJECT
> ## ---------------------------------------------------------------------
> 
> ## Applicable to TxDb objects only.
> ## Not run: 
> ##D seqlevels(txdb) <- seqlevels0(txdb)
> ##D seqlevels(txdb)
> ## End(Not run)
> 
> ## ---------------------------------------------------------------------
> ## F. FINDING METHODS
> ## ---------------------------------------------------------------------
> 
> showMethods("seqinfo")
Function: seqinfo (package GenomeInfoDb)
x="BamFile"
x="BamFileList"
x="BigWigFile"
x="DNAStringSet"
x="DelegatingGenomicRanges"
x="FaFile"
x="GAlignmentPairs"
x="GAlignments"
x="GAlignmentsList"
x="GNCList"
x="GPos"
x="GRanges"
x="GRangesList"
x="GenomeDescription"
x="List"
x="QuickloadGenome"
x="RangedData"
x="RangedSummarizedExperiment"
x="RangesList"
x="TwoBitFile"
x="TxDb"
x="UCSCSession"

> showMethods("seqinfo<-")
Function: seqinfo<- (package GenomeInfoDb)
x="GAlignmentPairs"
x="GAlignments"
x="GAlignmentsList"
x="GPos"
x="GRanges"
    (inherited from: x="GenomicRanges")
x="GRangesList"
x="GenomicRanges"
x="List"
x="QuickloadGenome"
x="RangedData"
x="RangedSummarizedExperiment"

> 
> showMethods("seqnames")
Function: seqnames (package GenomeInfoDb)
x="DelegatingGenomicRanges"
x="GAlignmentPairs"
x="GAlignments"
x="GAlignmentsList"
x="GNCList"
x="GPos"
x="GRanges"
x="GRangesList"
x="GenomeDescription"
x="RangedData"
x="RangedSummarizedExperiment"
x="Seqinfo"
x="UCSCSession"

> showMethods("seqnames<-")
Function: seqnames<- (package GenomeInfoDb)
x="GAlignments"
x="GAlignmentsList"
x="GRangesList"
x="GenomicRanges"
x="Seqinfo"

> 
> showMethods("seqlevels")
Function: seqlevels (package GenomeInfoDb)
x="ANY"
x="GRanges"
    (inherited from: x="ANY")
x="GRangesList"
    (inherited from: x="ANY")
x="Seqinfo"
x="TxDb"
    (inherited from: x="ANY")

> showMethods("seqlevels<-")
Function: seqlevels<- (package GenomeInfoDb)
x="ANY"
x="GRanges"
    (inherited from: x="ANY")
x="GRangesList"
    (inherited from: x="ANY")
x="Seqinfo"
x="TxDb"

> 
> #if (interactive()) {
>   library(GenomicRanges)
>   ?`GRanges-class`
GRanges-class          package:GenomicRanges           R Documentation

GRanges objects

Description:

     The GRanges class is a container for the genomic locations and
     their associated annotations.

Details:

     GRanges is a vector of genomic locations and associated
     annotations. Each element in the vector is comprised of a sequence
     name, an interval, a strand, and optional metadata columns (e.g.
     score, GC content, etc.). This information is stored in five
     components:

     'seqnames' a 'factor' Rle object containing the sequence names.

     'ranges' an IRanges object containing the ranges.

     'strand' a 'factor' Rle object containing the strand information.

     'mcols' a DataFrame object containing the metadata columns.
          Columns cannot be named '"seqnames"', '"ranges"', '"strand"',
          '"seqlevels"', '"seqlengths"', '"isCircular"', '"start"',
          '"end"', '"width"', or '"element"'.

     'seqinfo' a Seqinfo object containing information about the set of
          genomic sequences present in the GRanges object.

Constructor:

'GRanges(seqnames=NULL, ranges=NULL, strand=NULL, ..., seqlengths=NULL,
          seqinfo=NULL)': Creates a GRanges object.

          'seqnames' 'NULL', or an Rle object, character vector, or
              factor containing the sequence names.

          'ranges' 'NULL', or an IRanges object containing the ranges.

          'strand' 'NULL', or an Rle object, character vector, or
              factor containing the strand information.

          '...'  Optional metadata columns. These columns cannot be
              named '"start"', '"end"', '"width"', or '"element"'.

          'seqlengths' 'NULL', or an integer vector named with
              'levels(seqnames)' and containing the lengths (or NA) for
              each level in 'levels(seqnames)'.

          'seqinfo' 'NULL', or a Seqinfo object containing allowed
              sequence names, lengths (or NA), and circularity flag,
              for each level in 'levels(seqnames)'.

          If 'ranges' is not supplied and/or NULL then the constructor
          proceeds in 2 steps:

           1. An initial GRanges object is created with 'as(seqnames,
              "GRanges")'.

           2. Then this GRanges object is updated according to whatever
              non-NULL remaining arguments were passed to the call to
              'GRanges()'.

          As a consequence of this behavior, 'GRanges(x)' is equivalent
          to 'as(x, "GRanges")'.

Coercion:

     In the code snippets below, 'x' is a GRanges object.

'as(from, "GRanges")': Creates a GRanges object from a character
          vector, a factor, or a RangedData, or RangesList object.

          When 'from' is a character vector (or a factor), each element
          in it must represent a genomic range in format
          'chr1:2501-2800' (unstranded range) or 'chr1:2501-2800:+'
          (stranded range).  '..' is also supported as a separator
          between the start and end positions. Strand can be '+', '-',
          '*', or missing.  The names on 'from' are propagated to the
          returned GRanges object.  See 'as.character()' and
          'as.factor()' below for the reverse transformations.

          Coercing a data.frame or DataFrame into a GRanges object is
          also supported. See 'makeGRangesFromDataFrame' for the
          details.

'as(from, "RangedData")': Creates a RangedData object from a GRanges
          object. The 'strand' and metadata columns become columns in
          the result. The 'seqlengths(from)', 'isCircular(from)', and
          'genome(from)' vectors are stored in the metadata columns of
          'ranges(rd)'.

'as(from, "RangesList")': Creates a RangesList object from a GRanges
          object. The 'strand' and metadata columns become _inner_
          metadata columns (i.e. metadata columns on the ranges).  The
          'seqlengths(from)', 'isCircular(from)', and 'genome(from)'
          vectors become the metadata columns.

'as.character(x, ignore.strand=FALSE)': Turn GRanges object 'x' into a
          character vector where each range in 'x' is represented by a
          string in format 'chr1:2501-2800:+'. If 'ignore.strand' is
          TRUE or if _all_ the ranges in 'x' are unstranded (i.e. their
          strand is set to '*'), then all the strings in the output are
          in format 'chr1:2501-2800'.

          The names on 'x' are propagated to the returned character
          vector.  Its metadata ('metadata(x)') and metadata columns
          ('mcols(x)') are ignored.

          See 'as(from, "GRanges")' above for the reverse
          transformation.

'as.factor(x)': Equivalent to
          
            factor(as.character(x), levels=as.character(sort(unique(x))))

          See 'as(from, "GRanges")' above for the reverse
          transformation.

          Note that 'table(x)' is supported on a GRanges object. It is
          equivalent to, but much faster than, 'table(as.factor(x))'.

'as.data.frame(x, row.names = NULL, optional = FALSE, ...)': Creates a
          data.frame with columns 'seqnames' (factor), 'start'
          (integer), 'end' (integer), 'width' (integer), 'strand'
          (factor), as well as the additional metadata columns stored
          in 'mcols(x)'. Pass an explicit 'stringsAsFactors=TRUE/FALSE'
          argument via '...' to override the default conversions for
          the metadata columns in 'mcols(x)'.

     In the code snippets below, 'x' is a Seqinfo object.

'as(x, "GRanges")', 'as(x, "GenomicRanges")', 'as(x, "RangesList")':
          Turns Seqinfo object 'x' (with no 'NA' lengths) into a
          GRanges or RangesList.

Accessors:

     In the following code snippets, 'x' is a GRanges object.

'length(x)': Get the number of elements.

'seqnames(x)', 'seqnames(x) <- value': Get or set the sequence names.
          'value' can be an Rle object, a character vector, or a
          factor.

'ranges(x)', 'ranges(x) <- value': Get or set the ranges. 'value' can
          be a Ranges object.

'names(x)', 'names(x) <- value': Get or set the names of the elements.

'strand(x)', 'strand(x) <- value': Get or set the strand. 'value' can
          be an Rle object, character vector, or factor.

'mcols(x, use.names=FALSE)', 'mcols(x) <- value': Get or set the
          metadata columns.  If 'use.names=TRUE' and the metadata
          columns are not 'NULL', then the names of 'x' are propagated
          as the row names of the returned DataFrame object.  When
          setting the metadata columns, the supplied value must be
          'NULL' or a data.frame-like object (i.e. DataTable or
          data.frame) holding element-wise metadata.

'elementMetadata(x)', 'elementMetadata(x) <- value', 'values(x)',
          'values(x) <- value': Alternatives to 'mcols' functions.
          Their use is discouraged.

'seqinfo(x)', 'seqinfo(x) <- value': Get or set the information about
          the underlying sequences.  'value' must be a Seqinfo object.

'seqlevels(x)', 'seqlevels(x, force=FALSE) <- value': Get or set the
          sequence levels.  'seqlevels(x)' is equivalent to
          'seqlevels(seqinfo(x))' or to 'levels(seqnames(x))', those 2
          expressions being guaranteed to return identical character
          vectors on a GRanges object.  'value' must be a character
          vector with no NAs.  See '?seqlevels' for more information.

'seqlengths(x)', 'seqlengths(x) <- value': Get or set the sequence
          lengths.  'seqlengths(x)' is equivalent to
          'seqlengths(seqinfo(x))'.  'value' can be a named
          non-negative integer or numeric vector, possibly with NAs.

'isCircular(x)', 'isCircular(x) <- value': Get or set the circularity
          flags.  'isCircular(x)' is equivalent to
          'isCircular(seqinfo(x))'.  'value' must be a named logical
          vector, possibly with NAs.

'genome(x)', 'genome(x) <- value': Get or set the genome identifier or
          assembly name for each sequence.  'genome(x)' is equivalent
          to 'genome(seqinfo(x))'.  'value' must be a named character
          vector, possibly with NAs.

'seqlevelsStyle(x)', 'seqlevelsStyle(x) <- value': Get or set the
          seqname style for 'x'.  See the seqlevelsStyle generic getter
          and setter in the 'GenomeInfoDb' package for more
          information.

'score(x), score(x) <- value': Get or set the "score" column from the
          element metadata.

'granges(x, use.mcols=FALSE)': Gets a 'GRanges' with only the range
          information from 'x', unless 'use.mcols' is 'TRUE', in which
          case the metadata columns are also returned. Those columns
          will include any "extra column slots" if 'x' is a specialized
          'GenomicRanges' derivative.

Ranges methods:

     In the following code snippets, 'x' is a GRanges object.

'start(x)', 'start(x) <- value': Get or set 'start(ranges(x))'.

'end(x)', 'end(x) <- value': Get or set 'end(ranges(x))'.

'width(x)', 'width(x) <- value': Get or set 'width(ranges(x))'.

Splitting and Combining:

     In the code snippets below, 'x' is a GRanges object.

'append(x, values, after = length(x))': Inserts the 'values' into 'x'
          at the position given by 'after', where 'x' and 'values' are
          of the same class.

'c(x, ...)': Combines 'x' and the GRanges objects in '...' together.
          Any object in '...' must belong to the same class as 'x', or
          to one of its subclasses, or must be 'NULL'.  The result is
          an object of the same class as 'x'.

'c(x, ..., ignore.mcols=FALSE)' If the 'GRanges' objects have metadata
          columns (represented as one DataFrame per object), each such
          DataFrame must have the same columns in order to combine
          successfully. In order to circumvent this restraint, you can
          pass in an 'ignore.mcols=TRUE' argument which will combine
          all the objects into one and drop all of their metadata
          columns.

'split(x, f, drop=FALSE)': Splits 'x' according to 'f' to create a
          GRangesList object.  If 'f' is a list-like object then 'drop'
          is ignored and 'f' is treated as if it was
          'rep(seq_len(length(f)), sapply(f, length))', so the returned
          object has the same shape as 'f' (it also receives the names
          of 'f').  Otherwise, if 'f' is not a list-like object, empty
          list elements are removed from the returned object if 'drop'
          is 'TRUE'.

Subsetting:

     In the code snippets below, 'x' is a GRanges object.

'x[i, j]', 'x[i, j] <- value': Get or set elements 'i' with optional
          metadata columns 'mcols(x)[,j]', where 'i' can be missing; an
          NA-free logical, numeric, or character vector; or a 'logical'
          Rle object.

'x[i, j] <- value': Replaces elements 'i' and optional metadata columns
          'j' with 'value'.

'head(x, n = 6L)': If 'n' is non-negative, returns the first n elements
          of the GRanges object.  If 'n' is negative, returns all but
          the last 'abs(n)' elements of the GRanges object.

'rep(x, times, length.out, each)': Repeats the values in 'x' through
          one of the following conventions:

          'times' Vector giving the number of times to repeat each
              element if of length 'length(x)', or to repeat the whole
              vector if of length 1.

          'length.out' Non-negative integer. The desired length of the
              output vector.

          'each' Non-negative integer.  Each element of 'x' is repeated
              'each' times.

'subset(x, subset)': Returns a new object of the same class as 'x' made
          of the subset using logical vector 'subset', where missing
          values are taken as 'FALSE'.

'tail(x, n = 6L)': If 'n' is non-negative, returns the last n elements
          of the GRanges object.  If 'n' is negative, returns all but
          the first 'abs(n)' elements of the GRanges object.
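
     The following sketch exercises 'head', 'tail', 'rep' and 'subset'
     on a made-up object of 10 ranges (the 'score' metadata column is
     an assumption added for the example):

```r
library(GenomicRanges)

gr <- GRanges("chr1", IRanges(1:10, width = 3), score = 1:10)

first3 <- head(gr, 3)       # first 3 elements
but_last3 <- head(gr, -3)   # all but the last 3 elements
but_first3 <- tail(gr, -3)  # all but the first 3 elements

# Repeat the first element twice and the second element once
repeated <- rep(gr[1:2], times = c(2, 1))

# Keep only the elements whose score exceeds 7
high <- subset(gr, score > 7)
```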

'window(x, start = NA, end = NA, width = NA, frequency = NULL, delta =
          NULL, ...)': Extracts the subsequence window from the GRanges
          object using:

          'start', 'end', 'width' The start, end, or width of the
              window. Two of the three are required.

          'frequency', 'delta' Optional arguments that specify the
              sampling frequency and increment within the window.

          In general, this is more efficient than using the '"["'
          operator.

'window(x, start = NA, end = NA, width = NA, keepLength = TRUE) <-
          value': Replaces the subsequence window specified on the left
          (i.e. the subsequence in 'x' specified by 'start', 'end' and
          'width') by 'value'.  'value' must either be of class
          'class(x)', belong to a subclass of 'class(x)', be coercible
          to 'class(x)', or be 'NULL'.  If 'keepLength' is 'TRUE', the
          elements of 'value' are repeated to create a GRanges object
          with the same number of elements as the width of the
          subsequence window it is replacing.  If 'keepLength' is
          'FALSE', this replacement method can modify the length of
          'x', depending on how the length of the left subsequence
          window compares to the length of 'value'.

'x$name', 'x$name <- value': Shortcuts for 'mcols(x)$name' and
          'mcols(x)$name <- value', respectively. Provided as a
          convenience, for GRanges objects *only*, and as the result of
          strong popular demand.  Note that those methods are not
          consistent with the other '$' and '$<-' methods in the
          IRanges/GenomicRanges infrastructure, and might confuse some
          users by making them believe that a GRanges object can be
          manipulated as a data.frame-like object.  Therefore we
          recommend using them only interactively, and we discourage
          their use in scripts or packages. For the latter, use
          'mcols(x)$name' and 'mcols(x)$name <- value', instead of
          'x$name' and 'x$name <- value', respectively.
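
     To make the recommendation concrete, here is a small sketch (the
     object 'gr' and the column name 'score' are hypothetical) showing
     that both forms access the same metadata column:

```r
library(GenomicRanges)

gr <- GRanges("chr1", IRanges(1:3, width = 5))

# Form recommended for scripts and packages
mcols(gr)$score <- c(0.5, 1.5, 2.5)
mcols(gr)$score

# Interactive shortcut that returns the same column
gr$score
```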

     Note that a GRanges object can be used as a subscript to subset
     a list-like object that has names on it. In that case, the names
     on the list-like object are interpreted as sequence names.  In the
     code snippets below, 'x' is a list or List object with names on
     it, and the subscript 'gr' is a GRanges object with all its
     seqnames being valid 'x' names.

'x[gr]': Return an object of the same class as 'x' and _parallel_ to
          'gr'. More precisely, it's conceptually doing:
          
            lapply(gr, function(gr1) x[[seqnames(gr1)]][ranges(gr1)])
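
     As an illustration, assuming 'x' is a named RleList (both 'x' and
     'gr' below are made up for the example), each range in 'gr'
     selects a slice from the list element named after its seqname:

```r
library(GenomicRanges)

# Named list-like object; the names play the role of sequence names
x <- RleList(chr1 = Rle(1:10), chr2 = Rle(101:110))

# Subscript whose seqnames are all valid names of 'x'
gr <- GRanges(c("chr2", "chr1"), IRanges(start = c(2, 5), end = c(4, 6)))

res <- x[gr]   # one list element per range in 'gr', in the same order
res
```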

_O_t_h_e_r _m_e_t_h_o_d_s:

'show(x)': By default the 'show' method displays 5 head and 5 tail
          elements. This can be changed by setting the global options
          'showHeadLines' and 'showTailLines'. If the object length is
          less than (or equal to) the sum of these 2 options plus 1,
          then the full object is displayed.  Note that these options
          also affect the display of GAlignments and GAlignmentPairs
          objects (defined in the 'GenomicAlignments' package), as well
          as other objects defined in the 'IRanges' and 'Biostrings'
          packages (e.g.  IRanges and DNAStringSet objects).
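
     A minimal sketch of changing these options for one display and
     then restoring the previous settings (the object 'gr' is made up
     for the example):

```r
library(GenomicRanges)

gr <- GRanges("chr1", IRanges(1:20, width = 1))

# Show only 2 head and 2 tail elements, remembering the old settings
old <- options(showHeadLines = 2, showTailLines = 2)
show(gr)

# Restore the previous settings
options(old)
```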

_A_u_t_h_o_r(_s):

     P. Aboyoun and H. Pagès

_S_e_e _A_l_s_o:

        * 'makeGRangesFromDataFrame' for making a GRanges object from a
          data.frame or DataFrame object.

        * 'seqinfo' for accessing/modifying information about the
          underlying sequences of a GRanges object.

        * The GPos class, a memory-efficient container for storing
          genomic _positions_, that is, genomic ranges of width 1.

        * GenomicRanges-comparison for comparing and ordering genomic
          ranges.

        * findOverlaps-methods for finding/counting overlapping genomic
          ranges.

        * intra-range-methods and inter-range-methods for intra range
          and inter range transformations of a GRanges object.

        * coverage-methods for computing the coverage of a GRanges
          object.

        * setops-methods for set operations on GRanges objects.

        * nearest-methods for finding the nearest genomic range
          neighbor.

        * 'absoluteRanges' for transforming genomic ranges into
          _absolute_ ranges (i.e. into ranges on the sequence obtained
          by virtually concatenating all the sequences in a genome).

        * 'tileGenome' for putting tiles on a genome.

        * genomicvars for manipulating genomic variables.

        * GRangesList objects.

        * Ranges objects in the 'IRanges' package.

        * Vector, Rle, and DataFrame objects in the 'S4Vectors'
          package.

_E_x_a_m_p_l_e_s:

     ## ---------------------------------------------------------------------
     ## CONSTRUCTION
     ## ---------------------------------------------------------------------
     ## Specifying the bare minimum i.e. seqnames and ranges only. The
     ## GRanges object will have no names, no strand information, and no
     ## metadata columns:
     gr0 <- GRanges(Rle(c("chr2", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
                    IRanges(1:10, width=10:1))
     gr0
     
     ## Specifying names, strand, metadata columns. They can be set on an
     ## exi