Last data update: 2014.03.03

GenomeDescription-class    R Documentation

GenomeDescription objects

Description

A GenomeDescription object holds the metadata describing a given genome.

Details

In general, users will not need to manipulate a GenomeDescription instance directly. Instead, they will work with a higher-level object whose class extends the GenomeDescription class. For example, the top-level object defined in any BSgenome data package is a BSgenome object, and the BSgenome class contains the GenomeDescription class. A BSgenome object is therefore also a GenomeDescription object, so all the methods described below work on it.
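
The inheritance relationship described above can be illustrated with a minimal base-R sketch. The classes ToyGenomeDescription and ToyBSgenome and the generic toyProvider are hypothetical stand-ins invented for this illustration; they are not the real GenomeInfoDb or BSgenome definitions.

```r
library(methods)

# Hypothetical stand-in for GenomeDescription (not the real class definition).
setClass("ToyGenomeDescription",
         representation(organism = "character", provider = "character"))

# Hypothetical stand-in for BSgenome: it "contains" the parent class.
setClass("ToyBSgenome",
         contains = "ToyGenomeDescription",
         representation(pkgname = "character"))

# An accessor defined only for the parent class.
setGeneric("toyProvider", function(x) standardGeneric("toyProvider"))
setMethod("toyProvider", "ToyGenomeDescription", function(x) x@provider)

genome <- new("ToyBSgenome",
              organism = "Caenorhabditis elegans",
              provider = "UCSC",
              pkgname  = "BSgenome.Celegans.UCSC.ce2")

# The child object is also an instance of the parent class ...
is_parent <- is(genome, "ToyGenomeDescription")
# ... so the parent's accessor is dispatched on it through inheritance.
prov <- toyProvider(genome)
# It can also be coerced to the parent class explicitly.
parent_only <- as(genome, "ToyGenomeDescription")
```

Because the accessor is defined once on the parent class, every class that extends it picks the method up without any extra code, which is the same mechanism that lets the accessors on this page work on a BSgenome object.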

Accessor methods

In the code snippets below, object or x is a GenomeDescription object.

organism(object): Returns the scientific name of the organism, e.g. "Homo sapiens", "Mus musculus", "Caenorhabditis elegans", etc.

commonName(object): Returns the common name of the organism, e.g. "Human", "Mouse", "Worm", etc.

provider(x): Returns the provider of this genome, e.g. "UCSC", "BDGP", "FlyBase", etc.

providerVersion(x): Returns the provider-side version of this genome. For example, UCSC uses versions "hg18", "hg17", etc. for the different builds of the human genome.

releaseDate(x): Returns the release date of this genome, e.g. "Mar. 2006".

releaseName(x): Returns the release name of this genome, which generally consists of the name of the organization that assembled it plus its build version. For example, UCSC uses "hg18" for the version of the human genome corresponding to Build 36.1 from NCBI; hence the release name for this genome is "NCBI Build 36.1".

bsgenomeName(x): Uses the metadata stored in x to construct the name of the corresponding BSgenome data package (see the available.genomes function in the BSgenome package for details about the naming scheme used for those packages). There is no guarantee that a package with that name actually exists.

seqinfo(x): Gets information about the genome sequences. This information is returned in a Seqinfo object. Each part of the information can be retrieved separately with seqnames(x), seqlengths(x), and isCircular(x), respectively, as described below.

seqnames(x): Gets the names of the genome sequences. seqnames(x) is equivalent to seqnames(seqinfo(x)).

seqlengths(x): Gets the lengths of the genome sequences. seqlengths(x) is equivalent to seqlengths(seqinfo(x)).

isCircular(x): Returns the circularity flags of the genome sequences. isCircular(x) is equivalent to isCircular(seqinfo(x)).
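
Judging from the example output on this page ("BSgenome.Celegans.UCSC.ce2" for organism "Caenorhabditis elegans", provider "UCSC", and provider version "ce2"), the name returned by bsgenomeName(x) appears to join four dot-separated parts. The base-R sketch below reproduces that pattern. make_bsgenome_name is a hypothetical helper written for illustration, not the package's implementation, and the abbreviation rule (first letter of the genus followed by the species name) is an assumption inferred from that single example.

```r
# Hypothetical helper (not part of GenomeInfoDb or BSgenome).
make_bsgenome_name <- function(organism, provider, provider_version) {
    words <- strsplit(organism, " ", fixed = TRUE)[[1L]]
    # Assumed abbreviation: first letter of the genus + species name.
    abbrev <- paste0(substr(words[1L], 1L, 1L), words[2L])
    paste("BSgenome", abbrev, provider, provider_version, sep = ".")
}

pkg <- make_bsgenome_name("Caenorhabditis elegans", "UCSC", "ce2")
print(pkg)
```

As with bsgenomeName(x) itself, a name built this way is only a candidate: nothing checks that a data package with that name has actually been published.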

Author(s)

H. Pagès

See Also

  • The available.genomes function and the BSgenome class in the BSgenome package.

  • The Seqinfo class.

Examples

library(BSgenome.Celegans.UCSC.ce2)

## A BSgenome object is also a GenomeDescription object:
class(Celegans)
is(Celegans, "GenomeDescription")
provider(Celegans)
seqinfo(Celegans)

## Coerce to a bare GenomeDescription object:
gendesc <- as(Celegans, "GenomeDescription")
class(gendesc)
gendesc
provider(gendesc)
seqinfo(gendesc)
bsgenomeName(gendesc)

Results


> library(GenomeInfoDb)
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
> ### Name: GenomeDescription-class
> ### Title: GenomeDescription objects
> ### Aliases: class:GenomeDescription GenomeDescription-class
> ###   GenomeDescription organism organism,GenomeDescription-method
> ###   commonName commonName,GenomeDescription-method species
> ###   species,GenomeDescription-method provider
> ###   provider,GenomeDescription-method providerVersion
> ###   providerVersion,GenomeDescription-method releaseDate
> ###   releaseDate,GenomeDescription-method releaseName
> ###   releaseName,GenomeDescription-method bsgenomeName
> ###   bsgenomeName,GenomeDescription-method
> ###   seqinfo,GenomeDescription-method seqnames,GenomeDescription-method
> ###   show,GenomeDescription-method
> ### Keywords: methods classes
> 
> ### ** Examples
> 
> library(BSgenome.Celegans.UCSC.ce2)
Loading required package: BSgenome
Loading required package: GenomicRanges
Loading required package: Biostrings
Loading required package: XVector
Loading required package: rtracklayer
> class(Celegans)
[1] "BSgenome"
attr(,"package")
[1] "BSgenome"
> is(Celegans, "GenomeDescription")
[1] TRUE
> provider(Celegans)
[1] "UCSC"
> seqinfo(Celegans)
Seqinfo object with 7 sequences (1 circular) from ce2 genome:
  seqnames seqlengths isCircular genome
  chrI       15080483      FALSE    ce2
  chrII      15279308      FALSE    ce2
  chrIII     13783313      FALSE    ce2
  chrIV      17493791      FALSE    ce2
  chrV       20922231      FALSE    ce2
  chrX       17718849      FALSE    ce2
  chrM          13794       TRUE    ce2
> gendesc <- as(Celegans, "GenomeDescription")
> class(gendesc)
[1] "GenomeDescription"
attr(,"package")
[1] "GenomeInfoDb"
> gendesc
| organism: Caenorhabditis elegans (Worm)
| provider: UCSC
| provider version: ce2
| release date: Mar. 2004
| release name: WormBase v. WS120
| ---
| seqlengths:
|      chrI    chrII   chrIII    chrIV     chrV     chrX     chrM
|  15080483 15279308 13783313 17493791 20922231 17718849    13794
> provider(gendesc)
[1] "UCSC"
> seqinfo(gendesc)
Seqinfo object with 7 sequences (1 circular) from ce2 genome:
  seqnames seqlengths isCircular genome
  chrI       15080483      FALSE    ce2
  chrII      15279308      FALSE    ce2
  chrIII     13783313      FALSE    ce2
  chrIV      17493791      FALSE    ce2
  chrV       20922231      FALSE    ce2
  chrX       17718849      FALSE    ce2
  chrM          13794       TRUE    ce2
> bsgenomeName(gendesc)
[1] "BSgenome.Celegans.UCSC.ce2"