R Graphical Manual

Last data update: 2014.03.03

makeOrganismDbFromUCSC

R Documentation

Make an OrganismDb object from annotations available at the UCSC Genome Browser

Description

The makeOrganismDbFromUCSC function makes an OrganismDb object from transcript annotations available at the UCSC Genome Browser.

Usage

makeOrganismDbFromUCSC(
        genome="hg19",
        tablename="knownGene",
        transcript_ids=NULL,
        circ_seqs=DEFAULT_CIRC_SEQS,
        url="http://genome.ucsc.edu/cgi-bin/",
        goldenPath_url="http://hgdownload.cse.ucsc.edu/goldenPath",
        miRBaseBuild=NA)

Arguments

`genome`	genome abbreviation used by UCSC and obtained by `ucscGenomes()[ , "db"]`. For example: `"hg19"`.
`tablename`	name of the UCSC table containing the transcript annotations to retrieve. Use the `supportedUCSCtables` utility function to get the list of supported tables. Note that not all tables are available for all genomes.
`transcript_ids`	optionally, only retrieve transcript annotation data for the specified set of transcript ids. If this is used, then the meta information displayed for the resulting OrganismDb object will say 'Full dataset: no'. Otherwise it will say 'Full dataset: yes'.
`circ_seqs`	a character vector listing the chromosomes that should be marked as circular.
`url`, `goldenPath_url`	URLs that specify the location of an alternate UCSC Genome Browser.
`miRBaseBuild`	the string naming the build information from mirbase.db to use for microRNAs. Supported values are listed by `supportedMiRBaseBuildValues()`. The default, `NA`, deactivates the `microRNAs` accessor.
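
The lookups mentioned for `genome` and `tablename` can be combined into a check that runs before any annotation data is requested. The sketch below uses a hypothetical helper, `check_ucsc_args`, which is not part of OrganismDbi. It assumes the genome vector comes from `ucscGenomes()[ , "db"]` and that table names are the row names of the data frame returned by `supportedUCSCtables()`, as in the output shown under Results; other package versions may lay that data frame out differently. Small stand-in values are used so the sketch runs offline.

```r
# Hypothetical helper (not part of OrganismDbi): validate arguments
# against values the caller has already looked up.
#   genomes: character vector, e.g. ucscGenomes()[ , "db"]
#   tables:  data frame whose row names are table names,
#            e.g. supportedUCSCtables() in the version shown here
check_ucsc_args <- function(genome, tablename, genomes, tables) {
    problems <- character(0)
    if (!(genome %in% genomes))
        problems <- c(problems, paste0("unknown genome: ", genome))
    if (!(tablename %in% rownames(tables)))
        problems <- c(problems, paste0("unsupported table: ", tablename))
    problems
}

# Stand-ins for the real lookups (assumed values, for illustration only):
genomes <- c("hg19", "mm9", "sacCer2")
tables <- data.frame(track=c("UCSC Genes", "Ensembl Genes"),
                     row.names=c("knownGene", "ensGene"))

check_ucsc_args("mm9", "knownGene", genomes, tables)  # nothing to report
check_ucsc_args("mm9", "refGene", genomes, tables)    # table not in stand-in list
```

An empty result only means both names are known. Because not all tables are available for all genomes, the call to makeOrganismDbFromUCSC can still fail for a particular combination.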

Details

makeOrganismDbFromUCSC is a convenience function that retrieves transcript annotations from UCSC (through makeTxDbFromUCSC) and feeds them to the lower-level OrganismDb function. See ?makeOrganismDbFromBiomart for a similar function that feeds data from a BioMart database.

Value

An OrganismDb object.

Author(s)

M. Carlson and H. Pages

Examples

## Display the list of genomes available at UCSC:
library(rtracklayer)
ucscGenomes()[ , "db"]

## Display the list of tables supported by makeOrganismDbFromUCSC():
supportedUCSCtables()

## Not run: 
## Retrieving a full transcript dataset for Yeast from UCSC:
odb1 <- makeOrganismDbFromUCSC(genome="sacCer2", tablename="ensGene")

## End(Not run)

## Retrieving an incomplete transcript dataset for Mouse from UCSC
## (only transcripts linked to Entrez Gene ID 22290):
transcript_ids <- c(
    "uc009uzf.1",
    "uc009uzg.1",
    "uc009uzh.1",
    "uc009uzi.1",
    "uc009uzj.1"
)

odb2 <- makeOrganismDbFromUCSC(genome="mm9", tablename="knownGene",
                               transcript_ids=transcript_ids)
odb2

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(OrganismDbi)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: GenomicFeatures
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
> ### Name: makeOrganismDbFromUCSC
> ### Title: Make a OrganismDb object from annotations available at the UCSC
> ###   Genome Browser
> ### Aliases: makeOrganismDbFromUCSC
> 
> ### ** Examples
> 
> ## Display the list of genomes available at UCSC:
> library(rtracklayer)
> ucscGenomes()[ , "db"]
  [1] "hg38"     "hg19"     "hg18"     "hg17"     "hg16"     "vicPac2" 
  [7] "vicPac1"  "dasNov3"  "papHam1"  "panPan1"  "aptMan1"  "otoGar3" 
 [13] "papAnu2"  "felCat8"  "felCat5"  "felCat4"  "felCat3"  "panTro4" 
 [19] "panTro3"  "panTro2"  "panTro1"  "criGri1"  "bosTau8"  "bosTau7" 
 [25] "bosTau6"  "bosTau4"  "bosTau3"  "bosTau2"  "macFas5"  "canFam3" 
 [31] "canFam2"  "canFam1"  "turTru2"  "loxAfr3"  "musFur1"  "nomLeu3" 
 [37] "nomLeu2"  "nomLeu1"  "gorGor4"  "gorGor3"  "cavPor3"  "eriEur2" 
 [43] "eriEur1"  "equCab2"  "equCab1"  "dipOrd1"  "triMan1"  "calJac3" 
 [49] "calJac1"  "pteVam1"  "myoLuc2"  "balAcu1"  "mm10"     "mm9"     
 [55] "mm8"      "mm7"      "micMur2"  "micMur1"  "hetGla2"  "hetGla1" 
 [61] "monDom5"  "monDom4"  "monDom1"  "ponAbe2"  "ailMel1"  "susScr3" 
 [67] "susScr2"  "ochPri3"  "ochPri2"  "ornAna2"  "ornAna1"  "oryCun2" 
 [73] "rn6"      "rn5"      "rn4"      "rn3"      "rheMac8"  "rheMac3" 
 [79] "rheMac2"  "proCap1"  "oviAri3"  "oviAri1"  "sorAra2"  "sorAra1" 
 [85] "choHof1"  "speTri2"  "saiBol1"  "tarSyr2"  "tarSyr1"  "sarHar1" 
 [91] "echTel2"  "echTel1"  "tupBel1"  "macEug2"  "cerSim1"  "allMis1" 
 [97] "gadMor1"  "melUnd1"  "galGal4"  "galGal3"  "galGal2"  "latCha1" 
[103] "calMil1"  "fr3"      "fr2"      "fr1"      "petMar2"  "petMar1" 
[109] "anoCar2"  "anoCar1"  "oryLat2"  "geoFor1"  "oreNil2"  "chrPic1" 
[115] "gasAcu1"  "tetNig2"  "tetNig1"  "melGal1"  "xenTro7"  "xenTro3" 
[121] "xenTro2"  "xenTro1"  "taeGut2"  "taeGut1"  "danRer10" "danRer7" 
[127] "danRer6"  "danRer5"  "danRer4"  "danRer3"  "ci2"      "ci1"     
[133] "braFlo1"  "strPur2"  "strPur1"  "apiMel2"  "apiMel1"  "anoGam1" 
[139] "droAna2"  "droAna1"  "droEre1"  "droGri1"  "dm6"      "dm3"     
[145] "dm2"      "dm1"      "droMoj2"  "droMoj1"  "droPer1"  "dp3"     
[151] "dp2"      "droSec1"  "droSim1"  "droVir2"  "droVir1"  "droYak2" 
[157] "droYak1"  "caePb2"   "caePb1"   "cb3"      "cb1"      "ce11"    
[163] "ce10"     "ce6"      "ce4"      "ce2"      "caeJap1"  "caeRem3" 
[169] "caeRem2"  "priPac1"  "aplCal1"  "sacCer3"  "sacCer2"  "sacCer1" 
[175] "eboVir3" 
> 
> ## Display the list of tables supported by makeOrganismDbFromUCSC():
> supportedUCSCtables()
                                               track           subtrack
knownGene                                 UCSC Genes               <NA>
knownGeneOld3                         Old UCSC Genes               <NA>
ccdsGene                                        CCDS               <NA>
refGene                                 RefSeq Genes               <NA>
xenoRefGene                             Other RefSeq               <NA>
vegaGene                                  Vega Genes Vega Protein Genes
vegaPseudoGene                            Vega Genes   Vega Pseudogenes
ensGene                                Ensembl Genes               <NA>
acembly                                AceView Genes               <NA>
sibGene                                    SIB Genes               <NA>
nscanPasaGene                                 N-SCAN    N-SCAN PASA-EST
nscanGene                                     N-SCAN             N-SCAN
sgpGene                                    SGP Genes               <NA>
geneid                                  Geneid Genes               <NA>
genscan                                Genscan Genes               <NA>
exoniphy                                    Exoniphy               <NA>
augustusHints                               Augustus     Augustus Hints
augustusXRA                                 Augustus   Augustus De Novo
augustusAbinitio                            Augustus Augustus Ab Initio
acescan                                      ACEScan               <NA>
lincRNAsTranscripts              lincRNAsTranscripts               <NA>
wgEncodeGencodeManualV3                Gencode Genes     Gencode Manual
wgEncodeGencodeAutoV3                  Gencode Genes       Gencode Auto
wgEncodeGencodePolyaV3                 Gencode Genes      Gencode PolyA
wgEncodeGencodeBasicV19            GENCODE Genes V19               <NA>
wgEncodeGencodeCompV19             GENCODE Genes V19               <NA>
wgEncodeGencodePseudoGeneV19       GENCODE Genes V19               <NA>
wgEncodeGencode2wayConsPseudoV19   GENCODE Genes V19               <NA>
wgEncodeGencodePolyaV19            GENCODE Genes V19               <NA>
wgEncodeGencodeBasicV17            GENCODE Genes V17               <NA>
wgEncodeGencodeCompV17             GENCODE Genes V17               <NA>
wgEncodeGencodePseudoGeneV17       GENCODE Genes V17               <NA>
wgEncodeGencode2wayConsPseudoV17   GENCODE Genes V17               <NA>
wgEncodeGencodePolyaV17            GENCODE Genes V17               <NA>
wgEncodeGencodeBasicV14            GENCODE Genes V14               <NA>
wgEncodeGencodeCompV14             GENCODE Genes V14               <NA>
wgEncodeGencodePseudoGeneV14       GENCODE Genes V14               <NA>
wgEncodeGencode2wayConsPseudoV14   GENCODE Genes V14               <NA>
wgEncodeGencodePolyaV14            GENCODE Genes V14               <NA>
wgEncodeGencodeBasicV7              GENCODE Genes V7               <NA>
wgEncodeGencodeCompV7               GENCODE Genes V7               <NA>
wgEncodeGencodePseudoGeneV7         GENCODE Genes V7               <NA>
wgEncodeGencode2wayConsPseudoV7     GENCODE Genes V7               <NA>
wgEncodeGencodePolyaV7              GENCODE Genes V7               <NA>
flyBaseGene                            FlyBase Genes               <NA>
sgdGene                                    SGD Genes               <NA>
> 
> ## Not run: 
> ##D ## Retrieving a full transcript dataset for Yeast from UCSC:
> ##D odb1 <- makeOrganismDbFromUCSC(genome="sacCer2", tablename="ensGene")
> ## End(Not run)
> 
> ## Retrieving an incomplete transcript dataset for Mouse from UCSC
> ## (only transcripts linked to Entrez Gene ID 22290):
> transcript_ids <- c(
+     "uc009uzf.1",
+     "uc009uzg.1",
+     "uc009uzh.1",
+     "uc009uzi.1",
+     "uc009uzj.1"
+ )
> 
> odb2 <- makeOrganismDbFromUCSC(genome="mm9", tablename="knownGene",
+                           transcript_ids=transcript_ids)
Error in `genome<-`(`*tmp*`, value = "mm9") : 
  Failed to set session genome to 'mm9'
Calls: makeOrganismDbFromUCSC -> makeTxDbFromUCSC -> genome<- -> genome<-
Execution halted
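
The run above halts at the `odb2` call with "Failed to set session genome to 'mm9'", an error signalled inside makeTxDbFromUCSC while the genome for the UCSC session is being set. The log does not state the cause. In a script that should continue past such a failure, the call can be wrapped in `tryCatch`. A minimal sketch follows; `try_make_db` is a hypothetical helper, not part of OrganismDbi, and `failing_constructor` is a stand-in that only imitates the error message so the sketch runs without network access.

```r
# Hypothetical wrapper: call a constructor and return NULL instead of
# halting when it signals an error.
try_make_db <- function(constructor, ...) {
    tryCatch(
        constructor(...),
        error = function(e) {
            message("Could not build the object: ", conditionMessage(e))
            NULL
        }
    )
}

# Stand-in that imitates the failure shown in the log. With the real
# package loaded, the call would instead be (assumption):
#   try_make_db(makeOrganismDbFromUCSC, genome="mm9", tablename="knownGene")
failing_constructor <- function(genome, ...) {
    stop("Failed to set session genome to '", genome, "'")
}

odb <- try_make_db(failing_constructor, genome="mm9", tablename="knownGene")
is.null(odb)
```

Returning `NULL` keeps the sketch short; a caller that needs to tell different failures apart could return the condition object instead.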