Last data update: 2014.03.03

R: Coerce to file format structures
as-format-methods    R Documentation

Coerce to file format structures

Description

These functions coerce a TxDb object to a GRanges object with metadata columns encoding transcript structures according to the model of a standard file format. Currently, BED and GFF models are supported. If a TxDb is passed to export, when targeting a BED or GFF file, this coercion occurs automatically.

Usage

## S4 method for signature 'TxDb'
asBED(x)
## S4 method for signature 'TxDb'
asGFF(x)

Arguments

x

A TxDb object to coerce to a GRanges, structured as BED or GFF.

Value

For asBED, a GRanges with the columns name, thickStart, thickEnd, blockStarts, and blockSizes added. The ranges are the transcript bounds, and the transcript IDs are stored in the name column. The thick regions correspond to the CDS regions, and the blocks represent the exons.
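As a rough illustration of how blocks relate to exons, the base R sketch below derives BED-style block offsets and sizes for a single transcript. The exon coordinates are back-calculated from the fourth transcript in the sample output on this page (chr1, 153330330 to 153333503). The variable names are illustrative and not part of the package API. The 0-based offsets follow the BED file convention; this is not a description of how asBED computes its columns internally.

```r
# Illustrative exon coordinates (1-based, inclusive) for one transcript,
# back-calculated from the sample output; not read from the TxDb itself.
tx_start   <- 153330330
exon_start <- c(153330330, 153330745, 153333120)
exon_end   <- c(153330357, 153330909, 153333503)

# BED-style blocks: offsets are relative to the transcript start (0-based),
# sizes are the exon widths.
block_starts <- exon_start - tx_start
block_sizes  <- exon_end - exon_start + 1

data.frame(block_starts, block_sizes)
```

The last block ends at the transcript end, which is what the BED format expects of a well-formed record.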

For asGFF, a GRanges with the columns type, Name, ID, and Parent. The gene structures are expressed according to the conventions defined by the GFF3 spec. There are elements for each type of feature: “gene”, “mRNA”, “exon”, and “cds”. The Name column contains the gene_id for genes and the tx_name for transcripts; it is NA for exons and cds regions. The ID column uses gene_id and tx_id, with the prefixes “GeneID” and “TxID” to ensure uniqueness across types; exons and cds regions have NA for ID. The Parent column contains the IDs of the parent features. A feature may have multiple parents (the column is a CharacterList). Each exon belongs to one or more mRNAs, and mRNAs belong to a gene.
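The ID and Parent scheme can be mimicked in base R with a toy table. This is only a sketch: the gene and transcript identifiers are hypothetical, a plain list column stands in for the CharacterList, and the ":" separator is taken from values such as GeneID:1 and TxID:175 in the sample output on this page.

```r
# Hypothetical identifiers, for illustration only
gene_id <- "10186"
tx_id   <- c(5L, 6L)

# Prefixes keep gene and transcript IDs distinct even if the raw ids collide
gene_ID <- paste0("GeneID:", gene_id)
tx_ID   <- paste0("TxID:", tx_id)

features <- data.frame(
  type = c("gene", "mRNA", "mRNA", "exon"),
  ID   = c(gene_ID, tx_ID, NA),   # exons carry no ID
  stringsAsFactors = FALSE
)

# Parent is list-valued because a feature may have several parents:
# the gene has none, each mRNA points to the gene, and the exon is
# shared by both transcripts.
features$Parent <- list(character(0), gene_ID, gene_ID, tx_ID)

features
```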

Author(s)

Michael Lawrence

Examples

  library(GenomicFeatures)  # also attaches AnnotationDbi, which provides loadDb()

  txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                           package="GenomicFeatures")
  txdb <- loadDb(txdb_file)

  asBED(txdb)
  asGFF(txdb)
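A slightly longer sketch of how the coerced objects might be inspected and written to disk. It assumes that export is the rtracklayer generic and that the output format is inferred from the file extension; the temporary file path is illustrative. The metadata column names depend on the installed package version, so none are asserted here.

```r
library(GenomicFeatures)  # TxDb methods for asBED() and asGFF()
library(rtracklayer)      # assumed to provide the export() generic

txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                         package = "GenomicFeatures")
txdb <- loadDb(txdb_file)

# Inspect the metadata columns added by the BED coercion
bed <- asBED(txdb)
names(mcols(bed))

# Count the features of each type in the GFF coercion
gff <- asGFF(txdb)
table(gff$type)

# Passing the TxDb directly to export(); per the Description, the
# coercion to the BED model should happen automatically.
bed_path <- tempfile(fileext = ".bed")
export(txdb, bed_path)
```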

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"

> library(GenomicFeatures)
> ### Name: as-format-methods
> ### Title: Coerce to file format structures
> ### Aliases: asBED,TxDb-method asGFF,TxDb-method
> 
> ### ** Examples
> 
>   txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
+                            package="GenomicFeatures")
>   txdb <- loadDb(txdb_file)
> 
>   asBED(txdb)
GRanges object with 152 ranges and 3 metadata columns:
              seqnames                 ranges strand |        name
                 <Rle>              <IRanges>  <Rle> | <character>
    [1]           chr1 [ 32671236,  32674288]      + |           1
    [2]           chr1 [ 32671236,  32674288]      + |           2
    [3]           chr1 [ 32671236,  32674288]      + |           3
    [4]           chr1 [153330330, 153333503]      + |           4
    [5]           chr1 [155715559, 155717687]      + |           5
    ...            ...                    ...    ... .         ...
  [148] chr6_ssto_hap7     [ 704942,  705974]      + |         174
  [149] chr6_ssto_hap7     [3270298, 3272742]      - |         175
  [150] chr6_ssto_hap7     [3270298, 3272742]      - |         176
  [151] chr6_ssto_hap7     [3270929, 3272742]      - |         177
  [152] chr6_ssto_hap7     [3271844, 3272742]      - |         178
                                            blocks                  thick
                                     <IRangesList>              <IRanges>
    [1]    [  1,   89] [520,  663] [875, 1127] ... [ 32671283,  32673683]
    [2]    [  1,  329] [520,  663] [875, 1127] ... [ 32671283,  32673683]
    [3] [   1,   89] [ 520, 1127] [1368, 1486] ... [ 32672224,  32673683]
    [4]     [   1,   28] [ 416,  580] [2791, 3174] [153330760, 153333314]
    [5]       [  1, 145] [238, 373] [852, 921] ... [155715620, 155717687]
    ...                                        ...                    ...
  [148]                                  [1, 1033]     [ 705003,  705926]
  [149]       [  1, 214] [299, 393] [533, 668] ...     [3270364, 3272162]
  [150]       [  1, 214] [299, 393] [533, 668] ...     [3270364, 3271127]
  [151]                  [   1, 1240] [1488, 1814]     [3271803, 3272162]
  [152]                      [  1, 369] [573, 899]     [3271845, 3272162]
  -------
  seqinfo: 93 sequences (1 circular) from hg19 genome
Warning message:
Using togroup() on a GRangesList object is deprecated. Please use
  togroup(PartitioningByWidth(...)) instead. 
>   asGFF(txdb)
GRanges object with 1375 ranges and 4 metadata columns:
               seqnames                 ranges strand |          Parent
                  <Rle>              <IRanges>  <Rle> | <CharacterList>
     [1]          chr19 [ 58858172,  58874214]      - |                
     [2]           chr1 [155715559, 155720673]      + |                
     [3]           chr6 [ 10412551,  10416402]      + |                
     [4]           chr8 [128808208, 128808274]      + |                
     [5]          chr13 [ 39917029,  40177356]      - |                
     ...            ...                    ...    ... .             ...
  [1371] chr6_ssto_hap7     [3271093, 3271312]      - |        TxID:175
  [1372] chr6_ssto_hap7     [3271399, 3271634]      - |        TxID:175
  [1373] chr6_ssto_hap7     [3271803, 3272162]      - |        TxID:177
  [1374] chr6_ssto_hap7     [3271807, 3272162]      - |        TxID:175
  [1375] chr6_ssto_hap7     [3271845, 3272162]      - |        TxID:178
                       ID        Name        type
              <character> <character> <character>
     [1]         GeneID:1           1        gene
     [2] GeneID:100129405   100129405        gene
     [3] GeneID:100130275   100130275        gene
     [4] GeneID:100302185   100302185        gene
     [5]     GeneID:10186       10186        gene
     ...              ...         ...         ...
  [1371]             <NA>        <NA>         CDS
  [1372]             <NA>        <NA>         CDS
  [1373]             <NA>        <NA>         CDS
  [1374]             <NA>        <NA>         CDS
  [1375]             <NA>        <NA>         CDS
  -------
  seqinfo: 93 sequences (1 circular) from hg19 genome