Last data update: 2014.03.03
R: Coerce to file format structures
as-format-methods R Documentation
Coerce to file format structures
Description
These functions coerce a TxDb object to a GRanges object with
metadata columns encoding transcript structures according to the
model of a standard file format. Currently, BED and GFF models are
supported. If a TxDb is passed to export when targeting a BED or GFF
file, this coercion occurs automatically.
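The automatic coercion on export can be sketched as follows. This is a minimal illustration, not part of the original help page: it assumes the rtracklayer package is installed (it provides export()) and that the sample database ships with the installed GenomicFeatures version. The output paths are temporary files chosen for illustration.

```r
library(GenomicFeatures)  # TxDb methods for asBED()/asGFF()
library(AnnotationDbi)    # loadDb()
library(rtracklayer)      # export() (assumed to be installed)

txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                         package = "GenomicFeatures")
txdb <- loadDb(txdb_file)

# Illustrative output paths; the file extension selects the target format
bed_file <- tempfile(fileext = ".bed")
gff_file <- tempfile(fileext = ".gff3")

# No explicit asBED()/asGFF() call: export() is expected to coerce the TxDb
export(txdb, bed_file)
export(txdb, gff_file)
```

Calling asBED() or asGFF() directly is only needed when the intermediate GRanges is to be inspected or modified before writing.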
Usage
## S4 method for signature 'TxDb'
asBED(x)
## S4 method for signature 'TxDb'
asGFF(x)
Arguments
x
A TxDb object to coerce to a GRanges, structured as BED or GFF.
Value
For asBED, a GRanges with the columns name, thickStart, thickEnd,
blockStarts, and blockSizes added. The thick regions correspond to the
CDS regions, and the blocks represent the exons. The transcript IDs are
stored in the name column. The ranges are the transcript bounds. In the
Results output below, the thick and block information appears as the
thick and blocks columns.
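To make the block encoding concrete, the following base-R sketch derives BED-style block fields from the exon coordinates of a single transcript. The coordinates and variable names are made up for illustration. The sketch follows the BED file convention (block starts are 0-based offsets from the transcript start) and does not reproduce the internals of asBED.

```r
# Hypothetical 1-based, inclusive exon coordinates for one transcript
exon_start <- c(1000L, 1500L, 1900L)
exon_end   <- c(1088L, 1643L, 2152L)

# The transcript bounds are the range spanned by all exons
tx_start <- min(exon_start)
tx_end   <- max(exon_end)

# BED block fields: sizes in bases, starts as 0-based offsets from tx_start
block_sizes  <- exon_end - exon_start + 1L
block_starts <- exon_start - tx_start

block_sizes   # 89 144 253
block_starts  # 0 500 900
```

The thick region is handled analogously: it is the span from the first CDS base to the last, stored alongside the transcript bounds.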
For asGFF, a GRanges with the columns type, Name, ID, and Parent. The
gene structures are expressed according to the conventions defined by
the GFF3 spec. There are elements of each type of feature: “gene”,
“mRNA”, “exon”, and “CDS”. The Name column contains the gene_id for
genes and the tx_name for transcripts, and is NA for exons and CDS
regions. The ID column uses gene_id and tx_id, with the prefixes
“GeneID” and “TxID” to ensure uniqueness across types. The exons and
CDS regions have NA for ID. The Parent column contains the IDs of the
parent features. A feature may have multiple parents (the column is a
CharacterList). Each exon belongs to one or more mRNAs, and mRNAs
belong to a gene.
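The ID and Parent scheme described above can be illustrated with a small base-R sketch. All identifiers here are hypothetical, and the code only mimics the prefixing visible in the Results output (for example "GeneID:1" and "TxID:175"); it is not the implementation used by asGFF.

```r
# Hypothetical identifiers: gene "1" and transcript 1 would collide unprefixed
gene_id <- c("1", "10186")
tx_id   <- c(1L, 2L, 3L)
tx_gene <- c("1", "1", "10186")  # gene that each transcript belongs to

gene_ID   <- paste0("GeneID:", gene_id)
tx_ID     <- paste0("TxID:", tx_id)
tx_Parent <- paste0("GeneID:", tx_gene)  # each mRNA points at its gene

# An exon shared by transcripts 1 and 2 lists both as parents
exon_Parent <- list(c("TxID:1", "TxID:2"), "TxID:3")

anyDuplicated(c(gene_id, tx_id)) > 0  # TRUE: raw IDs clash across types
anyDuplicated(c(gene_ID, tx_ID)) > 0  # FALSE: prefixed IDs are unique
```

A plain list stands in for the CharacterList here so that the sketch runs without Bioconductor packages.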
Author(s)
Michael Lawrence
Examples
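library(GenomicFeatures)
## Locate the sample TxDb shipped with the package and load it
txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                         package="GenomicFeatures")
txdb <- loadDb(txdb_file)
## One range per transcript, with thick (CDS) and block (exon) columns
asBED(txdb)
## One range per gene, mRNA, exon and CDS feature
asGFF(txdb)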
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(GenomicFeatures)
> ### Name: as-format-methods
> ### Title: Coerce to file format structures
> ### Aliases: asBED,TxDb-method asGFF,TxDb-method
>
> ### ** Examples
>
> txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
+ package="GenomicFeatures")
> txdb <- loadDb(txdb_file)
>
> asBED(txdb)
GRanges object with 152 ranges and 3 metadata columns:
            seqnames                 ranges strand |        name
               <Rle>              <IRanges>  <Rle> | <character>
  [1]           chr1  [ 32671236, 32674288]      + |           1
  [2]           chr1  [ 32671236, 32674288]      + |           2
  [3]           chr1  [ 32671236, 32674288]      + |           3
  [4]           chr1 [153330330, 153333503]      + |           4
  [5]           chr1 [155715559, 155717687]      + |           5
  ...            ...                    ...    ... .         ...
[148] chr6_ssto_hap7      [ 704942, 705974]      + |         174
[149] chr6_ssto_hap7     [3270298, 3272742]      - |         175
[150] chr6_ssto_hap7     [3270298, 3272742]      - |         176
[151] chr6_ssto_hap7     [3270929, 3272742]      - |         177
[152] chr6_ssto_hap7     [3271844, 3272742]      - |         178
                                      blocks                  thick
                               <IRangesList>              <IRanges>
  [1]    [ 1, 89] [520, 663] [875, 1127] ...  [ 32671283, 32673683]
  [2]   [ 1, 329] [520, 663] [875, 1127] ...  [ 32671283, 32673683]
  [3] [ 1, 89] [ 520, 1127] [1368, 1486] ...  [ 32672224, 32673683]
  [4]      [ 1, 28] [ 416, 580] [2791, 3174] [153330760, 153333314]
  [5]    [ 1, 145] [238, 373] [852, 921] ... [155715620, 155717687]
  ...                                    ...                    ...
[148]                              [1, 1033]      [ 705003, 705926]
[149]    [ 1, 214] [299, 393] [533, 668] ...     [3270364, 3272162]
[150]    [ 1, 214] [299, 393] [533, 668] ...     [3270364, 3271127]
[151]                [ 1, 1240] [1488, 1814]     [3271803, 3272162]
[152]                   [ 1, 369] [573, 899]     [3271845, 3272162]
  -------
  seqinfo: 93 sequences (1 circular) from hg19 genome
Warning message:
Using togroup() on a GRangesList object is deprecated. Please use
togroup(PartitioningByWidth(...)) instead.
> asGFF(txdb)
GRanges object with 1375 ranges and 4 metadata columns:
             seqnames                 ranges strand |          Parent
                <Rle>              <IRanges>  <Rle> | <CharacterList>
   [1]          chr19  [ 58858172, 58874214]      - |
   [2]           chr1 [155715559, 155720673]      + |
   [3]           chr6  [ 10412551, 10416402]      + |
   [4]           chr8 [128808208, 128808274]      + |
   [5]          chr13  [ 39917029, 40177356]      - |
   ...            ...                    ...    ... .             ...
[1371] chr6_ssto_hap7     [3271093, 3271312]      - |        TxID:175
[1372] chr6_ssto_hap7     [3271399, 3271634]      - |        TxID:175
[1373] chr6_ssto_hap7     [3271803, 3272162]      - |        TxID:177
[1374] chr6_ssto_hap7     [3271807, 3272162]      - |        TxID:175
[1375] chr6_ssto_hap7     [3271845, 3272162]      - |        TxID:178
                     ID        Name        type
            <character> <character> <character>
   [1]         GeneID:1           1        gene
   [2] GeneID:100129405   100129405        gene
   [3] GeneID:100130275   100130275        gene
   [4] GeneID:100302185   100302185        gene
   [5]     GeneID:10186       10186        gene
   ...              ...         ...         ...
[1371]             <NA>        <NA>         CDS
[1372]             <NA>        <NA>         CDS
[1373]             <NA>        <NA>         CDS
[1374]             <NA>        <NA>         CDS
[1375]             <NA>        <NA>         CDS
  -------
  seqinfo: 93 sequences (1 circular) from hg19 genome