The SplicingGraphs class is a container for storing splicing graphs
together with the gene model that they are based on.
Usage
## Constructor:
SplicingGraphs(x, grouping=NULL, min.ntx=2, max.ntx=NA, check.introns=TRUE)
## SplicingGraphs basic API:
## S4 method for signature 'SplicingGraphs'
length(x)
## S4 method for signature 'SplicingGraphs'
names(x)
## S4 method for signature 'SplicingGraphs'
seqnames(x)
## S4 method for signature 'SplicingGraphs'
strand(x)
## S4 method for signature 'SplicingGraphs,ANY,ANY'
x[i, j, ... , drop=TRUE]
## S4 method for signature 'SplicingGraphs,ANY,ANY'
x[[i, j, ...]]
## S4 method for signature 'SplicingGraphs'
elementNROWS(x)
## S4 method for signature 'SplicingGraphs'
unlist(x, recursive=TRUE, use.names=TRUE)
## S4 method for signature 'SplicingGraphs'
seqinfo(x)
Arguments
x
For SplicingGraphs: A GRangesList object
containing the exons of one or more genes grouped by transcript.
Alternatively, x can be a TxDb
object. See Details section below.
For the methods in the SplicingGraphs basic API: A SplicingGraphs object.
grouping
An optional object that represents the grouping by gene of the top-level
elements (i.e. the transcripts) in x. See Details section
below.
min.ntx, max.ntx
Single integers (or NA for max.ntx) specifying the minimum
and maximum number of transcripts a gene must have to be considered for
inclusion in the object returned by SplicingGraphs. A value of
NA for max.ntx means no maximum.
check.introns
If TRUE, SplicingGraphs checks that, within each transcript,
exons are ordered from 5' to 3' with gaps of at least 1 nucleotide between
them.
i, j, ..., drop
A SplicingGraphs object is a list-like object and therefore it can be
subsetted like a list. When subsetting with [, the result is
another SplicingGraphs object containing only the selected genes.
When subsetting with [[, the result is an unnamed
GRangesList object containing the exons grouped
by transcript.
Like for list, subsetting only accepts 1 argument (i). The
drop argument is ignored and trying to pass any additional
argument (to j or in ...) will raise an error.
recursive, use.names
A SplicingGraphs object is a list-like object and therefore it can be
unlisted with unlist. The result is a
GRangesList object containing the exons
grouped by transcript. By default this object has names on it, and
the names are the gene ids. Note that because each element in this
object represents a transcript (and not a gene), the names are not
unique. If use.names=FALSE is used, the result has no names on it.
The recursive agument is ignored.
Details
The Splicing graph theory only applies to genes that have all the exons
of all their transcripts on the same chromosome and strand. In particular,
in its current form, the splicing graph theory cannot describe
trans-splicing events. The SplicingGraphs constructor will reject
genes that do not satisfy this.
The first argument of the SplicingGraphs constructor, x,
can be either a GRangesList object or a
TxDb object.
When x is a GRangesList object, it must
contain the exons of one or more genes grouped by transcripts. More
precisely, each top-level element in x must contain the genomic
ranges of the exons for a particular transcript. Typically x
will be obtained from a TxDb object
txdb with exonsBy(txdb, by="tx",
use.names=TRUE).
grouping is an optional argument that is only supported
when x is a GRangesList object.
It represents the grouping by gene of the top-level elements (i.e. the
transcripts) in GRangesList object x.
It can be either:
Missing (i.e. NULL). In that case, all the transcripts in
x are considered to belong to the same gene and the
SplicingGraphs object returned by SplicingGraphs will be
unnamed.
A list of integer or character vectors, or an
IntegerList, or a CharacterList
object, of length the number of genes to process, and where
grouping[[i]] is a vector of valid subscripts in
x pointing to all the transcripts of the i-th gene.
A factor, character vector, or integer vector, of the same length
as x and 1 level per gene.
A named GRangesList object containing
transcripts grouped by genes i.e. each top-level element in
grouping contains the genomic ranges of the transcripts
for a particular gene. In that case, the grouping is inferred from
the tx_id (or alternatively tx_name) metadata column of
unlist(grouping) and all the values in that column must
be in names(x).
If x was obtained with
exonsBy(txdb, by="tx", use.names=TRUE),
then the GRangesList object used for grouping
would typically be obtained with
transcriptsBy(txdb, by="gene").
A data.frame or DataFrame with 2 character vector
columns: a gene_id column (factor, character vector, or integer
vector), and a tx_id (or alternatively tx_name) column. In that
case, x must be named and all the values in the tx_id
(or tx_name) column must be in names(x).
Value
For SplicingGraphs: a SplicingGraphs object with 1 element per gene.
For length: the number of genes in x, which is also the number
of splicing graphs in x.
For names: the gene ids. Note that the names on a SplicingGraphs
object are always unique and cannot be modified.
For seqnames: a named factor of the length of x containing
the name of the chromosome for each gene.
For strand: a named factor of the length of x containing
the strand for each gene.
For elementNROWS: the number of transcripts per gene.
For seqinfo: the seqinfo of the GRangesList
or TxDb object that was used to construct
the SplicingGraphs object.
Author(s)
H. Pages
References
Heber, S., Alekseyev, M., Sze, S., Tang, H., and Pevzner, P. A.
Splicing graphs and EST assembly problem
Bioinformatics
Date: Jul 2002
Vol: 18
Pages: S181-S188
Sammeth, M. (2009)
Complete Alternative Splicing Events Are Bubbles in Splicing Graphs
J. Comput. Biol.
Date: Aug 2009
Vol: 16
Pages: 1117-1140
See Also
This man page is part of the SplicingGraphs package.
Please see ?`SplicingGraphs-package` for an overview of the
package and for an index of its man pages.
Other topics related to this man page and documented in other packages:
The exonsBy and
transcriptsBy functions, and
the TxDb class, defined in the
GenomicFeatures package.
The GRangesList class defined in the
GenomicRanges package.
The IntegerList and CharacterList
classes defined in the IRanges package.
The DataFrame class defined in the
S4Vectors package.
Examples
## ---------------------------------------------------------------------
## 1. Load a toy gene model as a TxDb object
## ---------------------------------------------------------------------
library(GenomicFeatures)
suppressWarnings(
toy_genes_txdb <- makeTxDbFromGFF(toy_genes_gff())
)
## ---------------------------------------------------------------------
## 2. Compute all the splicing graphs (1 graph per gene) and return them
## in a SplicingGraphs object
## ---------------------------------------------------------------------
## Extract the exons grouped by transcript:
ex_by_tx <- exonsBy(toy_genes_txdb, by="tx", use.names=TRUE)
## Extract the transcripts grouped by gene:
tx_by_gn <- transcriptsBy(toy_genes_txdb, by="gene")
sg <- SplicingGraphs(ex_by_tx, tx_by_gn)
sg
## Alternatively 'sg' can be constructed directly from the TxDb
## object:
sg2 <- SplicingGraphs(toy_genes_txdb) # same as 'sg'
sg2
## Note that because SplicingGraphs objects have a slot that is an
## environment (for caching the bubbles), they cannot be compared with
## 'identical()' (will always return FALSE). 'all.equal()' should be
## used instead:
stopifnot(isTRUE(all.equal(sg2, sg)))
## 'sg' has 1 element per gene and 'names(sg)' gives the gene ids:
length(sg)
names(sg)
## ---------------------------------------------------------------------
## 3. Basic manipulation of a SplicingGraphs object
## ---------------------------------------------------------------------
## Basic accessors:
seqnames(sg)
strand(sg)
seqinfo(sg)
## Number of transcripts per gene:
elementNROWS(sg)
## The transcripts of a given gene can be extracted with [[. The result
## is an *unnamed* GRangesList object containing the exons grouped by
## transcript:
sg[["geneD"]]
## See '?plotTranscripts' for how to plot those transcripts.
## The transcripts of all the genes can be extracted with unlist(). The
## result is a *named* GRangesList object containing the exons grouped
## by transcript. The names on the object are the gene ids:
ex_by_tx <- unlist(sg)
ex_by_tx
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(SplicingGraphs)
Loading required package: GenomicFeatures
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: GenomicAlignments
Loading required package: SummarizedExperiment
Loading required package: Biostrings
Loading required package: XVector
Loading required package: Rsamtools
Loading required package: Rgraphviz
Loading required package: graph
Attaching package: 'graph'
The following object is masked from 'package:Biostrings':
complement
Loading required package: grid
Attaching package: 'Rgraphviz'
The following objects are masked from 'package:IRanges':
from, to
The following objects are masked from 'package:S4Vectors':
from, to
Warning messages:
1: replacing previous import 'IRanges::from' by 'Rgraphviz::from' when loading 'SplicingGraphs'
2: replacing previous import 'IRanges::to' by 'Rgraphviz::to' when loading 'SplicingGraphs'
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/SplicingGraphs/SplicingGraphs-class.Rd_%03d_medium.png", width=480, height=480)
> ### Name: SplicingGraphs-class
> ### Title: SplicingGraphs objects
> ### Aliases: class:GeneModel GeneModel-class class:SplicingGraphs
> ### SplicingGraphs-class seqnames,GeneModel-method
> ### strand,GeneModel-method seqinfo,GeneModel-method
> ### isCircular<-,GeneModel-method length,SplicingGraphs-method
> ### names,SplicingGraphs-method [,SplicingGraphs,ANY,ANY-method
> ### [[,SplicingGraphs,ANY,ANY-method elementNROWS,SplicingGraphs-method
> ### unlist,SplicingGraphs-method seqnames,SplicingGraphs-method
> ### strand,SplicingGraphs-method seqinfo,SplicingGraphs-method
> ### isCircular<-,SplicingGraphs-method show,SplicingGraphs-method
> ### SplicingGraphs SplicingGraphs,GRangesList-method
> ### SplicingGraphs,TxDb-method
>
> ### ** Examples
>
> ## ---------------------------------------------------------------------
> ## 1. Load a toy gene model as a TxDb object
> ## ---------------------------------------------------------------------
>
> library(GenomicFeatures)
> suppressWarnings(
+ toy_genes_txdb <- makeTxDbFromGFF(toy_genes_gff())
+ )
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
>
> ## ---------------------------------------------------------------------
> ## 2. Compute all the splicing graphs (1 graph per gene) and return them
> ## in a SplicingGraphs object
> ## ---------------------------------------------------------------------
>
> ## Extract the exons grouped by transcript:
> ex_by_tx <- exonsBy(toy_genes_txdb, by="tx", use.names=TRUE)
>
> ## Extract the transcripts grouped by gene:
> tx_by_gn <- transcriptsBy(toy_genes_txdb, by="gene")
>
> sg <- SplicingGraphs(ex_by_tx, tx_by_gn)
> sg
SplicingGraphs object with 5 gene(s) and 13 transcript(s)
>
> ## Alternatively 'sg' can be constructed directly from the TxDb
> ## object:
> sg2 <- SplicingGraphs(toy_genes_txdb) # same as 'sg'
> sg2
SplicingGraphs object with 5 gene(s) and 13 transcript(s)
>
> ## Note that because SplicingGraphs objects have a slot that is an
> ## environment (for caching the bubbles), they cannot be compared with
> ## 'identical()' (will always return FALSE). 'all.equal()' should be
> ## used instead:
> stopifnot(isTRUE(all.equal(sg2, sg)))
>
> ## 'sg' has 1 element per gene and 'names(sg)' gives the gene ids:
> length(sg)
[1] 5
> names(sg)
[1] "geneA" "geneB" "geneC" "geneD" "geneE"
>
> ## ---------------------------------------------------------------------
> ## 3. Basic manipulation of a SplicingGraphs object
> ## ---------------------------------------------------------------------
>
> ## Basic accessors:
> seqnames(sg)
geneA geneB geneC geneD geneE
chrX chrX chrX chrX chrX
Levels: chrX
> strand(sg)
geneA geneB geneC geneD geneE
+ - + + +
Levels: + - *
> seqinfo(sg)
Seqinfo object with 1 sequence from an unspecified genome; no seqlengths:
seqnames seqlengths isCircular genome
chrX NA NA <NA>
>
> ## Number of transcripts per gene:
> elementNROWS(sg)
geneA geneB geneC geneD geneE
2 2 3 4 2
>
> ## The transcripts of a given gene can be extracted with [[. The result
> ## is an *unnamed* GRangesList object containing the exons grouped by
> ## transcript:
> sg[["geneD"]]
GRangesList object of length 4:
[[1]]
GRanges object with 2 ranges and 5 metadata columns:
seqnames ranges strand | exon_id exon_name exon_rank start_SSid
<Rle> <IRanges> <Rle> | <integer> <character> <integer> <integer>
[1] chrX [601, 630] + | 10 Dx2 1 1
[2] chrX [666, 675] + | 12 Dx4 2 5
end_SSid
<integer>
[1] 3
[2] 6
[[2]]
GRanges object with 2 ranges and 5 metadata columns:
seqnames ranges strand | exon_id exon_name exon_rank start_SSid
[1] chrX [601, 620] + | 9 Dx1 1 1
[2] chrX [651, 700] + | 11 Dx3 2 4
end_SSid
[1] 2
[2] 8
[[3]]
GRanges object with 3 ranges and 5 metadata columns:
seqnames ranges strand | exon_id exon_name exon_rank start_SSid
[1] chrX [601, 620] + | 9 Dx1 1 1
[2] chrX [666, 675] + | 12 Dx4 2 5
[3] chrX [691, 700] + | 13 Dx5 3 7
end_SSid
[1] 2
[2] 6
[3] 8
...
<1 more element>
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
>
> ## See '?plotTranscripts' for how to plot those transcripts.
>
> ## The transcripts of all the genes can be extracted with unlist(). The
> ## result is a *named* GRangesList object containing the exons grouped
> ## by transcript. The names on the object are the gene ids:
> ex_by_tx <- unlist(sg)
> ex_by_tx
GRangesList object of length 13:
$geneA
GRanges object with 1 range and 5 metadata columns:
seqnames ranges strand | exon_id exon_name exon_rank start_SSid
<Rle> <IRanges> <Rle> | <integer> <character> <integer> <integer>
[1] chrX [11, 50] + | 2 Ax2 1 1
end_SSid
<integer>
[1] 3
$geneA
GRanges object with 2 ranges and 5 metadata columns:
seqnames ranges strand | exon_id exon_name exon_rank start_SSid
[1] chrX [11, 40] + | 1 Ax1 1 1
[2] chrX [71, 100] + | 3 Ax3 2 4
end_SSid
[1] 2
[2] 5
$geneB
GRanges object with 2 ranges and 5 metadata columns:
seqnames ranges strand | exon_id exon_name exon_rank start_SSid
[1] chrX [251, 300] - | 23 Bx1 1 3
[2] chrX [201, 230] - | 20 Bx2 2 6
end_SSid
[1] 1
[2] 4
...
<10 more elements>
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
>
>
>
>
>
> dev.off()
null device
1
>