Last data update: 2014.03.03

rsgedgesByGene-methods    R Documentation

Extract the reduced edges and their ranges from a SplicingGraphs object

Description

rsgedgesByGene and rsgedgesByTranscript are analogous to sgedgesByGene and sgedgesByTranscript, but operate on the reduced splicing graphs. That is, the graphs in SplicingGraphs object x are reduced before the edges and their ranges are extracted. The reduced graphs are obtained by removing the uninformative nodes from them. See the Details section below.

rsgedges extracts the edges of the reduced splicing graph of a given gene from a SplicingGraphs object.

rsgraph extracts the reduced splicing graph for a given gene from a SplicingGraphs object, and returns it as a plottable graph-like object.
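
This page does not define what makes a node uninformative. The sketch below assumes that a node is uninformative when it has exactly one incoming and one outgoing edge, which is consistent with the uninformativeSSids output for geneB shown in Results but is not confirmed by the package documentation. uninformative_nodes is a hypothetical base R helper, not a SplicingGraphs function, and the edge list is reconstructed by hand from the rsgedge_id paths printed in Results.

```r
## Hypothetical helper, not part of SplicingGraphs.
## ASSUMPTION: "uninformative" means in-degree 1 and out-degree 1.
uninformative_nodes <- function(from, to) {
    nodes  <- unique(c(from, to))
    ## Count how many edges enter and leave each node.
    indeg  <- tabulate(match(to, nodes), nbins=length(nodes))
    outdeg <- tabulate(match(from, nodes), nbins=length(nodes))
    nodes[indeg == 1L & outdeg == 1L]
}

## geneB edges before reduction, reconstructed from the rsgedge_id paths
## in Results ("R" and "L" are presumably the root and leaf nodes).
from <- c("R", "1", "3", "4", "6", "R", "2", "4", "5")
to   <- c("1", "3", "4", "6", "L", "2", "3", "5", "L")
uninformative_nodes(from, to)
```

Under this assumption the helper selects nodes 1, 2, 5 and 6, the same set that uninformativeSSids(sg["geneB"]) reports in Results.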

Usage

rsgedgesByGene(x, with.hits.mcols=FALSE, keep.dup.edges=FALSE)

rsgedgesByTranscript(x, with.hits.mcols=FALSE)

rsgedges(x)

rsgraph(x, tx_id.as.edge.label=FALSE, as.igraph=FALSE)

## Related utility:
uninformativeSSids(x)

Arguments

x

A SplicingGraphs object. Must be of length 1 for rsgedges, rsgraph, and uninformativeSSids.

with.hits.mcols

Whether or not to include the hits metadata columns in the returned object. See ?countReads for more information.

keep.dup.edges

Not supported yet.

tx_id.as.edge.label

Whether or not to use the transcript ids as edge labels.

as.igraph

TODO

Details

TODO: Explain graph reduction.
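
Until the reduction is documented here, the following base R sketch may help build intuition. It assumes that reducing a graph means repeatedly removing a node that has exactly one incoming and one outgoing edge and merging those two edges into one. That assumption reproduces the geneA ids shown in Results, but it is not taken from the package source. reduce_edges is a hypothetical helper, and it expects an acyclic edge list.

```r
## Hypothetical helper, not part of SplicingGraphs.
## ASSUMPTION: a node with in-degree 1 and out-degree 1 is removed by
## merging the edge that enters it with the edge that leaves it.
reduce_edges <- function(from, to) {
    ## Each edge starts out as a path of two nodes.
    paths <- unname(Map(c, from, to))
    repeat {
        heads  <- vapply(paths, function(p) p[1L], character(1))
        tails  <- vapply(paths, function(p) p[length(p)], character(1))
        nodes  <- unique(c(heads, tails))
        indeg  <- tabulate(match(tails, nodes), nbins=length(nodes))
        outdeg <- tabulate(match(heads, nodes), nbins=length(nodes))
        removable <- nodes[indeg == 1L & outdeg == 1L]
        if (length(removable) == 0L)
            break
        node <- removable[1L]
        i <- which(tails == node)  # the only path ending at 'node'
        j <- which(heads == node)  # the only path starting at 'node'
        paths[[i]] <- c(paths[[i]], paths[[j]][-1L])
        paths[[j]] <- NULL
    }
    ## Report each remaining path as a comma-separated list of nodes.
    vapply(paths, paste, character(1), collapse=",")
}

## geneA edges before reduction: "1,3" plus the three edges named in the
## Examples comment ("1,2", "2,4", "4,5").
reduce_edges(from=c("1", "1", "2", "4"), to=c("3", "2", "4", "5"))
```

For the geneA input the sketch returns the paths "1,3" and "1,2,4,5", which correspond to the reduced edge ids geneA:1,3 and geneA:1,2,4,5 in Results.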

Value

For rsgedgesByGene: A GRangesList object, named with the gene ids, in which the reduced splicing graph edges are grouped by gene.

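For rsgedges: A DataFrame object with 1 row per reduced edge and the columns from, to, rsgedge_id, ex_or_in, and tx_id (see the Results section).

For uninformativeSSids: A character vector of splicing site ids.

TODO: Explain values returned by the other functions.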
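
Judging from the Results, a reduced edge id such as "geneA:1,2,4,5" appears to consist of the gene id, a colon, and the comma-separated path of nodes that the reduced edge runs through. Assuming that format holds in general, the original edges behind a reduced edge can be recovered with base R alone. split_rsgedge_id is a hypothetical helper, not a SplicingGraphs function.

```r
## Hypothetical helper, not part of SplicingGraphs.
## ASSUMPTION: ids look like "<gene id>:<node>,<node>,...".
split_rsgedge_id <- function(id) {
    gene  <- sub(":.*$", "", id)
    nodes <- strsplit(sub("^[^:]*:", "", id), ",", fixed=TRUE)[[1L]]
    ## Consecutive nodes along the path are the original edges.
    data.frame(gene=gene,
               from=head(nodes, -1L),
               to=tail(nodes, -1L),
               stringsAsFactors=FALSE)
}

split_rsgedge_id("geneA:1,2,4,5")
```

The call returns one row per original edge (1 to 2, 2 to 4, 4 to 5), the three edges that the Examples describe as being combined into the mixed edge.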

Author(s)

H. Pagès

See Also

This man page is part of the SplicingGraphs package. Please see ?`SplicingGraphs-package` for an overview of the package and for an index of its man pages.

Examples

## ---------------------------------------------------------------------
## 1. Make SplicingGraphs object 'sg' from toy gene model (see
##    '?SplicingGraphs')
## ---------------------------------------------------------------------
example(SplicingGraphs)
sg

## 'sg' has 1 element per gene and 'names(sg)' gives the gene ids.
names(sg)

## ---------------------------------------------------------------------
## 2. rsgedgesByGene()
## ---------------------------------------------------------------------
edges_by_gene <- rsgedgesByGene(sg)
edges_by_gene
## 'edges_by_gene' has the length and names of 'sg', that is, the names
## on it are the gene ids and are guaranteed to be unique.

## Extract the reduced edges and their ranges for a given gene:
edges_by_gene[["geneA"]]
## Note that edge with global reduced edge id "geneA:1,2,4,5" is a mixed
## edge obtained by combining together edges "geneA:1,2" (exon),
## "geneA:2,4" (intron), and "geneA:4,5" (exon), during the graph
## reduction.

stopifnot(identical(edges_by_gene["geneB"], rsgedgesByGene(sg["geneB"])))

## ---------------------------------------------------------------------
## 3. rsgedgesByTranscript()
## ---------------------------------------------------------------------
#edges_by_tx <- rsgedgesByTranscript(sg)  # not ready yet!
#edges_by_tx

## ---------------------------------------------------------------------
## 4. rsgedges(), rsgraph(), uninformativeSSids()
## ---------------------------------------------------------------------
plot(sgraph(sg["geneB"]))
uninformativeSSids(sg["geneB"])

plot(rsgraph(sg["geneB"]))
rsgedges(sg["geneB"])

## ---------------------------------------------------------------------
## 5. Sanity checks
## ---------------------------------------------------------------------
## TODO: Do the same kind of sanity checks that are done for sgedges()
## vs sgedgesByGene() vs sgedgesByTranscript() (in man page for sgedges).

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(SplicingGraphs)
Loading required package: GenomicFeatures
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: GenomicAlignments
Loading required package: SummarizedExperiment
Loading required package: Biostrings
Loading required package: XVector
Loading required package: Rsamtools
Loading required package: Rgraphviz
Loading required package: graph

Attaching package: 'graph'

The following object is masked from 'package:Biostrings':

    complement

Loading required package: grid

Attaching package: 'Rgraphviz'

The following objects are masked from 'package:IRanges':

    from, to

The following objects are masked from 'package:S4Vectors':

    from, to

Warning messages:
1: replacing previous import 'IRanges::from' by 'Rgraphviz::from' when loading 'SplicingGraphs' 
2: replacing previous import 'IRanges::to' by 'Rgraphviz::to' when loading 'SplicingGraphs' 
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/SplicingGraphs/rsgedgesByGene-methods.Rd_%03d_medium.png", width=480, height=480)
> ### Name: rsgedgesByGene-methods
> ### Title: Extract the reduced edges and their ranges from a SplicingGraphs
> ###   object
> ### Aliases: rsgedgesByGene-methods uninformativeSSids
> ###   uninformativeSSids,ANY-method uninformativeSSids,DataFrame-method
> ###   rsgedgesByTranscript rsgedgesByTranscript,SplicingGraphs-method
> ###   rsgedgesByGene rsgedgesByGene,SplicingGraphs-method rsgedges sgedges2
> ###   rsgraph sgraph2
> 
> ### ** Examples
> 
> ## ---------------------------------------------------------------------
> ## 1. Make SplicingGraphs object 'sg' from toy gene model (see
> ##    '?SplicingGraphs')
> ## ---------------------------------------------------------------------
> example(SplicingGraphs)

SplcnG> ## ---------------------------------------------------------------------
SplcnG> ## 1. Load a toy gene model as a TxDb object
SplcnG> ## ---------------------------------------------------------------------
SplcnG> 
SplcnG> library(GenomicFeatures)

SplcnG> suppressWarnings(
SplcnG+   toy_genes_txdb <- makeTxDbFromGFF(toy_genes_gff())
SplcnG+ )
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK

SplcnG> ## ---------------------------------------------------------------------
SplcnG> ## 2. Compute all the splicing graphs (1 graph per gene) and return them
SplcnG> ##    in a SplicingGraphs object
SplcnG> ## ---------------------------------------------------------------------
SplcnG> 
SplcnG> ## Extract the exons grouped by transcript:
SplcnG> ex_by_tx <- exonsBy(toy_genes_txdb, by="tx", use.names=TRUE)

SplcnG> ## Extract the transcripts grouped by gene:
SplcnG> tx_by_gn <- transcriptsBy(toy_genes_txdb, by="gene")

SplcnG> sg <- SplicingGraphs(ex_by_tx, tx_by_gn)

SplcnG> sg
SplicingGraphs object with 5 gene(s) and 13 transcript(s)

SplcnG> ## Alternatively 'sg' can be constructed directly from the TxDb
SplcnG> ## object:
SplcnG> sg2 <- SplicingGraphs(toy_genes_txdb)  # same as 'sg'

SplcnG> sg2
SplicingGraphs object with 5 gene(s) and 13 transcript(s)

SplcnG> ## Note that because SplicingGraphs objects have a slot that is an
SplcnG> ## environment (for caching the bubbles), they cannot be compared with
SplcnG> ## 'identical()' (will always return FALSE). 'all.equal()' should be
SplcnG> ## used instead:
SplcnG> stopifnot(isTRUE(all.equal(sg2, sg)))

SplcnG> ## 'sg' has 1 element per gene and 'names(sg)' gives the gene ids:
SplcnG> length(sg)
[1] 5

SplcnG> names(sg)
[1] "geneA" "geneB" "geneC" "geneD" "geneE"

SplcnG> ## ---------------------------------------------------------------------
SplcnG> ## 3. Basic manipulation of a SplicingGraphs object
SplcnG> ## ---------------------------------------------------------------------
SplcnG> 
SplcnG> ## Basic accessors:
SplcnG> seqnames(sg)
geneA geneB geneC geneD geneE 
 chrX  chrX  chrX  chrX  chrX 
Levels: chrX

SplcnG> strand(sg)
geneA geneB geneC geneD geneE 
    +     -     +     +     + 
Levels: + - *

SplcnG> seqinfo(sg)
Seqinfo object with 1 sequence from an unspecified genome; no seqlengths:
  seqnames seqlengths isCircular genome
  chrX             NA         NA   <NA>

SplcnG> ## Number of transcripts per gene:
SplcnG> elementNROWS(sg)
geneA geneB geneC geneD geneE 
    2     2     3     4     2 

SplcnG> ## The transcripts of a given gene can be extracted with [[. The result
SplcnG> ## is an *unnamed* GRangesList object containing the exons grouped by
SplcnG> ## transcript:
SplcnG> sg[["geneD"]]
GRangesList object of length 4:
[[1]] 
GRanges object with 2 ranges and 5 metadata columns:
      seqnames     ranges strand |   exon_id   exon_name exon_rank start_SSid
         <Rle>  <IRanges>  <Rle> | <integer> <character> <integer>  <integer>
  [1]     chrX [601, 630]      + |        10         Dx2         1          1
  [2]     chrX [666, 675]      + |        12         Dx4         2          5
       end_SSid
      <integer>
  [1]         3
  [2]         6

[[2]] 
GRanges object with 2 ranges and 5 metadata columns:
      seqnames     ranges strand | exon_id exon_name exon_rank start_SSid
  [1]     chrX [601, 620]      + |       9       Dx1         1          1
  [2]     chrX [651, 700]      + |      11       Dx3         2          4
      end_SSid
  [1]        2
  [2]        8

[[3]] 
GRanges object with 3 ranges and 5 metadata columns:
      seqnames     ranges strand | exon_id exon_name exon_rank start_SSid
  [1]     chrX [601, 620]      + |       9       Dx1         1          1
  [2]     chrX [666, 675]      + |      12       Dx4         2          5
  [3]     chrX [691, 700]      + |      13       Dx5         3          7
      end_SSid
  [1]        2
  [2]        6
  [3]        8

...
<1 more element>
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths

SplcnG> ## See '?plotTranscripts' for how to plot those transcripts.
SplcnG> 
SplcnG> ## The transcripts of all the genes can be extracted with unlist(). The
SplcnG> ## result is a *named* GRangesList object containing the exons grouped
SplcnG> ## by transcript. The names on the object are the gene ids:
SplcnG> ex_by_tx <- unlist(sg)

SplcnG> ex_by_tx
GRangesList object of length 13:
$geneA 
GRanges object with 1 range and 5 metadata columns:
      seqnames    ranges strand |   exon_id   exon_name exon_rank start_SSid
         <Rle> <IRanges>  <Rle> | <integer> <character> <integer>  <integer>
  [1]     chrX  [11, 50]      + |         2         Ax2         1          1
       end_SSid
      <integer>
  [1]         3

$geneA 
GRanges object with 2 ranges and 5 metadata columns:
      seqnames    ranges strand | exon_id exon_name exon_rank start_SSid
  [1]     chrX [11,  40]      + |       1       Ax1         1          1
  [2]     chrX [71, 100]      + |       3       Ax3         2          4
      end_SSid
  [1]        2
  [2]        5

$geneB 
GRanges object with 2 ranges and 5 metadata columns:
      seqnames     ranges strand | exon_id exon_name exon_rank start_SSid
  [1]     chrX [251, 300]      - |      23       Bx1         1          3
  [2]     chrX [201, 230]      - |      20       Bx2         2          6
      end_SSid
  [1]        1
  [2]        4

...
<10 more elements>
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> sg
SplicingGraphs object with 5 gene(s) and 13 transcript(s)
> 
> ## 'sg' has 1 element per gene and 'names(sg)' gives the gene ids.
> names(sg)
[1] "geneA" "geneB" "geneC" "geneD" "geneE"
> 
> ## ---------------------------------------------------------------------
> ## 2. rsgedgesByGene()
> ## ---------------------------------------------------------------------
> edges_by_gene <- rsgedgesByGene(sg)
> edges_by_gene
GRangesList object of length 5:
$geneA 
GRanges object with 2 ranges and 5 metadata columns:
      seqnames    ranges strand |        from          to    rsgedge_id
         <Rle> <IRanges>  <Rle> | <character> <character>   <character>
  [1]     chrX [11,  50]      + |           1           3     geneA:1,3
  [2]     chrX [11, 100]      + |           1           5 geneA:1,2,4,5
      ex_or_in           tx_id
      <factor> <CharacterList>
  [1]       ex              A1
  [2]    mixed              A2

$geneB 
GRanges object with 5 ranges and 5 metadata columns:
      seqnames     ranges strand | from to rsgedge_id ex_or_in tx_id
  [1]     chrX [251, 300]      - |    1  3  geneB:1,3       ex    B1
  [2]     chrX [231, 250]      - |    3  4  geneB:3,4       in B1,B2
  [3]     chrX [201, 230]      - |    4  6  geneB:4,6       ex    B1
  [4]     chrX [251, 270]      - |    2  3  geneB:2,3       ex    B2
  [5]     chrX [216, 230]      - |    4  5  geneB:4,5       ex    B2

$geneC 
GRanges object with 5 ranges and 5 metadata columns:
      seqnames     ranges strand | from to    rsgedge_id ex_or_in tx_id
  [1]     chrX [401, 415]      + |    1  2     geneC:1,2       ex C1,C2
  [2]     chrX [416, 480]      + |    2  8   geneC:2,7,8    mixed    C1
  [3]     chrX [416, 480]      + |    2  9 geneC:2,5,6,9    mixed    C2
  [4]     chrX [481, 500]      + |    9 10    geneC:9,10       ex C2,C3
  [5]     chrX [421, 480]      + |    3  9   geneC:3,4,9    mixed    C3

...
<2 more elements>
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> ## 'edges_by_gene' has the length and names of 'sg', that is, the names
> ## on it are the gene ids and are guaranteed to be unique.
> 
> ## Extract the reduced edges and their ranges for a given gene:
> edges_by_gene[["geneA"]]
GRanges object with 2 ranges and 5 metadata columns:
      seqnames    ranges strand |        from          to    rsgedge_id
         <Rle> <IRanges>  <Rle> | <character> <character>   <character>
  [1]     chrX [11,  50]      + |           1           3     geneA:1,3
  [2]     chrX [11, 100]      + |           1           5 geneA:1,2,4,5
      ex_or_in           tx_id
      <factor> <CharacterList>
  [1]       ex              A1
  [2]    mixed              A2
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
> ## Note that edge with global reduced edge id "geneA:1,2,4,5" is a mixed
> ## edge obtained by combining together edges "geneA:1,2" (exon),
> ## "geneA:2,4" (intron), and "geneA:4,5" (exon), during the graph
> ## reduction.
> 
> stopifnot(identical(edges_by_gene["geneB"], rsgedgesByGene(sg["geneB"])))
> 
> ## ---------------------------------------------------------------------
> ## 3. rsgedgesByTranscript()
> ## ---------------------------------------------------------------------
> #edges_by_tx <- rsgedgesByTranscript(sg)  # not ready yet!
> #edges_by_tx
> 
> ## ---------------------------------------------------------------------
> ## 4. rsgedges(), rsgraph(), uninformativeSSids()
> ## ---------------------------------------------------------------------
> plot(sgraph(sg["geneB"]))
> uninformativeSSids(sg["geneB"])
[1] "1" "6" "2" "5"
> 
> plot(rsgraph(sg["geneB"]))
> rsgedges(sg["geneB"])
DataFrame with 5 rows and 5 columns
         from          to  rsgedge_id ex_or_in           tx_id
  <character> <character> <character> <factor> <CharacterList>
1           R           3 geneB:R,1,3    mixed              B1
2           3           4   geneB:3,4       in           B1,B2
3           4           L geneB:4,6,L    mixed              B1
4           R           3 geneB:R,2,3    mixed              B2
5           4           L geneB:4,5,L    mixed              B2
> 
> ## ---------------------------------------------------------------------
> ## 5. Sanity checks
> ## ---------------------------------------------------------------------
> ## TODO: Do the same kind of sanity checks that are done for sgedges()
> ## vs sgedgesByGene() vs sgedgesByTranscript() (in man page for sgedges).
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>