Last data update: 2014.03.03

R: Calculating lengths of features
lengthOf    R Documentation

Calculating lengths of features

Description

These methods calculate the lengths of features (transcripts, genes, CDS, 3' or 5' UTRs) defined in an EnsDb object or database.

Usage


## S4 method for signature 'EnsDb'
lengthOf(x, of="gene", filter=list())

Arguments

(In alphabetic order)

filter

list of BasicFilter instance(s) to select specific entries from the database (see examples below).

of

For lengthOf: whether the length of genes (of="gene", the default) or of transcripts (of="tx") should be retrieved from the database.

x

For lengthOf: either an EnsDb or a GRangesList object. For all other methods an EnsDb instance.

Value

For lengthOf: see the method description below.

Methods and Functions

lengthOf

Retrieve the length of genes or transcripts from the database. The length is the sum of the lengths of all exons of a transcript or a gene. In the latter case the exons are first reduced so that the length corresponds to the part of the genomic sequence covered by the exons.

Note: as an alternative to this method, the transcriptLengths function from the GenomicFeatures package can also be used.
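
The difference between the transcript-level and the gene-level length described above can be illustrated with a small base R sketch. This is not the package's implementation: the exon coordinates are made up, and the helper reduced_length is a hypothetical name introduced only for this illustration. It assumes 1-based, inclusive coordinates and merges overlapping or directly adjacent exons before summing their widths.

```r
# Hypothetical exon coordinates (1-based, inclusive) for one gene with two
# transcripts; the values are made up for illustration.
exons <- data.frame(
  tx_id = c("tx1", "tx1", "tx2", "tx2"),
  start = c(100, 300, 150, 300),
  end   = c(200, 400, 250, 350)
)

# Transcript length: sum of the widths of the exons of each transcript.
exons$width <- exons$end - exons$start + 1
tx_length <- tapply(exons$width, exons$tx_id, sum)

# Gene length: merge ("reduce") overlapping or adjacent exons first, then
# sum the widths of the merged intervals.
reduced_length <- function(start, end) {
  ord <- order(start, end)
  start <- start[ord]
  end <- end[ord]
  total <- 0
  cur_start <- start[1]
  cur_end <- end[1]
  for (i in seq_along(start)[-1]) {
    if (start[i] <= cur_end + 1) {
      # Overlapping or directly adjacent: extend the current interval.
      cur_end <- max(cur_end, end[i])
    } else {
      # Gap: close the current interval and start a new one.
      total <- total + (cur_end - cur_start + 1)
      cur_start <- start[i]
      cur_end <- end[i]
    }
  }
  total + (cur_end - cur_start + 1)
}

gene_length <- reduced_length(exons$start, exons$end)
```

In this toy example the two transcript lengths (tx_length) add up to more than gene_length, because positions shared between exons of different transcripts are counted only once at the gene level.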

Author(s)

Johannes Rainer

See Also

exonsBy, transcripts, transcriptLengths

Examples


library(EnsDb.Hsapiens.v75)
edb <- EnsDb.Hsapiens.v75

#####    lengthOf
##
## length of a specific gene.
lengthOf(edb,
         filter=list(GeneidFilter("ENSG00000000003")))

## length of a transcript
lengthOf(edb, of="tx",
         filter=list(TxidFilter("ENST00000494424")))

## average length of all protein coding genes encoded on chromosomes X
## and Y
mean(lengthOf(edb, of="gene",
              filter=list(GenebiotypeFilter("protein_coding"),
                  SeqnameFilter(c("X", "Y")))))

## average length of all snoRNA genes encoded on chromosomes X and Y
mean(lengthOf(edb, of="gene",
              filter=list(GenebiotypeFilter("snoRNA"),
                  SeqnameFilter(c("X", "Y")))))

##### transcriptLengths
##
## Calculate the length of transcripts encoded on chromosome Y, including
## length of the CDS, 5' and 3' UTR.
##len <- transcriptLengths(edb, with.cds_len=TRUE, with.utr5_len=TRUE,
##                         with.utr3_len=TRUE, filter=SeqnameFilter("Y"))
##head(len)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(ensembldb)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: GenomicRanges
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicFeatures
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/ensembldb/EnsDb-lengths.Rd_%03d_medium.png", width=480, height=480)
> ### Name: lengthOf
> ### Title: Calculating lengths of features
> ### Aliases: lengthOf lengthOf,GRangesList-method lengthOf,EnsDb-method
> ### Keywords: classes
> 
> ### ** Examples
> 
> 
> library(EnsDb.Hsapiens.v75)
> edb <- EnsDb.Hsapiens.v75
> 
> #####    lengthOf
> ##
> ## length of a specific gene.
> lengthOf(edb,
+          filter=list(GeneidFilter("ENSG00000000003")))
ENSG00000000003 
           2968 
> 
> ## length of a transcript
> lengthOf(edb, of="tx",
+          filter=list(TxidFilter("ENST00000494424")))
ENST00000494424 
            820 
> 
> ## average length of all protein coding genes encoded on chromosomes X
> ## and Y
> mean(lengthOf(edb, of="gene",
+               filter=list(GenebiotypeFilter("protein_coding"),
+                   SeqnameFilter(c("X", "Y")))))
[1] 3715.94
> 
> ## average length of all snoRNA genes encoded on chromosomes X and Y
> mean(lengthOf(edb, of="gene",
+               filter=list(GenebiotypeFilter("snoRNA"),
+                   SeqnameFilter(c("X", "Y")))))
[1] 119.3333
> 
> ##### transcriptLengths
> ##
> ## Calculate the length of transcripts encoded on chromosome Y, including
> ## length of the CDS, 5' and 3' UTR.
> ##len <- transcriptLengths(edb, with.cds_len=TRUE, with.utr5_len=TRUE,
> ##                         with.utr3_len=TRUE, filter=SeqnameFilter("Y"))
> ##head(len)
> 
> dev.off()
null device 
          1 
>