Last data update: 2014.03.03

R: Calculate summary statistics on views of an RleDataFrame
RleDataFrame-viewsR Documentation

Calculate summary statistics on views of an RleDataFrame

Description

These methods mirror the viewMeans type functions from IRanges for SimpleRleList. They differ in that they work on an RleDataFrame and an IRanges directly and also have a simplify argument. This works out to be faster (compute-wise) and also convenient.

Still, an RleDataFrame inherits from SimpleRleList, so all of the views functions will work.

Usage

rangeSums(x, bounds, na.rm=FALSE, simplify=TRUE)
rangeMeans(x, bounds, na.rm=FALSE, simplify=TRUE, ...)
rangeMins(x, bounds, na.rm=FALSE, simplify=TRUE)
rangeMaxs(x, bounds, na.rm=FALSE, simplify=TRUE)
rangeWhichMins(x, bounds, na.rm=FALSE, simplify=TRUE)
rangeWhichMaxs(x, bounds, na.rm=FALSE, simplify=TRUE)

Arguments

x

RleDataFrame

bounds

Matrix with two columns or IRanges representing ranges of rows of x to process. If bounds is a matrix, an IRanges is constructed assuming the first two columns represent the start and end of the ranges. The names for the IRanges is taken from the rownames of the matrix. Such a matrix can constructed with boundingIndicesByChr and is the preferred input.

na.rm

Scalar logical. Ignore NAs in calculations?

simplify

Scalar logical. Simplify result? If TRUE, the return value will be a vector or matrix. For a single view, a vector will be returned. Otherwise a matrix with one row per view and one column per column of x will be returned. If FALSE, the return value will be a list of length ncol(x) of vectors of length nrow(bounds).

...

Additional arguments for other methods.

Details

The "range" name prefixes here serve to differentiate these functions from the "view" functions. This may change. I will be asking the IRanges team to add "..." and "simplify" to the "view" methods so that I can just make additional methods for RleDataFrame.

Value

With simplify == TRUE, a vector for single view or a matrix otherwise. When simplify == FALSE, a list of vectors length ncol(x) where each element is of length nrows(bounds).

See Also

RleDataFrame boundingIndicesByChr

Examples

df = RleDataFrame(list(a=Rle(1:5, rep(2, 5))), b=Rle(1:5, rep(2, 5)),
row.names=LETTERS[1:10])
mat = matrix(c(1,4,3,5),ncol=2,dimnames=list(c("Gene1","Gene2"),c("start","end")))
bounds = IRanges(start=c(1, 4), end=c(3, 5), names=c("Gene1","Gene2"))

rangeMeans(df,bounds,simplify=FALSE)
rangeMeans(df,bounds,simplify=TRUE)
rangeMeans(df,mat,simplify=TRUE)

rangeMeans(df,bounds)
rangeSums(df,bounds)
rangeMins(df,bounds)
rangeMaxs(df,bounds)
rangeWhichMins(df,bounds)
rangeWhichMaxs(df,bounds)

# RleDataFrame isa SimpleRleList, so all the IRanges view* methods work too:
v = RleViewsList( lapply( df, Views, start=bounds ) )
viewMeans(v)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(genoset)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: GenomicRanges
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.


*** Genoset API Changes ***
The genoset class has transitioned to the
RangedSummarizedExperiment API from the eSet API (e.g. use colnames instead of sampleNames). ***

Attaching package: 'genoset'

The following object is masked from 'package:GenomicRanges':

    pos

The following objects are masked from 'package:S4Vectors':

    colMeans, colSums, rowMeans, rowSums

The following objects are masked from 'package:base':

    colMeans, colSums, rowMeans, rowSums

Warning messages:
1: multiple methods tables found for 'colMeans' 
2: multiple methods tables found for 'colSums' 
3: multiple methods tables found for 'rowMeans' 
4: multiple methods tables found for 'rowSums' 
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/genoset/RleDataFrame-views.Rd_%03d_medium.png", width=480, height=480)
> ### Name: RleDataFrame-views
> ### Title: Calculate summary statistics on views of an RleDataFrame
> ### Aliases: RleDataFrame-views rangeSums rangeMeans rangeMins rangeMaxs
> ###   rangeWhichMins rangeWhichMaxs rangeSums,RleDataFrame-method
> ###   rangeMeans,RleDataFrame-method rangeMeans,numeric-method
> ###   rangeMeans,matrix-method rangeMins,RleDataFrame-method
> ###   rangeMaxs,RleDataFrame-method rangeWhichMins,RleDataFrame-method
> ###   rangeWhichMaxs,RleDataFrame-method rangeColMeans
> ### Keywords: methods
> 
> ### ** Examples
> 
> df = RleDataFrame(list(a=Rle(1:5, rep(2, 5))), b=Rle(1:5, rep(2, 5)),
+ row.names=LETTERS[1:10])
> mat = matrix(c(1,4,3,5),ncol=2,dimnames=list(c("Gene1","Gene2"),c("start","end")))
> bounds = IRanges(start=c(1, 4), end=c(3, 5), names=c("Gene1","Gene2"))
> 
> rangeMeans(df,bounds,simplify=FALSE)
$a
Gene1 Gene2 
   NA   2.5 

$b
Gene1 Gene2 
   NA   2.5 

> rangeMeans(df,bounds,simplify=TRUE)
        a   b
Gene1  NA  NA
Gene2 2.5 2.5
> rangeMeans(df,mat,simplify=TRUE)
        a   b
Gene1  NA  NA
Gene2 2.5 2.5
> 
> rangeMeans(df,bounds)
        a   b
Gene1  NA  NA
Gene2 2.5 2.5
> rangeSums(df,bounds)
      a b
Gene1 4 4
Gene2 5 5
> rangeMins(df,bounds)
      a b
Gene1 1 1
Gene2 2 2
> rangeMaxs(df,bounds)
      a b
Gene1 2 2
Gene2 3 3
> rangeWhichMins(df,bounds)
      a b
Gene1 1 1
Gene2 4 4
> rangeWhichMaxs(df,bounds)
      a b
Gene1 3 3
Gene2 5 5
> 
> # RleDataFrame isa SimpleRleList, so all the IRanges view* methods work too:
> v = RleViewsList( lapply( df, Views, start=bounds ) )
> viewMeans(v)
NumericList of length 2
[["a"]] Gene1=1.33333333333333 Gene2=2.5
[["b"]] Gene1=1.33333333333333 Gene2=2.5
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>