Last data update: 2014.03.03

R: Generates summaries on the specified windows
stat_aggregate    R Documentation

Generates summaries on the specified windows

Description

Generates summaries on the specified windows

Usage


## S4 method for signature 'GRanges'
stat_aggregate(data, ..., xlab, ylab, main, by, FUN,
                          maxgap=0L, minoverlap=1L,
                          type=c("any", "start", "end", "within", "equal"),
                          select=c("all", "first", "last", "arbitrary"),
                          y = NULL, window = NULL, facets = NULL, 
                          method = c("mean", "median","max",
                                   "min", "sum", "count", "identity"),
                          geom = NULL)


Arguments

data

A GRanges or data.frame object.

...

Arguments passed to the plot function, such as aes() and color.

xlab

Label for the x-axis.

ylab

Label for the y-axis.

main

Title for the plot.

by

An object with 'start', 'end', and 'width' methods. Passed to aggregate.

FUN

The function, found via 'match.fun', to be applied to each window of 'x'. Passed to aggregate.

maxgap, minoverlap

Passed to findOverlaps.

Intervals with a separation of maxgap or less and a minimum of minoverlap overlapping positions, allowing for maxgap, are considered to be overlapping. maxgap should be a non-negative integer scalar. minoverlap should be a positive integer scalar.

type

Passed to findOverlaps.

By default, any overlap is accepted. By specifying the type parameter, one can select for specific types of overlap. The types correspond to operations in Allen's Interval Algebra (see references). If type is start or end, the intervals are required to have matching starts or ends, respectively. While this operation seems trivial, the naive implementation using outer would be much less efficient. Specifying equal as the type returns the intersection of the start and end matches. If type is within, the query interval must be wholly contained within the subject interval. Note that all matches must additionally satisfy the minoverlap constraint described above.

The maxgap parameter has special meaning with the special overlap types. For start, end, and equal, it specifies the maximum difference in the starts, ends or both, respectively. For within, it is the maximum amount by which the query may be wider than the subject.
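The overlap types described above can be compared directly by calling findOverlaps on plain IRanges objects. The following is a minimal sketch, not part of the original examples; the objects query and subject are made up for illustration, and maxgap and minoverlap are left at their defaults.

```r
library(IRanges)

## Illustrative intervals (hypothetical data, not from the ggbio examples)
query   <- IRanges(start = c(1, 10, 20), end = c(5, 15, 30))
subject <- IRanges(start = c(1, 12, 18), end = c(8, 14, 40))

## type = "any": every pair of intervals sharing at least one position
hits_any <- findOverlaps(query, subject, type = "any")

## type = "start": only pairs whose start positions match
hits_start <- findOverlaps(query, subject, type = "start")

## type = "within": the query must lie wholly inside the subject
hits_within <- findOverlaps(query, subject, type = "within")

## Inspect which query/subject indices were paired
queryHits(hits_within)
subjectHits(hits_within)
```

Here the second query interval overlaps the second subject interval but is wider than it, so it is reported for type = "any" and dropped for type = "within".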

select

Passed to findOverlaps.

When select is "all" (the default), the results are returned as a Hits object. When select is "first", "last", or "arbitrary", the results are returned as an integer vector of length query containing the first, last, or arbitrary overlapping interval in subject, with NA indicating intervals that did not overlap any intervals in subject.

If select is "all", a Hits object is returned. For all other values of select, the return value depends on the drop argument. When select != "all" && !drop, an IntegerList is returned, where each element of the result corresponds to a space in query. When select != "all" && drop, an integer vector is returned containing indices that are offset to align with the unlisted query.
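The effect of select on the return type can be seen with a small findOverlaps call on IRanges objects. This is a sketch under assumed toy data (query and subject are hypothetical); it covers only the plain IRanges case and does not exercise the drop argument mentioned above.

```r
library(IRanges)

## Illustrative intervals (hypothetical data)
query   <- IRanges(start = c(1, 10, 50), end = c(5, 15, 55))
subject <- IRanges(start = c(1, 3, 12), end = c(4, 8, 14))

## select = "all": a Hits object listing every overlapping pair
all_hits <- findOverlaps(query, subject, select = "all")

## select = "first" / "last": one subject index per query interval,
## with NA where the query overlaps nothing in subject
first_hit <- findOverlaps(query, subject, select = "first")
last_hit  <- findOverlaps(query, subject, select = "last")
```

The first query interval overlaps two subject intervals, so first_hit and last_hit differ for it; the third query interval overlaps none and yields NA in both vectors.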

y

A character string naming the variable column to aggregate on, equivalent to aes(y = ).

window

Integer value indicating the window size.

facets

Faceting formula to use.

method

Aggregation method to use if FUN is not provided.

geom

The geometric object used to display the data.

Value

A 'Layer'.

Author(s)

Tengfei Yin

Examples

library(ggbio)  # provides stat_aggregate and the ggplot() method for GRanges
library(GenomicRanges)
set.seed(1)
N <- 1000
## ======================================================================
##  simulated GRanges
## ======================================================================
gr <- GRanges(seqnames = 
              sample(c("chr1", "chr2", "chr3"),
                     size = N, replace = TRUE),
              IRanges(
                      start = sample(1:300, size = N, replace = TRUE),
                      width = sample(70:75, size = N, replace = TRUE)),
              strand = sample(c("+", "-", "*"), size = N, 
                replace = TRUE),
              value = rnorm(N, 10, 3), score = rnorm(N, 100, 30),
              sample = sample(c("Normal", "Tumor"), 
                size = N, replace = TRUE),
              pair = sample(letters, size = N, 
                replace = TRUE))


ggplot(gr) + stat_aggregate(aes(y = value))
## or
## ggplot(gr) + stat_aggregate(y = "value")
ggplot(gr) + stat_aggregate(aes(y = value), window = 36)
ggplot(gr) + stat_aggregate(aes(y = value), select = "first")
## Not run: 
## no hits 
ggplot(gr) + stat_aggregate(aes(y = value), select = "first", type = "within")

## End(Not run)
ggplot(gr) + stat_aggregate(window = 30, aes(y = value), fill = "gray40", geom = "bar")
ggplot(gr) + stat_aggregate(window = 100, fill = "gray40", aes(y = value),
                           method = "max", geom = "bar")

ggplot(gr) + stat_aggregate(aes(y = value), geom = "boxplot")
ggplot(gr) + stat_aggregate(aes(y = value), geom = "boxplot", window = 60)
## now facets need to take place inside stat_* geom_* for an accurate computation
ggplot(gr) + stat_aggregate(aes(y = value), geom = "boxplot", window = 30,
              facets = sample ~ seqnames)
## FIXME:
## autoplot(gr, stat = "aggregate", aes(y = value), window = 36)
## autoplot(gr, stat = "aggregate", geom = "boxplot", aes(y = value), window = 36)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(ggbio)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: ggplot2
Need specific help about ggbio? try mailing 
 the maintainer or visit http://tengfei.github.com/ggbio/

Attaching package: 'ggbio'

The following objects are masked from 'package:ggplot2':

    geom_bar, geom_rect, geom_segment, ggsave, stat_bin, stat_identity,
    xlim

Warning message:
replacing previous import 'ggplot2::Position' by 'BiocGenerics::Position' when loading 'ggbio' 
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/ggbio/stat_aggregate-method.Rd_%03d_medium.png", width=480, height=480)
> ### Name: stat_aggregate
> ### Title: Generates summaries on the specified windows
> ### Aliases: stat_aggregate stat_aggregate,GRanges-method
> ###   stat_aggregate,missing-method stat_aggregate,uneval-method
> 
> ### ** Examples
> 
> library(GenomicRanges)
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
> set.seed(1)
> N <- 1000
> ## ======================================================================
> ##  simulated GRanges
> ## ======================================================================
> gr <- GRanges(seqnames = 
+               sample(c("chr1", "chr2", "chr3"),
+                      size = N, replace = TRUE),
+               IRanges(
+                       start = sample(1:300, size = N, replace = TRUE),
+                       width = sample(70:75, size = N,replace = TRUE)),
+               strand = sample(c("+", "-", "*"), size = N, 
+                 replace = TRUE),
+               value = rnorm(N, 10, 3), score = rnorm(N, 100, 30),
+               sample = sample(c("Normal", "Tumor"), 
+                 size = N, replace = TRUE),
+               pair = sample(letters, size = N, 
+                 replace = TRUE))
> 
> 
> ggplot(gr) + stat_aggregate(aes(y = value))
> ## or
> ## ggplot(gr) + stat_aggregate(y = "value")
> ggplot(gr) + stat_aggregate(aes(y = value), window = 36)
> ggplot(gr) + stat_aggregate(aes(y = value), select = "first")
> ## Not run: 
> ##D ## no hits 
> ##D ggplot(gr) + stat_aggregate(aes(y = value), select = "first", type = "within")
> ## End(Not run)
> ggplot(gr) + stat_aggregate(window = 30,  aes(y = value),fill = "gray40", geom = "bar")
> ggplot(gr) + stat_aggregate(window = 100, fill = "gray40", aes(y = value),
+                            method = "max", geom = "bar")
> 
> ggplot(gr) + stat_aggregate(aes(y = value), geom = "boxplot")
> ggplot(gr) + stat_aggregate(aes(y = value), geom = "boxplot", window = 60)
> ## now facets need to take place inside stat_* geom_* for an accurate computation
> ggplot(gr) + stat_aggregate(aes(y = value), geom = "boxplot", window = 30,
+               facets = sample ~ seqnames)
> ## FIXME:
> ## autoplot(gr, stat = "aggregate", aes(y = value), window = 36)
> ## autoplot(gr, stat = "aggregate", geom = "boxplot", aes(y = value), window = 36)
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>