Last data update: 2014.03.03

consolidateClusters    R Documentation

Consolidate DB clusters

Description

Consolidate differential binding (DB) results from multiple analyses with cluster-level FDR control.

Usage

consolidateClusters(data.list, result.list, equiweight=TRUE, ...)

Arguments

data.list

a list of RangedSummarizedExperiment and/or GRanges objects

result.list

a list of data frames containing the DB test results for each entry of data.list

equiweight

a logical scalar indicating whether equal weighting from each analysis should be enforced

...

arguments to be passed to clusterWindows

Details

This function consolidates DB results from multiple analyses, typically involving different window sizes. The aim is to provide comprehensive detection of DB at a range of spatial resolutions. Significant windows from each analysis are identified and used for clustering with clusterWindows. This represents the post-hoc counterpart to consolidateSizes.
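The clustering step can be pictured with a small base R sketch. It is not the csaw implementation: `clusterIntervals` is a hypothetical helper, the coordinates are made up, and the exact definition of the gap that csaw compares against `tol` may differ. It only shows how significant windows pooled from analyses with different window sizes end up sharing cluster IDs.

```r
# Illustrative helper (not part of csaw): single-linkage clustering of
# intervals on one chromosome, merging those within 'tol' bp of each other.
clusterIntervals <- function(start, end, tol) {
    o <- order(start, end)
    # Furthest end seen among all earlier intervals in sorted order.
    reach <- cummax(end[o])
    # A new cluster begins when an interval starts more than 'tol' past that.
    is.new <- c(TRUE, start[o][-1] - reach[-length(o)] > tol)
    id <- integer(length(o))
    id[o] <- cumsum(is.new)
    id
}

# Hypothetical significant windows pooled from two analyses.
small <- data.frame(start=c(1, 30, 200), end=c(5, 34, 204))
large <- data.frame(start=c(20, 500), end=c(39, 519))
pooled <- rbind(small, large)
pooled$cluster <- clusterIntervals(pooled$start, pooled$end, tol=20)
pooled
```

Here the first large window bridges the two nearby small windows, so all three fall into one cluster, while the isolated windows form clusters of their own.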

Some effort is required to equalize the contribution of the results from each analysis. This is done by setting equiweight=TRUE, where the weight of each window is inversely proportional to the number of windows from that analysis. These weights are used as frequency weights for window-level FDR control (to identify DB windows prior to clustering). Otherwise, the final results would be dominated by the large number of small windows.
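As a rough illustration of the weighting, the base R sketch below builds weights that are inversely proportional to the number of windows in each analysis and feeds them into a frequency-weighted Benjamini-Hochberg adjustment. The window counts, the p-values and the `weightedBH` helper are assumptions made for this example; the adjustment used inside csaw may differ in detail.

```r
# Hypothetical window counts for three analyses (small, medium, large windows).
n.windows <- c(small=100, medium=40, large=10)

# Each window is weighted by the reciprocal of its analysis size,
# so every analysis contributes a total weight of 1.
weights <- rep(1 / n.windows, n.windows)
analysis <- rep(names(n.windows), n.windows)
total.by.analysis <- tapply(weights, analysis, sum)

# A frequency-weighted Benjamini-Hochberg adjustment (illustrative helper).
weightedBH <- function(p, w) {
    stopifnot(length(p) == length(w))
    o <- order(p)
    adj <- numeric(length(p))
    # Sorted p-values are scaled by total weight over cumulative weight,
    # then made monotone from the largest p-value downwards.
    adj[o] <- rev(cummin(rev(sum(w) * p[o] / cumsum(w[o]))))
    pmin(adj, 1)
}

set.seed(100)
pvals <- rbeta(sum(n.windows), 1, 20)
adj.weighted <- weightedBH(pvals, weights)
adj.plain <- weightedBH(pvals, rep(1, length(pvals)))
```

With unit weights the helper reduces to the ordinary adjustment from `p.adjust(pvals, method="BH")`. With the reciprocal weights, the 100 small windows together count for no more than the 10 large ones.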

Users can cluster by the sign of log-fold changes, to obtain clusters of DB windows of the same sign. However, note that nested windows with opposite signs may give unintuitive results; see mergeWindows for details.

Value

A named list is returned containing:

id

a list of integer vectors indicating the cluster ID for each window in data.list

region

a GRanges object containing the coordinates for each cluster

FDR

a numeric scalar containing the cluster-level FDR estimate
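One way to work with the id element is to flatten it into a single table that records, for every window, the analysis it came from and the cluster it belongs to. The sketch below uses a hand-made list in place of real output, and it assumes that a window outside every cluster is marked with NA, which this page does not state.

```r
# Hand-made stand-in for cons$id: one integer vector per analysis.
# NA marks a window that was not assigned to any cluster (an assumption here).
id.list <- list(small=c(2L, 1L, 2L, NA, 2L), medium=c(1L, 2L, 2L), large=c(2L, 1L))

# Long-format table pairing every window with its analysis and cluster.
membership <- data.frame(
    analysis=rep(names(id.list), lengths(id.list)),
    window=sequence(lengths(id.list)),
    cluster=unlist(id.list, use.names=FALSE)
)

# Number of windows that each analysis contributes to each cluster.
counts <- table(membership$analysis, membership$cluster)
counts
```

The resulting table shows at a glance whether a cluster is supported by several window sizes or by only one of them.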

Author(s)

Aaron Lun

See Also

clusterWindows, consolidateSizes

Examples

# Loading csaw, which also attaches GenomicRanges for GRanges().
library(csaw)

# Making up some GRanges.
low <- GRanges("chrA", IRanges(runif(100, 1, 1000), width=5))
med <- GRanges("chrA", IRanges(runif(40, 1, 1000), width=10))
high <- GRanges("chrA", IRanges(runif(10, 1, 1000), width=20))

# Making up some DB results.
dbl <- data.frame(logFC=rnorm(length(low)), PValue=rbeta(length(low), 1, 20))
dbm <- data.frame(logFC=rnorm(length(med)), PValue=rbeta(length(med), 1, 20))
dbh <- data.frame(logFC=rnorm(length(high)), PValue=rbeta(length(high), 1, 20))
result.list <- list(dbl, dbm, dbh)

# Consolidating.
cons <- consolidateClusters(list(low, med, high), result.list, tol=20)
cons$region
cons$id
cons$FDR

# Without weights.
cons <- consolidateClusters(list(low, med, high), result.list, tol=20, equiweight=FALSE)
cons$FDR

# Using the signs.
cons <- consolidateClusters(list(low, med, high), result.list, tol=20, fc.col="logFC")

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(csaw)
> # Making up some GRanges.
> low <- GRanges("chrA", IRanges(runif(100, 1, 1000), width=5))
> med <- GRanges("chrA", IRanges(runif(40, 1, 1000), width=10))
> high <- GRanges("chrA", IRanges(runif(10, 1, 1000), width=20))
> 
> # Making up some DB results.
> dbl <- data.frame(logFC=rnorm(length(low)), PValue=rbeta(length(low), 1, 20))
> dbm <- data.frame(logFC=rnorm(length(med)), PValue=rbeta(length(med), 1, 20))
> dbh <- data.frame(logFC=rnorm(length(high)), PValue=rbeta(length(high), 1, 20))
> result.list <- list(dbl, dbm, dbh)
> 
> # Consolidating.
> cons <- consolidateClusters(list(low, med, high), result.list, tol=20)
Warning message:
In clusterWindows(all.data, all.result, weight = weights, ...) :
  unspecified 'target' for the cluster-level FDR set to 0.05
> cons$region
GRanges object with 2 ranges and 0 metadata columns:
      seqnames      ranges strand
         <Rle>   <IRanges>  <Rle>
  [1]     chrA [  3,  199]      *
  [2]     chrA [238, 1002]      *
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
> cons$id
[[1]]
  [1] 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 1 1 2 1 2 1 2 1 2 2 1 2 1 2 2 2 2 2 2 2
 [38] 2 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [75] 1 2 2 2 2 2 2 2 1 1 2 2 2 1 2 2 1 2 2 1 2 1 2 1 2 2

[[2]]
 [1] 1 2 2 2 2 2 1 2 1 1 1 2 1 2 2 2 2 2 2 1 2 2 1 1 1 2 1 2 2 2 2 1 2 2 2 2 1 2
[39] 2 2

[[3]]
 [1] 2 1 2 2 2 2 2 2 2 1

> cons$FDR
[1] 0
> 
> # Without weights.
> cons <- consolidateClusters(list(low, med, high), result.list, tol=20, equiweight=FALSE)
Warning message:
In clusterWindows(all.data, all.result, weight = weights, ...) :
  unspecified 'target' for the cluster-level FDR set to 0.05
> cons$FDR
[1] 0.09090909
> 
> # Using the signs.
> cons <- consolidateClusters(list(low, med, high), result.list, tol=20, fc.col="logFC")
There were 50 or more warnings (use warnings() to see the first 50)