Last data update: 2014.03.03

R: Find the overlapped peaks among two or more set of peaks.
findOverlapsOfPeaksR Documentation

Find the overlapped peaks among two or more set of peaks.

Description

Find the overlapping peaks for two or more (less than five) set of peak ranges.

Usage

findOverlapsOfPeaks(..., maxgap=0L, minoverlap=1L, 
                    ignore.strand=TRUE, connectedPeaks=c("min", "merge", "keepAll"))

Arguments

...

Objects of GRanges: See example below.

maxgap

Non-negative integer. Peak intervals with a separation of maxgap or less are considered to be overlapped.

minoverlap

Non-negative integer. Peak intervals with an overlapping of minoverlap or more are considered to be overlapped.

ignore.strand

When set to TRUE, the strand information is ignored in the overlap calculations.

connectedPeaks

If multiple peaks involved in overlapping in several groups, set it to "merge" will count it as 1, while set it to "min" will count it as the minimal involved peaks in any group of connected/overlapped peaks.

Details

Efficiently perform overlap queries with an interval tree implemented with GRanges.

Value

return value is An object of overlappingPeaks.

venn_cnt

an object of VennCounts

peaklist

a list consists of all overlapping peaks or unique peaks

overlappingPeaks

a list of data frame consists of the annotation of all the overlapped peaks

all.peaks

a list of GRanges object which contain the input peaks with formated rownames.

Author(s)

Jianhong Ou

References

1.Interval tree algorithm from: Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford. Introduction to Algorithms, second edition, MIT Press and McGraw-Hill. ISBN 0-262-53196-8

2.Zhu L.J. et al. (2010) ChIPpeakAnno: a Bioconductor package to annotate ChIP-seq and ChIP-chip data. BMC Bioinformatics 2010, 11:237doi:10.1186/1471-2105-11-237

3. Zhu L (2013). "Integrative analysis of ChIP-chip and ChIP-seq dataset." In Lee T and Luk ACS (eds.), Tilling Arrays, volume 1067, chapter 4, pp. -19. Humana Press. http://dx.doi.org/10.1007/978-1-62703-607-8_8, http://link.springer.com/protocol/10.1007%2F978-1-62703-607-8_8

See Also

annotatePeakInBatch, makeVennDiagram, getVennCounts, findOverlappingPeaks

Examples

peaks1 <- GRanges(seqnames=c(6,6,6,6,5),
                 IRanges(start=c(1543200,1557200,1563000,1569800,167889600),
                         end=c(1555199,1560599,1565199,1573799,167893599),
                         names=c("p1","p2","p3","p4","p5")),
                 strand="+")
peaks2 <- GRanges(seqnames=c(6,6,6,6,5),
                  IRanges(start=c(1549800,1554400,1565000,1569400,167888600),
                          end=c(1550599,1560799,1565399,1571199,167888999),
                          names=c("f1","f2","f3","f4","f5")),
                  strand="+")
t1 <- findOverlapsOfPeaks(peaks1, peaks2, maxgap=1000)
makeVennDiagram(t1)
t1$venn_cnt
t1$peaklist

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(ChIPpeakAnno)
Loading required package: grid
Loading required package: IRanges
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: Biostrings
Loading required package: XVector
Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: VennDiagram
Loading required package: futile.logger

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/ChIPpeakAnno/findOverlapsOfPeaks.Rd_%03d_medium.png", width=480, height=480)
> ### Name: findOverlapsOfPeaks
> ### Title: Find the overlapped peaks among two or more set of peaks.
> ### Aliases: findOverlapsOfPeaks overlappingPeaks overlappingPeaks-class
> ### Keywords: misc
> 
> ### ** Examples
> 
> peaks1 <- GRanges(seqnames=c(6,6,6,6,5),
+                  IRanges(start=c(1543200,1557200,1563000,1569800,167889600),
+                          end=c(1555199,1560599,1565199,1573799,167893599),
+                          names=c("p1","p2","p3","p4","p5")),
+                  strand="+")
> peaks2 <- GRanges(seqnames=c(6,6,6,6,5),
+                   IRanges(start=c(1549800,1554400,1565000,1569400,167888600),
+                           end=c(1550599,1560799,1565399,1571199,167888999),
+                           names=c("f1","f2","f3","f4","f5")),
+                   strand="+")
> t1 <- findOverlapsOfPeaks(peaks1, peaks2, maxgap=1000)
> makeVennDiagram(t1)
Missing totalTest! totalTest is required for HyperG test. 
If totalTest is missing, pvalue will be calculated by estimating 
the total binding sites of encoding region of human.
totalTest = humanGenomeSize * (2%(codingDNA) + 
             1%(regulationRegion)) / ( 2 * averagePeakWidth )
          = 3.3e+9 * 0.03 / ( 2 * averagePeakWidth)
          = 5e+7 /averagePeakWidth
$p.value
     peaks1 peaks2         pval
[1,]      1      1 8.816366e-19

$vennCounts
     peaks1 peaks2 Counts
[1,]      0      0      0
[2,]      0      1      0
[3,]      1      0      0
[4,]      1      1      5
attr(,"class")
[1] "VennCounts"

> t1$venn_cnt
     peaks1 peaks2 Counts
[1,]      0      0      0
[2,]      0      1      0
[3,]      1      0      0
[4,]      1      1      5
attr(,"class")
[1] "VennCounts"
> t1$peaklist
$`peaks1///peaks2`
GRanges object with 4 ranges and 1 metadata column:
      seqnames                 ranges strand |
         <Rle>              <IRanges>  <Rle> |
  [1]        6 [  1543200,   1560799]      + |
  [2]        6 [  1563000,   1565399]      + |
  [3]        6 [  1569400,   1573799]      + |
  [4]        5 [167888600, 167893599]      + |
                                 peakNames
                           <CharacterList>
  [1] peaks1__p1,peaks2__f1,peaks2__f2,...
  [2]                peaks1__p3,peaks2__f3
  [3]                peaks2__f4,peaks1__p4
  [4]                peaks2__f5,peaks1__p5
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>