Last data update: 2014.03.03

R: Put bin pairs into boxes
boxPairs    R Documentation

Put bin pairs into boxes

Description

Match smaller bin pairs to the larger bin pairs in which they are nested.

Usage

boxPairs(..., reference, minbox=FALSE, index.only=FALSE)

Arguments

...

One or more named InteractionSet objects produced by squareCounts, with smaller bin sizes than reference.

reference

An integer scalar specifying the reference bin size.

minbox

A logical scalar indicating whether coordinates for the minimum bounding box should be returned.

index.only

A logical scalar indicating whether only indices should be returned.

Details

Consider the bin size specified in reference. Pairs of these bins are referred to here as the parent bin pairs, and are described in the interactions element of the output. The function accepts one or more InteractionSet objects of bin pair data in the ellipsis, referred to here as input bin pairs. The aim is to identify the parent bin pair in which each input bin pair is nested.

All input InteractionSet objects in the ellipsis must be constructed carefully. In particular, the value of width in squareCounts must be such that reference is an exact multiple of each width. This is necessary to ensure complete nesting. Otherwise, the behavior of the function will not be clearly defined.
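
The multiple-of-width requirement can be checked before calling boxPairs. The following is a minimal sketch in base R; check_nesting is a hypothetical helper written for illustration and is not part of diffHic.

```r
# Hypothetical helper (not a diffHic function): verify that every input
# width divides the reference bin size exactly.
check_nesting <- function(reference, widths) {
    ok <- reference %% widths == 0
    if (!all(ok)) {
        stop("reference is not a multiple of width(s): ",
             paste(widths[!ok], collapse=", "))
    }
    invisible(TRUE)
}

check_nesting(10, c(1, 5))        # passes silently
try(check_nesting(10, c(1, 4)))   # error, since 10 is not a multiple of 4
```

In the Examples on this page, the width of each object is stored in its metadata, so the widths passed to such a check could be collected from there.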

In the output, one vector will be present in indices for each input InteractionSet in the ellipsis. In each vector, each entry represents an index for a single input bin pair in the corresponding InteractionSet. This index points to the entries in interactions that specify the coordinates of the parent bin pair. Thus, bin pairs with the same index are nested in the same parent.
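
Because bin pairs with the same index share a parent, an index vector can be used directly for grouping. The sketch below uses a made-up index vector in place of a real element of indices, and only base R functions.

```r
# Toy index vector in the style of one element of `indices`.
# The values are invented for illustration.
idx <- c(1L, 1L, 2L, 4L, 2L, 1L)

# Positions of the input bin pairs, grouped by parent index.
by.parent <- split(seq_along(idx), idx)
by.parent[["1"]]   # input bin pairs 1, 2 and 6 are nested in parent 1

# Number of input bin pairs nested in each parent.
n.nested <- table(idx)
```

Parents that contain no input bin pairs simply do not appear in the grouping.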

Some users may wish to identify bin pairs in one InteractionSet that are nested within bin pairs in another InteractionSet. This can be done by supplying both InteractionSet objects in the ellipsis, and leaving reference unspecified. The value of reference will be automatically selected as the largest width of the supplied InteractionSet objects. Nesting can be identified by matching the output indices for the smaller bin pairs to those of the larger bin pairs.
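
One way to perform this matching is with match() from base R. The sketch below copies the larger index vector from the reference=5 output shown under Results, and uses a few hand-picked values from the smaller vector.

```r
# Indices for the larger bin pairs, copied from the example output.
larger <- c(2, 4, 7, 11, 16, 5, 8, 12, 17, 9, 13, 18, 14, 19, 20)

# A few indices for smaller bin pairs, chosen for illustration.
smaller <- c(1, 2, 4, 21)

# For each smaller bin pair, the position of the larger bin pair
# that has the same parent index (NA if there is none).
m <- match(smaller, larger)
m   # NA 1 2 NA
```

An NA means that no larger bin pair occupies that parent. In the example data, parents 1 and 21 pair a bin with itself, and such pairs are absent from y2 because combn only generates pairs of distinct bins.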

If minbox=TRUE, the coordinates in interactions represent the minimum bounding box for all nested bin pairs in each parent. This may be more precise if nesting only occurs in a portion of the interaction space of the parent bin pair.
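
The effect of minbox can be reproduced by hand for a simple case. The sketch below mirrors the chrA portion of the Examples, where bins have width 1 and every pair of distinct bins is present. It illustrates the idea only and is not the package's implementation.

```r
# All pairs of distinct bins among 10 bins of width 1, as in the Examples.
# anchor1 is the later bin of each pair, anchor2 the earlier one.
combos <- combn(10, 2)
anchor1 <- combos[2, ]
anchor2 <- combos[1, ]

# With width 1, each bin starts and ends at its own index, so the minimum
# bounding box is the range of positions covered by each anchor.
box1 <- range(anchor1)   # 2 10
box2 <- range(anchor2)   # 1  9
```

This agrees with the first interaction reported for minbox=TRUE under Results, where the parent spanning [1, 10] on both anchors shrinks to [2, 10] and [1, 9].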

If index.only=TRUE, only the indices are returned and coordinates are not computed. This is largely for efficiency purposes when boxPairs is called by internal functions.

Value

If index.only=FALSE, a named list is returned containing:

indices:

A named list of integer vectors, one for every InteractionSet in the ellipsis; see Details.

interactions:

A ReverseStrictGInteractions object containing the coordinates of the parent bin pair or, if minbox=TRUE, the minimum bounding box.

If index.only=TRUE, the indices are returned directly without computing coordinates.

Author(s)

Aaron Lun

See Also

squareCounts, clusterPairs

Examples

# Setting up the objects.
a <- 10
b <- 20
cuts <- GRanges(rep(c("chrA", "chrB"), c(a, b)), IRanges(c(1:a, 1:b), c(1:a, 1:b)))
param <- pairParam(cuts)

all.combos <- combn(length(cuts), 2) # Bin size of 1.
y <- InteractionSet(matrix(0, ncol(all.combos), 1), 
    GInteractions(anchor1=all.combos[2,], anchor2=all.combos[1,], regions=cuts, mode="reverse"),
    colData=DataFrame(lib.size=1000), metadata=List(param=param, width=1))

a5 <- a/5
b5 <- b/5
all.combos2 <- combn(length(cuts)/5, 2) # Bin size of 5.
y2 <- InteractionSet(matrix(0, ncol(all.combos2), 1), 
    GInteractions(anchor1=all.combos2[2,], anchor2=all.combos2[1,], 
    	regions=GRanges(rep(c("chrA", "chrB"), c(a5, b5)), 
    		IRanges(c((1:a5-1)*5+1, (1:b5-1)*5+1), c(1:a5*5, 1:b5*5))), mode="reverse"),
    colData=DataFrame(lib.size=1000), metadata=List(param=param, width=5))

# Clustering.
boxPairs(reference=5, larger=y2, smaller=y)
boxPairs(reference=10, larger=y2, smaller=y)
boxPairs(reference=10, larger=y2, smaller=y, minbox=TRUE)
boxPairs(larger=y2, smaller=y)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)

> library(diffHic)
> ### Name: boxPairs
> ### Title: Put bin pairs into boxes
> ### Aliases: boxPairs
> ### Keywords: clustering
> 
> ### ** Examples
> 
> # Setting up the objects.
> a <- 10
> b <- 20
> cuts <- GRanges(rep(c("chrA", "chrB"), c(a, b)), IRanges(c(1:a, 1:b), c(1:a, 1:b)))
> param <- pairParam(cuts)
> 
> all.combos <- combn(length(cuts), 2) # Bin size of 1.
> y <- InteractionSet(matrix(0, ncol(all.combos), 1), 
+     GInteractions(anchor1=all.combos[2,], anchor2=all.combos[1,], regions=cuts, mode="reverse"),
+     colData=DataFrame(lib.size=1000), metadata=List(param=param, width=1))
> 
> a5 <- a/5
> b5 <- b/5
> all.combos2 <- combn(length(cuts)/5, 2) # Bin size of 5.
> y2 <- InteractionSet(matrix(0, ncol(all.combos2), 1), 
+     GInteractions(anchor1=all.combos2[2,], anchor2=all.combos2[1,], 
+     	regions=GRanges(rep(c("chrA", "chrB"), c(a5, b5)), 
+     		IRanges(c((1:a5-1)*5+1, (1:b5-1)*5+1), c(1:a5*5, 1:b5*5))), mode="reverse"),
+     colData=DataFrame(lib.size=1000), metadata=List(param=param, width=5))
> 
> # Clustering.
> boxPairs(reference=5, larger=y2, smaller=y)
$indices
$indices$larger
 [1]  2  4  7 11 16  5  8 12 17  9 13 18 14 19 20

$indices$smaller
  [1]  1  1  1  1  2  2  2  2  2  4  4  4  4  4  7  7  7  7  7 11 11 11 11 11 16
 [26] 16 16 16 16  1  1  1  2  2  2  2  2  4  4  4  4  4  7  7  7  7  7 11 11 11
 [51] 11 11 16 16 16 16 16  1  1  2  2  2  2  2  4  4  4  4  4  7  7  7  7  7 11
 [76] 11 11 11 11 16 16 16 16 16  1  2  2  2  2  2  4  4  4  4  4  7  7  7  7  7
[101] 11 11 11 11 11 16 16 16 16 16  2  2  2  2  2  4  4  4  4  4  7  7  7  7  7
[126] 11 11 11 11 11 16 16 16 16 16  3  3  3  3  5  5  5  5  5  8  8  8  8  8 12
[151] 12 12 12 12 17 17 17 17 17  3  3  3  5  5  5  5  5  8  8  8  8  8 12 12 12
[176] 12 12 17 17 17 17 17  3  3  5  5  5  5  5  8  8  8  8  8 12 12 12 12 12 17
[201] 17 17 17 17  3  5  5  5  5  5  8  8  8  8  8 12 12 12 12 12 17 17 17 17 17
[226]  5  5  5  5  5  8  8  8  8  8 12 12 12 12 12 17 17 17 17 17  6  6  6  6  9
[251]  9  9  9  9 13 13 13 13 13 18 18 18 18 18  6  6  6  9  9  9  9  9 13 13 13
[276] 13 13 18 18 18 18 18  6  6  9  9  9  9  9 13 13 13 13 13 18 18 18 18 18  6
[301]  9  9  9  9  9 13 13 13 13 13 18 18 18 18 18  9  9  9  9  9 13 13 13 13 13
[326] 18 18 18 18 18 10 10 10 10 14 14 14 14 14 19 19 19 19 19 10 10 10 14 14 14
[351] 14 14 19 19 19 19 19 10 10 14 14 14 14 14 19 19 19 19 19 10 14 14 14 14 14
[376] 19 19 19 19 19 14 14 14 14 14 19 19 19 19 19 15 15 15 15 20 20 20 20 20 15
[401] 15 15 20 20 20 20 20 15 15 20 20 20 20 20 15 20 20 20 20 20 20 20 20 20 20
[426] 21 21 21 21 21 21 21 21 21 21


$interactions
ReverseStrictGInteractions object with 21 interactions and 0 metadata columns:
       seqnames1   ranges1     seqnames2   ranges2
           <Rle> <IRanges>         <Rle> <IRanges>
   [1]      chrA   [1,  5] ---      chrA   [1,  5]
   [2]      chrA   [6, 10] ---      chrA   [1,  5]
   [3]      chrA   [6, 10] ---      chrA   [6, 10]
   [4]      chrB   [1,  5] ---      chrA   [1,  5]
   [5]      chrB   [1,  5] ---      chrA   [6, 10]
   ...       ...       ... ...       ...       ...
  [17]      chrB  [16, 20] ---      chrA  [ 6, 10]
  [18]      chrB  [16, 20] ---      chrB  [ 1,  5]
  [19]      chrB  [16, 20] ---      chrB  [ 6, 10]
  [20]      chrB  [16, 20] ---      chrB  [11, 15]
  [21]      chrB  [16, 20] ---      chrB  [16, 20]
  -------
  regions: 6 ranges and 1 metadata column
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

> boxPairs(reference=10, larger=y2, smaller=y)
$indices
$indices$larger
 [1] 1 2 2 4 4 2 2 4 4 3 5 5 5 5 6

$indices$smaller
  [1] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 1 1
 [38] 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
 [75] 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1
[112] 1 1 1 1 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1 1 1 1 2 2 2 2 2 2 2 2 2
[149] 2 4 4 4 4 4 4 4 4 4 4 1 1 1 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1 1 2
[186] 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4
[223] 4 4 4 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 3 5 5 5 5 5
[260] 5 5 5 5 5 3 3 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 3 3 3 3 3 3 3 5 5 5 5 5 5 5
[297] 5 5 5 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 3 3 3
[334] 3 5 5 5 5 5 5 5 5 5 5 3 3 3 5 5 5 5 5 5 5 5 5 5 3 3 5 5 5 5 5 5 5 5 5 5 3
[371] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
[408] 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6


$interactions
ReverseStrictGInteractions object with 6 interactions and 0 metadata columns:
      seqnames1   ranges1     seqnames2   ranges2
          <Rle> <IRanges>         <Rle> <IRanges>
  [1]      chrA  [ 1, 10] ---      chrA  [ 1, 10]
  [2]      chrB  [ 1, 10] ---      chrA  [ 1, 10]
  [3]      chrB  [ 1, 10] ---      chrB  [ 1, 10]
  [4]      chrB  [11, 20] ---      chrA  [ 1, 10]
  [5]      chrB  [11, 20] ---      chrB  [ 1, 10]
  [6]      chrB  [11, 20] ---      chrB  [11, 20]
  -------
  regions: 3 ranges and 1 metadata column
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

> boxPairs(reference=10, larger=y2, smaller=y, minbox=TRUE)
$indices
$indices$larger
 [1] 1 2 2 4 4 2 2 4 4 3 5 5 5 5 6

$indices$smaller
  [1] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 1 1
 [38] 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
 [75] 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1
[112] 1 1 1 1 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1 1 1 1 2 2 2 2 2 2 2 2 2
[149] 2 4 4 4 4 4 4 4 4 4 4 1 1 1 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1 1 2
[186] 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 1 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4
[223] 4 4 4 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 3 5 5 5 5 5
[260] 5 5 5 5 5 3 3 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 3 3 3 3 3 3 3 5 5 5 5 5 5 5
[297] 5 5 5 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 3 3 3
[334] 3 5 5 5 5 5 5 5 5 5 5 3 3 3 5 5 5 5 5 5 5 5 5 5 3 3 5 5 5 5 5 5 5 5 5 5 3
[371] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
[408] 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6


$interactions
ReverseStrictGInteractions object with 6 interactions and 0 metadata columns:
      seqnames1   ranges1     seqnames2   ranges2
          <Rle> <IRanges>         <Rle> <IRanges>
  [1]      chrA  [ 2, 10] ---      chrA  [ 1,  9]
  [2]      chrB  [ 1, 10] ---      chrA  [ 1, 10]
  [3]      chrB  [ 2, 10] ---      chrB  [ 1,  9]
  [4]      chrB  [11, 20] ---      chrA  [ 1, 10]
  [5]      chrB  [11, 20] ---      chrB  [ 1, 10]
  [6]      chrB  [12, 20] ---      chrB  [11, 19]
  -------
  regions: 9 ranges and 0 metadata columns
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

> boxPairs(larger=y2, smaller=y)
$indices
$indices$larger
 [1]  2  4  7 11 16  5  8 12 17  9 13 18 14 19 20

$indices$smaller
  [1]  1  1  1  1  2  2  2  2  2  4  4  4  4  4  7  7  7  7  7 11 11 11 11 11 16
 [26] 16 16 16 16  1  1  1  2  2  2  2  2  4  4  4  4  4  7  7  7  7  7 11 11 11
 [51] 11 11 16 16 16 16 16  1  1  2  2  2  2  2  4  4  4  4  4  7  7  7  7  7 11
 [76] 11 11 11 11 16 16 16 16 16  1  2  2  2  2  2  4  4  4  4  4  7  7  7  7  7
[101] 11 11 11 11 11 16 16 16 16 16  2  2  2  2  2  4  4  4  4  4  7  7  7  7  7
[126] 11 11 11 11 11 16 16 16 16 16  3  3  3  3  5  5  5  5  5  8  8  8  8  8 12
[151] 12 12 12 12 17 17 17 17 17  3  3  3  5  5  5  5  5  8  8  8  8  8 12 12 12
[176] 12 12 17 17 17 17 17  3  3  5  5  5  5  5  8  8  8  8  8 12 12 12 12 12 17
[201] 17 17 17 17  3  5  5  5  5  5  8  8  8  8  8 12 12 12 12 12 17 17 17 17 17
[226]  5  5  5  5  5  8  8  8  8  8 12 12 12 12 12 17 17 17 17 17  6  6  6  6  9
[251]  9  9  9  9 13 13 13 13 13 18 18 18 18 18  6  6  6  9  9  9  9  9 13 13 13
[276] 13 13 18 18 18 18 18  6  6  9  9  9  9  9 13 13 13 13 13 18 18 18 18 18  6
[301]  9  9  9  9  9 13 13 13 13 13 18 18 18 18 18  9  9  9  9  9 13 13 13 13 13
[326] 18 18 18 18 18 10 10 10 10 14 14 14 14 14 19 19 19 19 19 10 10 10 14 14 14
[351] 14 14 19 19 19 19 19 10 10 14 14 14 14 14 19 19 19 19 19 10 14 14 14 14 14
[376] 19 19 19 19 19 14 14 14 14 14 19 19 19 19 19 15 15 15 15 20 20 20 20 20 15
[401] 15 15 20 20 20 20 20 15 15 20 20 20 20 20 15 20 20 20 20 20 20 20 20 20 20
[426] 21 21 21 21 21 21 21 21 21 21


$interactions
ReverseStrictGInteractions object with 21 interactions and 0 metadata columns:
       seqnames1   ranges1     seqnames2   ranges2
           <Rle> <IRanges>         <Rle> <IRanges>
   [1]      chrA   [1,  5] ---      chrA   [1,  5]
   [2]      chrA   [6, 10] ---      chrA   [1,  5]
   [3]      chrA   [6, 10] ---      chrA   [6, 10]
   [4]      chrB   [1,  5] ---      chrA   [1,  5]
   [5]      chrB   [1,  5] ---      chrA   [6, 10]
   ...       ...       ... ...       ...       ...
  [17]      chrB  [16, 20] ---      chrA  [ 6, 10]
  [18]      chrB  [16, 20] ---      chrB  [ 1,  5]
  [19]      chrB  [16, 20] ---      chrB  [ 6, 10]
  [20]      chrB  [16, 20] ---      chrB  [11, 15]
  [21]      chrB  [16, 20] ---      chrB  [16, 20]
  -------
  regions: 6 ranges and 1 metadata column
  seqinfo: 2 sequences from an unspecified genome; no seqlengths
