Finds tuple overlaps between a GTuples or GTuplesList object,
and another object containing tuples or ranges.
NOTE: The findOverlaps generic function and
methods for Ranges and RangesList objects
are defined and documented in the IRanges package.
The methods for GenomicRanges and
GRangesList objects are defined and
documented in the GenomicRanges package.
The methods for GAlignments,
GAlignmentPairs, and
GAlignmentsList objects are defined and
documented in the GenomicAlignments package.
query, subject
A GTuples or GTuplesList object.
GRanges,
GRangesList, RangesList
and RangedData are also accepted for one of
query or subject.
type
See details below.
maxgap, minoverlap
See findOverlaps in the IRanges package for
a description of these arguments. These arguments have no effect if both
query and subject are GTuples objects and
type = "equal".
select
See findOverlaps in the IRanges package for
a description of this argument.
ignore.strand
When set to TRUE, the strand information is ignored in the
overlap calculations.
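As a rough illustration of what ignore.strand changes, the base R sketch below uses a hypothetical helper, strands_compatible(), which is not part of GenomicTuples or any other package. It assumes the convention described under Details, namely that a "*" strand is compatible with both "+" and "-".

```r
# Hypothetical helper (not part of GenomicTuples): TRUE where a pair of
# strand values is allowed to overlap.
strands_compatible <- function(a, b, ignore.strand = FALSE) {
  if (ignore.strand) {
    # Strand is disregarded, so every pair is compatible
    return(rep(TRUE, max(length(a), length(b))))
  }
  # "*" pairs with anything; otherwise the strands must be identical
  a == "*" | b == "*" | a == b
}

query_strand   <- c("+", "+", "*", "-")
subject_strand <- c("+", "-", "-", "-")

strict  <- strands_compatible(query_strand, subject_strand)
relaxed <- strands_compatible(query_strand, subject_strand,
                              ignore.strand = TRUE)
strict   # TRUE FALSE TRUE TRUE
relaxed  # TRUE TRUE TRUE TRUE
```

Only the "+"/"-" pairing is rejected when strand is taken into account; with ignore.strand = TRUE it is accepted like any other pair.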
Details
The findOverlaps-based methods involving genomic tuples,
either through GTuples or GTuplesList objects,
can search for tuple-tuple, tuple-range and range-tuple
overlaps. Each of these is described below, with attention paid to
the important special case of finding "equal tuple-tuple overlaps".
Equal tuple-tuple overlaps
When the query and the subject are both
GTuples objects and type = "equal",
findOverlaps uses the seqnames (seqnames), positions
(tuples,GTuples-method) and strand (strand)
to determine which tuples from the query exactly match those in
the subject, where a strand value of "*" is treated
as occurring on both the "+" and "-" strands. An overlap is
recorded when a tuple in the query and a tuple in the
subject have the same sequence name, have a compatible pairing of
strands (e.g. "+"/"+", "-"/"-",
"*"/"+", "*"/"-", etc.), and have
identical positions.
NOTE: Equal tuple-tuple overlaps can only be computed if
size(query) is equal to size(subject).
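The matching rule above can be sketched in base R, without any Bioconductor classes. This is only an illustration of the rule, not the package's implementation: equal_tuple_hits() is a hypothetical helper, and plain vectors and a matrix stand in for the 3-tuples used in the Examples section.

```r
# Plain vectors standing in for the 3-tuples used in the Examples section
seqnames <- c("chr1", "chr1", "chr1", "chr1", "chr2")
tuples <- matrix(c(10L, 10L, 10L, 10L, 10L,
                   20L, 20L, 20L, 25L, 20L,
                   30L, 30L, 35L, 30L, 30L), ncol = 3)
strand <- c("+", "-", "*", "+", "+")

# Hypothetical helper: all (query, subject) pairs that match exactly
equal_tuple_hits <- function(q_seq, q_pos, q_strand, s_seq, s_pos, s_strand) {
  # Equal overlaps are only defined for tuples of the same size
  stopifnot(ncol(q_pos) == ncol(s_pos))
  # Enumerate every pair, ordered by query index and then subject index
  pairs <- expand.grid(subjectHits = seq_along(s_seq),
                       queryHits = seq_along(q_seq))
  qi <- pairs$queryHits
  si <- pairs$subjectHits
  same_seq <- q_seq[qi] == s_seq[si]
  # Every position, including the internal ones, must be identical
  same_pos <- rowSums(q_pos[qi, , drop = FALSE] !=
                        s_pos[si, , drop = FALSE]) == 0
  ok_strand <- q_strand[qi] == "*" | s_strand[si] == "*" |
    q_strand[qi] == s_strand[si]
  keep <- same_seq & same_pos & ok_strand
  data.frame(queryHits = qi[keep], subjectHits = si[keep])
}

self_hits <- equal_tuple_hits(seqnames, tuples, strand,
                              seqnames, tuples, strand)
nrow(self_hits)  # 5: each tuple matches only itself

# A single unstranded query tuple matches both the "+" and "-" copies
star_hits <- equal_tuple_hits("chr1", matrix(c(10L, 20L, 30L), ncol = 3), "*",
                              seqnames, tuples, strand)
star_hits$subjectHits  # 1 2
```

The first and second tuples share a sequence name and positions but sit on opposite strands, so they do not match each other; the unstranded query in the second call matches both.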
Other tuple-tuple overlaps
When the query and the subject are GTuples or
GTuplesList objects and type = "any",
"start", "end" or "within", findOverlaps
treats the tuples as if they were ranges, with ranges given by
[pos_{1}, pos_{m}], where m is the
size (see size,GTuples-method) of
the tuples. This is done via inheritance so that a GTuples
(resp. GTuplesList) object is treated as a
GRanges (resp.
GRangesList) and the appropriate
findOverlaps method is dispatched upon.
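To make the "tuples as ranges" treatment concrete, the base R sketch below reduces each tuple of the Examples section to the range [pos_{1}, pos_{m}] and counts self-overlaps for each type. It is a simplified illustration that assumes the default maxgap and minoverlap; count_range_hits() is a hypothetical helper and the overlap definitions are written out by hand rather than taken from the IRanges code.

```r
# Plain vectors standing in for the 3-tuples used in the Examples section
seqnames <- c("chr1", "chr1", "chr1", "chr1", "chr2")
tuples <- matrix(c(10L, 10L, 10L, 10L, 10L,
                   20L, 20L, 20L, 25L, 20L,
                   30L, 30L, 35L, 30L, 30L), ncol = 3)
strand <- c("+", "-", "*", "+", "+")

# Only the first and last positions matter once tuples are treated as ranges
range_start <- tuples[, 1]
range_end <- tuples[, ncol(tuples)]

# Hypothetical helper: number of (query, subject) self-overlaps for a type
count_range_hits <- function(type) {
  n <- length(seqnames)
  pairs <- expand.grid(s = seq_len(n), q = seq_len(n))
  q <- pairs$q
  s <- pairs$s
  compatible <- seqnames[q] == seqnames[s] &
    (strand[q] == "*" | strand[s] == "*" | strand[q] == strand[s])
  hit <- switch(type,
    any    = range_start[q] <= range_end[s] & range_end[q] >= range_start[s],
    start  = range_start[q] == range_start[s],
    end    = range_end[q] == range_end[s],
    within = range_start[q] >= range_start[s] & range_end[q] <= range_end[s])
  sum(compatible & hit)
}

counts <- sapply(c("any", "start", "end", "within"), count_range_hits)
counts  # any 13, start 13, end 7, within 10
```

These counts agree with the number of hits reported for the corresponding findOverlaps(gt3, gt3, type = ...) calls in the session output under Results. The internal position pos2 plays no part in any of them.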
NOTE: This is the only type of overlap finding available
when either the query or the subject is a
GTuplesList object. This follows the behaviour of
findOverlaps,GRangesList,GRangesList-method
that allows type = "any", "start", "end" or
"within" but does not allow type = "equal".
Tuple-range and range-tuple overlaps
When one of the query or the subject is not a
GTuples or GTuplesList object,
findOverlaps treats the tuples as if they were ranges, with ranges
given by [pos_{1}, pos_{m}], where m is the
size (see size,GTuples-method) of the tuples. This is done via
inheritance so that a GTuples (resp.
GTuplesList) object is treated as a
GRanges (resp.
GRangesList) and the appropriate
findOverlaps method is dispatched upon.
In the context of findOverlaps, a feature is a collection of
tuples/ranges that are treated as a single entity. For GTuples
objects, a feature is a single tuple; while for GTuplesList
objects, a feature is a list element containing a set of tuples. In the
results, the features are referred to by number, running from 1 to
length(query) for query features and from 1 to length(subject) for
subject features.
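The notion of a feature can be illustrated with a small base R sketch. It starts from the tuple-level hits obtained when the five tuples of the Examples section are treated as ranges, and collapses them to list-element hits for a grouping that mirrors GTuplesList(A = gt3[1:3], B = gt3[4:5]). The variable names are hypothetical and the code is only a conceptual model of how hits are numbered, not the package's implementation.

```r
# Tuple-level hits (query tuple index, subject tuple index) when the five
# example tuples are treated as ranges and overlapped against themselves
tuple_hits <- data.frame(
  queryTuple  = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5),
  subjectHits = c(1, 3, 4, 2, 3, 1, 2, 3, 4, 1, 3, 4, 5))

# Hypothetical grouping: tuples 1-3 form feature 1 (A), tuples 4-5 feature 2 (B)
feature_of_tuple <- c(1L, 1L, 1L, 2L, 2L)

# A feature hits a subject if any of its tuples does; duplicates are dropped
feature_hits <- unique(data.frame(
  queryHits   = feature_of_tuple[tuple_hits$queryTuple],
  subjectHits = tuple_hits$subjectHits))

nrow(feature_hits)                                    # 8
per_feature <- as.vector(table(feature_hits$queryHits))
per_feature                                           # 4 4
```

The query indices in the result refer to list elements (1 for A, 2 for B) rather than to individual tuples, which is why queryLength is 2 in the corresponding session output under Results.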
Value
For findOverlaps either a Hits object when
select = "all" or an integer vector otherwise.
For countOverlaps an integer vector containing the tabulated
query overlap hits.
For overlapsAny a logical vector of length equal to the number of
tuples/ranges in query indicating those that overlap any of the
tuples/ranges in subject.
For subsetByOverlaps an object of the same class as query
containing the subset that overlapped at least one entity in subject.
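The four return values are closely related, and the relationship can be sketched in base R from a table of hits. The hits, query_length and query_labels objects below are made up for illustration, and the derivations are a simplified model (for example, select = "first" is assumed to mean the lowest-numbered subject hit, with NA for queries without hits) rather than the actual implementation.

```r
# Made-up hits between a query of length 4 and some subject
hits <- data.frame(queryHits = c(1L, 1L, 3L), subjectHits = c(2L, 3L, 1L))
query_length <- 4L
query_labels <- c("q1", "q2", "q3", "q4")

# countOverlaps analogue: number of hits per query element
counts <- tabulate(hits$queryHits, nbins = query_length)

# overlapsAny analogue: does each query element have at least one hit?
any_hit <- counts > 0

# findOverlaps(select = "first") analogue: lowest subject index, NA if no hit
first_by_query <- tapply(hits$subjectHits, hits$queryHits, min)
first_hit <- rep(NA_integer_, query_length)
first_hit[as.integer(names(first_by_query))] <- first_by_query

# subsetByOverlaps analogue: keep the query elements that have a hit
kept <- query_labels[any_hit]

counts     # 2 0 1 0
any_hit    # TRUE FALSE TRUE FALSE
first_hit  # 2 NA 1 NA
kept       # "q1" "q3"
```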
For RangedData and RangesList, with the exception of
subsetByOverlaps, the results align to the unlisted
form of the object. This turns out to be fairly convenient for
RangedData (not so much for RangesList, but something
has to give).
Author(s)
Peter Hickey for methods involving GTuples and
GTuplesList. P. Aboyoun, S. Falcon, M. Lawrence,
N. Gopalakrishnan, H. Pages and H. Corrada Bravo for all the real work
underlying the powerful findOverlaps functionality.
See Also
Please see the package vignette for an extended discussion of
overlaps involving genomic tuples, which is available by typing
vignette(topic = 'GenomicTuplesIntroduction', package = 'GenomicTuples')
at the R prompt.
findOverlaps
Hits-class
GTuples-class
GTuplesList-class
GRanges-class
GRangesList-class
Examples
## GTuples object containing 3-tuples:
gt3 <- GTuples(seqnames = c('chr1', 'chr1', 'chr1', 'chr1', 'chr2'),
               tuples = matrix(c(10L, 10L, 10L, 10L, 10L, 20L, 20L, 20L, 25L,
                                 20L, 30L, 30L, 35L, 30L, 30L), ncol = 3),
               strand = c('+', '-', '*', '+', '+'))
## GTuplesList object
gtl3 <- GTuplesList(A = gt3[1:3], B = gt3[4:5])
## Find equal genomic tuples:
findOverlaps(gt3, gt3, type = 'equal')
## Note that this is different to the results if the tuples are treated as
## ranges since this ignores the "internal positions" (pos2):
findOverlaps(granges(gt3), granges(gt3), type = 'equal')
## Scenarios where tuples are treated as ranges:
findOverlaps(gt3, gt3, type = 'any')
findOverlaps(gt3, gt3, type = 'start')
findOverlaps(gt3, gt3, type = 'end')
findOverlaps(gt3, gt3, type = 'within')
## Overlapping a GTuples and a GTuplesList object (tuples treated as ranges):
table(!is.na(findOverlaps(gtl3, gt3, select="first")))
countOverlaps(gtl3, gt3)
findOverlaps(gtl3, gt3)
subsetByOverlaps(gtl3, gt3)
countOverlaps(gtl3, gt3, type = "start")
findOverlaps(gtl3, gt3, type = "start")
subsetByOverlaps(gtl3, gt3, type = "start")
findOverlaps(gtl3, gt3, select = "first")
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(GenomicTuples)
Loading required package: GenomicRanges
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
> ### Name: findOverlaps-methods
> ### Title: Finding overlapping genomic tuples
> ### Aliases: findOverlaps-methods findOverlaps
> ### findOverlaps,GTuples,GTuples-method
> ### findOverlaps,GTuplesList,GTuples-method
> ### findOverlaps,GTuples,GTuplesList-method
> ### findOverlaps,GTuplesList,GTuplesList-method
> ### findOverlaps,RangesList,GTuples-method
> ### findOverlaps,RangesList,GTuplesList-method
> ### findOverlaps,GTuples,RangesList-method
> ### findOverlaps,GTuplesList,RangesList-method
> ### findOverlaps,RangedData,GTuples-method
> ### findOverlaps,RangedData,GTuplesList-method
> ### findOverlaps,GTuples,RangedData-method
> ### findOverlaps,GTuplesList,RangedData-method
> ### findOverlaps,GTuples,GenomicRanges-method
> ### findOverlaps,GenomicRanges,GTuples-method
> ### findOverlaps,GTuplesList,GenomicRanges-method
> ### findOverlaps,GenomicRanges,GTuplesList-method
> ### findOverlaps,GTuples,GRangesList-method
> ### findOverlaps,GRangesList,GTuples-method
> ### findOverlaps,GTuplesList,GRangesList-method
> ### findOverlaps,GRangesList,GTuplesList-method countOverlaps
> ### countOverlaps,GTuples,GTuples-method
> ### countOverlaps,GTuplesList,GTuplesList-method
> ### countOverlaps,GTuples,GTuplesList-method
> ### countOverlaps,GTuplesList,GTuples-method
> ### countOverlaps,GTuples,Vector-method
> ### countOverlaps,Vector,GTuples-method
> ### countOverlaps,GTuplesList,Vector-method
> ### countOverlaps,Vector,GTuplesList-method
> ### countOverlaps,GTuples,GenomicRanges-method
> ### countOverlaps,GenomicRanges,GTuples-method
> ### countOverlaps,GTuples,GenomicRangesList-method
> ### countOverlaps,GenomicRangesList,GTuples-method
> ### countOverlaps,GTuplesList,GenomicRanges-method
> ### countOverlaps,GenomicRanges,GTuplesList-method
> ### countOverlaps,GTuplesList,GenomicRangesList-method
> ### countOverlaps,GenomicRangesList,GTuplesList-method overlapsAny
> ### overlapsAny,GTuples,GTuples-method
> ### overlapsAny,GTuplesList,GTuples-method
> ### overlapsAny,GTuples,GTuplesList-method
> ### overlapsAny,GTuplesList,GtuplesList-method
> ### overlapsAny,RangesList,GTuples-method
> ### overlapsAny,RangesList,GTuplesList-method
> ### overlapsAny,GTuples,RangesList-method
> ### overlapsAny,GTuplesList,RangesList-method
> ### overlapsAny,RangedData,GTuples-method
> ### overlapsAny,RangedData,GTuplesList-method
> ### overlapsAny,GTuples,RangedData-method
> ### overlapsAny,GTuplesList,RangedData-method
> ### overlapsAny,GTuples,GRanges-method overlapsAny,GRanges,GTuples-method
> ### overlapsAny,GTuples,GRangesList-method
> ### overlapsAny,GRangesList,GTuples-method
> ### overlapsAny,GTuplesList,GRanges-method
> ### overlapsAny,GRanges,GTuplesList-method
> ### overlapsAny,GTuplesList,GRangesList-method
> ### overlapsAny,GRangesList,GTuplesList-method subsetByOverlaps
> ### subsetByOverlaps,GTuples,GTuples-method
> ### subsetByOverlaps,GTuplesList,GTuples-method
> ### subsetByOverlaps,GTuples,GTuplesList-method
> ### subsetByOverlaps,GTuplesList,GTuplesList-method
> ### subsetByOverlaps,RangesList,GTuples-method
> ### subsetByOverlaps,RangesList,GTuplesList-method
> ### subsetByOverlaps,GTuples,RangesList-method
> ### subsetByOverlaps,GTuplesList,RangesList-method
> ### subsetByOverlaps,RangedData,GTuples-method
> ### subsetByOverlaps,RangedData,GTuplesList-method
> ### subsetByOverlaps,GTuples,RangedData-method
> ### subsetByOverlaps,GTuplesList,RangedData-method
> ### subsetByOverlaps,GTuples,GRanges-method
> ### subsetByOverlaps,GRanges,GTuples-method
> ### subsetByOverlaps,GTuples,GRangesList-method
> ### subsetByOverlaps,GRangesList,GTuples-method
> ### subsetByOverlaps,GTuplesList,GRanges-method
> ### subsetByOverlaps,GRanges,GTuplesList-method
> ### subsetByOverlaps,GTuplesList,GRangesList-method
> ### subsetByOverlaps,GRangesList,GTuplesList-method
> ### Keywords: methods utilities
>
> ### ** Examples
>
> ## GTuples object containing 3-tuples:
> gt3 <- GTuples(seqnames = c('chr1', 'chr1', 'chr1', 'chr1', 'chr2'),
+ tuples = matrix(c(10L, 10L, 10L, 10L, 10L, 20L, 20L, 20L, 25L,
+ 20L, 30L, 30L, 35L, 30L, 30L), ncol = 3),
+ strand = c('+', '-', '*', '+', '+'))
>
> ## GTuplesList object
> gtl3 <- GTuplesList(A = gt3[1:3], B = gt3[4:5])
>
> ## Find equal genomic tuples:
> findOverlaps(gt3, gt3, type = 'equal')
Hits object with 5 hits and 0 metadata columns:
      queryHits subjectHits
      <integer>   <integer>
  [1]         1           1
  [2]         2           2
  [3]         3           3
  [4]         4           4
  [5]         5           5
  -------
  queryLength: 5 / subjectLength: 5
> ## Note that this is different to the results if the tuples are treated as
> ## ranges since this ignores the "internal positions" (pos2):
> findOverlaps(granges(gt3), granges(gt3), type = 'equal')
Hits object with 7 hits and 0 metadata columns:
      queryHits subjectHits
      <integer>   <integer>
  [1]         1           1
  [2]         1           4
  [3]         2           2
  [4]         3           3
  [5]         4           1
  [6]         4           4
  [7]         5           5
  -------
  queryLength: 5 / subjectLength: 5
>
> ## Scenarios where tuples are treated as ranges:
> findOverlaps(gt3, gt3, type = 'any')
Hits object with 13 hits and 0 metadata columns:
       queryHits subjectHits
       <integer>   <integer>
   [1]         1           3
   [2]         1           1
   [3]         1           4
   [4]         2           3
   [5]         2           2
   ...       ...         ...
   [9]         3           4
  [10]         4           3
  [11]         4           1
  [12]         4           4
  [13]         5           5
  -------
  queryLength: 5 / subjectLength: 5
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)
> findOverlaps(gt3, gt3, type = 'start')
Hits object with 13 hits and 0 metadata columns:
       queryHits subjectHits
       <integer>   <integer>
   [1]         1           3
   [2]         1           1
   [3]         1           4
   [4]         2           3
   [5]         2           2
   ...       ...         ...
   [9]         3           4
  [10]         4           3
  [11]         4           1
  [12]         4           4
  [13]         5           5
  -------
  queryLength: 5 / subjectLength: 5
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)
> findOverlaps(gt3, gt3, type = 'end')
Hits object with 7 hits and 0 metadata columns:
      queryHits subjectHits
      <integer>   <integer>
  [1]         1           1
  [2]         1           4
  [3]         2           2
  [4]         3           3
  [5]         4           1
  [6]         4           4
  [7]         5           5
  -------
  queryLength: 5 / subjectLength: 5
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)
> findOverlaps(gt3, gt3, type = 'within')
Hits object with 10 hits and 0 metadata columns:
       queryHits subjectHits
       <integer>   <integer>
   [1]         1           3
   [2]         1           1
   [3]         1           4
   [4]         2           3
   [5]         2           2
   [6]         3           3
   [7]         4           3
   [8]         4           1
   [9]         4           4
  [10]         5           5
  -------
  queryLength: 5 / subjectLength: 5
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)
>
> ## Overlapping a GTuples and a GTuplesList object (tuples treated as ranges):
> table(!is.na(findOverlaps(gtl3, gt3, select="first")))
TRUE
2
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)
> countOverlaps(gtl3, gt3)
A B
4 4
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)
> findOverlaps(gtl3, gt3)
Hits object with 8 hits and 0 metadata columns:
      queryHits subjectHits
      <integer>   <integer>
  [1]         1           3
  [2]         1           1
  [3]         1           4
  [4]         1           2
  [5]         2           3
  [6]         2           1
  [7]         2           4
  [8]         2           5
  -------
  queryLength: 2 / subjectLength: 5
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)
> subsetByOverlaps(gtl3, gt3)
GTuplesList object of length 2:
$A
GTuples object with 3 x 3-tuples and 0 metadata columns:
      seqnames pos1 pos2 pos3 strand
  [1]     chr1   10   20   30      +
  [2]     chr1   10   20   30      -
  [3]     chr1   10   20   35      *
$B
GTuples object with 2 x 3-tuples and 0 metadata columns:
      seqnames pos1 pos2 pos3 strand
  [1]     chr1   10   25   30      +
  [2]     chr2   10   20   30      +
-------
seqinfo: 2 sequences from an unspecified genome; no seqlengths
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)
> countOverlaps(gtl3, gt3, type = "start")
A B
4 4
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)
> findOverlaps(gtl3, gt3, type = "start")
Hits object with 8 hits and 0 metadata columns:
      queryHits subjectHits
      <integer>   <integer>
  [1]         1           3
  [2]         1           1
  [3]         1           4
  [4]         1           2
  [5]         2           3
  [6]         2           1
  [7]         2           4
  [8]         2           5
  -------
  queryLength: 2 / subjectLength: 5
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)
> subsetByOverlaps(gtl3, gt3, type = "start")
GTuplesList object of length 2:
$A
GTuples object with 3 x 3-tuples and 0 metadata columns:
      seqnames pos1 pos2 pos3 strand
  [1]     chr1   10   20   30      +
  [2]     chr1   10   20   30      -
  [3]     chr1   10   20   35      *
$B
GTuples object with 2 x 3-tuples and 0 metadata columns:
      seqnames pos1 pos2 pos3 strand
  [1]     chr1   10   25   30      +
  [2]     chr2   10   20   30      +
-------
seqinfo: 2 sequences from an unspecified genome; no seqlengths
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)
> findOverlaps(gtl3, gt3, select = "first")
[1] 1 1
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
'type' is not 'equal' so coercing 'query' and 'subject' to 'GRanges' objects (see docs for details)