These functions support the import and export of the UCSC BigWig
format, a compressed, binary form of WIG/BEDGraph with a spatial index
and precomputed summaries. These functions do not work on Windows.
Usage
## S4 method for signature 'BigWigFile,ANY,ANY'
import(con, format, text,
selection = BigWigSelection(which, ...),
which = con, asRle = FALSE,
as = c("GRanges", "RleList", "NumericList"), ...)
import.bw(con, ...)
## S4 method for signature 'ANY,BigWigFile,ANY'
export(object, con, format, ...)
## S4 method for signature 'GenomicRanges,BigWigFile,ANY'
export(object, con, format,
dataFormat = c("auto", "variableStep", "fixedStep",
"bedGraph"), compress = TRUE, fixedSummaries = FALSE)
export.bw(object, con, ...)
Arguments
con
A path, URL or BigWigFile object. Connections are
not supported. For the functions ending in .bw, the file
format is indicated by the function name. For the export
and import methods, the format must be indicated another
way. If con is a path, or URL, either the file
extension or the format argument needs to be “bigWig”
or “bw”.
object
The object to export, should be an RleList,
IntegerList, NumericList,
GRanges or something coercible to a GRanges.
format
If not missing, should be “bigWig” or “bw”
(case insensitive).
text
Not supported.
as
Specifies the class of the return object. Default is
GRanges, which has one range per range in the file, and a
score column holding the value for each range. For
NumericList, one numeric vector is returned for each range in
the selection argument. For RleList, there is one
Rle per sequence, and that Rle spans the entire
sequence.
asRle
Deprecated. Use as instead.
If TRUE, the BigWig file is assumed to contain
contiguous ranges that define a run-length encoding of a
vector (like coverage), and a RleList is returned.
selection
A BigWigSelection object
indicating the ranges to load.
which
A range data structure coercible to RangesList,
like a GRanges, or a BigWigFile. Only the intervals in
the file overlapping the given ranges are returned. By default, the
value is the BigWigFile itself. Its Seqinfo object is
extracted and coerced to a RangesList that represents the
entirety of the file.
dataFormat
Probably best left to “auto”. Exists only
for historical reasons.
compress
If TRUE, compress the data. No reason to change this.
fixedSummaries
If TRUE, compute summaries at fixed
resolutions corresponding to the default zoom levels in the Ensembl
genome browser (with some extrapolation): 30X, 65X, 130X, 260X,
450X, 648X, 950X, 1296X, 4800X, 19200X. Otherwise, the resolutions
are dynamically determined by an algorithm that computes an initial
summary size by initializing to 10X the size of the smallest feature
and doubling the size as needed until the size of the summary is
less than half that of the data (or there are no further gains). It
then computes up to 10 more levels of summary, quadrupling the size
each time, until the summaries start to exceed the sequence size.
...
Arguments to pass down to methods to other methods. For
import, the flow eventually reaches the BigWigFile method on
import.
Value
A GRanges (default), RleList or NumericList.
GRanges return ranges with non-zero score values in a score
metadata column. The length of the NumericList is the same length
as the selection argument (one list element per range).
The return order in the NumericList matches the order of the
BigWigSelection object.
BigWigFile objects
A BigWigFile object, an extension of
RTLFile is a reference to a BigWig file. To cast
a path, URL or connection to a BigWigFile, pass it to the
BigWigFile constructor.
BigWig files are more complex than most track files, and there are a
number of methods on BigWigFile for accessing the additional
information:
seqinfo(x):
Gets the Seqinfo object
indicating the lengths of the sequences for the intervals in the
file. No circularity or genome information is available.
summary(ranges = as(seqinfo(object), "GenomicRanges"), size
= 1L, type = c("mean", "min", "max", "coverage", "sd"),
defaultValue = NA_real_), ...: Aggregates the intervals in the file
that fall into ranges, which should be something
coercible to GRanges. The aggregation essentially
compresses each sequence to a length of size. The
algorithm is specified by type; available algorithms
include the mean, min, max, coverage (percent sequence covered
by at least one feature), and standard deviation. When a window
contains no features, defaultValue is assumed. The result
is an RleList, with an
element for each element in ranges. The
driving use case for this is visualization of coverage when the
screen space is small compared to the viewed portion of the
sequence. The operation is very fast, as it leverages cached
multi-level summaries present in every BigWig file.
If a summary statistic is not available / cannot be computed
for a given range a warning is thrown and the defaultValue
NA_real_ is returned.
BigWigFileList objects
A BigWigFileList() provides a convenient way of managing a list
of BigWigFile instances.
Author(s)
Michael Lawrence
See Also
wigToBigWig for converting a WIG file to BigWig.
Examples
if (.Platform$OS.type != "windows") {
test_path <- system.file("tests", package = "rtracklayer")
test_bw <- file.path(test_path, "test.bw")
## GRanges
## Returns ranges with non-zero scores.
gr <- import(test_bw)
gr
which <- GRanges(c("chr2", "chr2"), IRanges(c(1, 300), c(400, 1000)))
import(test_bw, which = which)
## RleList
## Scores returned as an RleList is equivalent to the coverage.
## Best option when 'which' or 'selection' contain many small ranges.
mini <- narrow(unlist(tile(which, 50)), 2)
rle <- import(test_bw, which = mini, as = "RleList")
rle
## NumericList
## The 'which' is stored as metadata:
track <- import(test_bw, which = which, as = "NumericList")
metadata(track)
## Not run:
test_bw_out <- file.path(tempdir(), "test_out.bw")
export(test, test_bw_out)
## End(Not run)
bwf <- BigWigFile(test_bw)
track <- import(bwf)
seqinfo(bwf)
summary(bwf) # for each sequence, average all values into one
summary(bwf, range(head(track))) # just average the first few features
summary(bwf, size = seqlengths(bwf) / 10) # 10X reduction
summary(bwf, type = "min") # min instead of mean
}
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(rtracklayer)
Loading required package: GenomicRanges
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/rtracklayer/BigWigFile.Rd_%03d_medium.png", width=480, height=480)
> ### Name: BigWigFile-class
> ### Title: BigWig Import and Export
> ### Aliases: class:BigWigFile BigWigFile-class class:BWFile BWFile-class
> ### class:BigWigFileList BigWigFileList-class BigWigFile BWFile
> ### BigWigFileList seqinfo,BigWigFile-method import.bw
> ### import.bw,ANY-method import,BigWigFile,ANY,ANY-method export.bw
> ### export.bw,ANY-method export,ANY,BigWigFile,ANY-method
> ### export,GenomicRanges,BigWigFile,ANY-method
> ### export,List,BigWigFile,ANY-method summary,BigWigFile-method
> ### path,BigWigFileList-method
> ### Keywords: methods classes
>
> ### ** Examples
>
> if (.Platform$OS.type != "windows") {
+ test_path <- system.file("tests", package = "rtracklayer")
+ test_bw <- file.path(test_path, "test.bw")
+
+ ## GRanges
+ ## Returns ranges with non-zero scores.
+ gr <- import(test_bw)
+ gr
+
+ which <- GRanges(c("chr2", "chr2"), IRanges(c(1, 300), c(400, 1000)))
+ import(test_bw, which = which)
+
+ ## RleList
+ ## Scores returned as an RleList is equivalent to the coverage.
+ ## Best option when 'which' or 'selection' contain many small ranges.
+ mini <- narrow(unlist(tile(which, 50)), 2)
+ rle <- import(test_bw, which = mini, as = "RleList")
+ rle
+
+ ## NumericList
+ ## The 'which' is stored as metadata:
+ track <- import(test_bw, which = which, as = "NumericList")
+ metadata(track)
+
+ ## Not run:
+ ##D test_bw_out <- file.path(tempdir(), "test_out.bw")
+ ##D export(test, test_bw_out)
+ ## End(Not run)
+
+ bwf <- BigWigFile(test_bw)
+ track <- import(bwf)
+
+ seqinfo(bwf)
+
+ summary(bwf) # for each sequence, average all values into one
+ summary(bwf, range(head(track))) # just average the first few features
+ summary(bwf, size = seqlengths(bwf) / 10) # 10X reduction
+ summary(bwf, type = "min") # min instead of mean
+ }
GRangesList object of length 2:
[[1]]
GRanges object with 1 range and 1 metadata column:
seqnames ranges strand | score
<Rle> <IRanges> <Rle> | <numeric>
[1] chr2 [1, 243199373] * | -1
[[2]]
GRanges object with 1 range and 1 metadata column:
seqnames ranges strand | score
[1] chr19 [1, 59128983] * | 0.25
-------
seqinfo: 2 sequences from an unspecified genome; no seqlengths
>
>
>
>
>
> dev.off()
null device
1
>