Last data update: 2014.03.03

subsetSummary

R Documentation

Compute summaries for cumulative subsets of a short-read data set.

Description

THIS FUNCTION IS DEFUNCT!

Divides a short-read dataset into several subsets, and computes various summaries cumulatively. The goal is to study the characteristics of the data as a function of sample size.
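The idea of cumulative subsets can be illustrated with a minimal base-R sketch. Since `subsetSummary` is defunct, this is not its implementation: the helper `cumulative_subsets` and the made-up read start positions are assumptions for illustration only, and subset sizes are rounded to the nearest whole read.

```r
# Hypothetical helper (not part of any package): build cumulative subsets
# of a vector of read start positions, one subset per proportion in `props`.
cumulative_subsets <- function(starts, props = seq(0.1, 1, 0.1),
                               resample = TRUE) {
  stopifnot(all(props > 0), all(props <= 1), all(diff(props) > 0))
  n <- length(starts)
  # Random reordering removes potential order effects (e.g. combined lanes).
  if (resample) starts <- starts[sample.int(n)]
  sizes <- round(props * n)
  # Each subset is a prefix of the (possibly shuffled) reads, so every
  # subset contains all smaller ones.
  lapply(sizes, function(k) starts[seq_len(k)])
}

set.seed(1)
starts <- sample.int(10000, 50)   # made-up read start positions
subsets <- cumulative_subsets(starts, props = c(0.2, 0.5, 1))
subset_sizes <- lengths(subsets)
```

Because each subset is a prefix of the same ordering, any summary computed on successive subsets shows how that summary changes with sample size.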

Usage

subsetSummary(x, chr, nstep, props = seq(0.1, 1, 0.1),
              chromlens = seqlengths(x), fg.cutoff = 6, seqLen = 200,
              fdr.cutoff = 0.001, use.fdr = FALSE, resample = TRUE,
              islands = TRUE, verbose = getOption("verbose"))

Arguments

`x`	A `"GRanges"` object representing alignment locations at the sample level.
`chr`	The chromosome for which the summaries are to be obtained. Must specify a valid element of `x`.
`nstep`	The number of maps in each increment for the full dataset (not per-chromosome). This will be translated to a per-chromosome number proportionally.
`props`	Alternatively, an increasing sequence of proportions determining the size of each subset. Overrides `nstep`.
`chromlens`	A named vector of per-chromosome lengths, typically the result of `seqlengths`.
`fg.cutoff`	The coverage depth above which a region would be considered foreground.
`seqLen`	The number of bases to which to extend each read before computing coverage.
`resample`	Logical; whether to randomly reorder the reads before dividing them up into subsets. Useful to remove potential order effects (for example, if data from two lanes were combined to produce `x`).
`fdr.cutoff`	The maximum false discovery rate for a region that is considered to be foreground.
`use.fdr`	Logical; whether to use the FDR-detected peaks when calling foreground and background.
`islands`	Logical. If `TRUE`, the whole island would be considered foreground if the maximum depth equals or exceeds `fg.cutoff`. If `FALSE`, only the region above the cutoff would be considered foreground.
`verbose`	Logical; controls whether progress information is shown during the computation (which is potentially long-running).
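How `seqLen`, `fg.cutoff` and `islands` interact can be sketched in base R. This is a rough illustration under stated assumptions, not the code of the defunct function: the helpers `read_coverage` and `foreground_bases` are hypothetical, all reads are treated as forward-strand, an "island" is taken to be a maximal run of nonzero coverage, and a depth equal to `fg.cutoff` counts as foreground in both modes.

```r
# Hypothetical helper: per-base coverage after extending each read to
# `seqLen` bases (forward strand only; strand handling is omitted).
read_coverage <- function(starts, seqLen, chromlen) {
  ends <- pmin(starts + seqLen - 1, chromlen)
  # Difference array: +1 at each read start, -1 just past each read end.
  delta <- tabulate(starts, nbins = chromlen + 1) -
    tabulate(ends + 1, nbins = chromlen + 1)
  cumsum(delta)[seq_len(chromlen)]
}

# Hypothetical helper: number of bases called foreground.
foreground_bases <- function(cov, fg.cutoff, islands = TRUE) {
  # islands = FALSE: count only the bases that reach the cutoff.
  if (!islands) return(sum(cov >= fg.cutoff))
  # islands = TRUE: count whole islands whose maximum depth reaches the cutoff.
  r <- rle(cov > 0)
  run_end <- cumsum(r$lengths)
  run_start <- run_end - r$lengths + 1
  reaches <- mapply(function(s, e) max(cov[s:e]) >= fg.cutoff,
                    run_start, run_end)
  sum(r$lengths[r$values & reaches])
}

starts <- c(1, 2, 3, 20)   # made-up read start positions
cov <- read_coverage(starts, seqLen = 5, chromlen = 30)
fg_region <- foreground_bases(cov, fg.cutoff = 3, islands = FALSE)
fg_island <- foreground_bases(cov, fg.cutoff = 3, islands = TRUE)
```

In this toy input the first three reads overlap into one island and the fourth forms a shallow island of its own, so the two settings of `islands` give different foreground sizes.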

Value

A data frame with various per-subset summaries.

Note

This function should be considered preliminary, in that it might change significantly or simply be removed in a subsequent version. If you like it the way it is, please notify the maintainer.

Author(s)

Deepayan Sarkar, Michael Lawrence