Last data update: 2014.03.03

R: Segment a vector into positive, zero, and negative regions
getSegmentsR Documentation

Segment a vector into positive, zero, and negative regions

Description

Given two cutoffs, L and U, this function divides a numerical vector into contiguous parts that are above U, between L and U, and below L.

Usage

getSegments(x, f = NULL, cutoff = quantile(abs(x), 0.99),
            assumeSorted = FALSE, verbose = FALSE)

Arguments

x

A numeric vector.

f

A factor used to pre-divide x into pieces. Each piece is then segmented based on the cutoff. Setting this to NULL says that there is no pre-division. Often, clusterMaker is used to define this factor.

cutoff

a numeric vector of length either 1 or 2. If length is 1, U (see details) will be cutoff and L will be -cutoff. Otherwise it specifies L and U. The function will furthermore always use the minimum of cutoff for L and the maximum for U.

assumeSorted

This is a statement that the function may assume that the vector f is sorted. Allowing the function to make this assumption may increase the speed of the function slightly.

verbose

Should the function be verbose?

Details

This function is used to find the indexes of the ‘bumps’ in functions such as bumphunter.

x is a numeric vector, which is converted into three levels depending on whether x>=U (‘up’), L<x<U (‘zero’) or x<=L (‘down’), with L and U coming from cutoff. We assume that adjacent entries in x are next to each other in some sense. Segments, consisting of consecutive indices into x (ie. values between 1 and length(x)), are formed such that all indices in the same segment belong to the same level of f and have the same discretized value of x.

In other words, we can use getSegments to find runs of x belonging to the same level of f and with all of the values of x either above U, between L and U, or below L.

Value

A list with three components, each a list of indices. Each component of these lists represents a segment and this segment is represented by a vector of indices into the original vectors x and f.

upIndex:

a list with each entry an index of contiguous values in the same segment. These segments have values of x above U.

dnIndex:

a list with each entry an index of contiguous values in the same segment. These segments have values of x below L.

zeroIndex:

a list with each entry an index of contiguous values in the same segment. These segments have values of x between L and U.

Author(s)

Rafael A Irizarry and Kasper Daniel Hansen

See Also

clusterMaker

Examples

x <- 1:100
y <- sin(8*pi*x/100)
chr <- rep(1, length(x))
indexes <- getSegments(y, chr, cutoff=0.8)
plot(x, y, type="n")
for(i in 1:3){
   ind <- indexes[[i]]
   for(j in seq(along=ind)) {
      k <- ind[[j]]
      text(x[k], y[k], j, col=i)
   }
}
abline(h=c(-0.8,0.8))

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(bumphunter)
Loading required package: S4Vectors
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit


Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: foreach
Loading required package: iterators
Loading required package: locfit
locfit 1.5-9.1 	 2013-03-22
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/bumphunter/getSegments.Rd_%03d_medium.png", width=480, height=480)
> ### Name: getSegments
> ### Title: Segment a vector into positive, zero, and negative regions
> ### Aliases: getSegments
> 
> ### ** Examples
> 
> x <- 1:100
> y <- sin(8*pi*x/100)
> chr <- rep(1, length(x))
> indexes <- getSegments(y, chr, cutoff=0.8)
> plot(x, y, type="n")
> for(i in 1:3){
+    ind <- indexes[[i]]
+    for(j in seq(along=ind)) {
+       k <- ind[[j]]
+       text(x[k], y[k], j, col=i)
+    }
+ }
> abline(h=c(-0.8,0.8))
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>