Last data update: 2014.03.03

R: Fast mean calculations in non-overlapping bins

binMeans

R Documentation

Fast mean calculations in non-overlapping bins

Description

Computes the sample means in non-overlapping bins.

Usage

binMeans(y, x, idxs=NULL, bx, na.rm=TRUE, count=TRUE, right=FALSE, ...)

Arguments

`y`	A `numeric` `vector` of K values to calculate means on.
`x`	A `numeric` `vector` of K positions to be binned.
`idxs`	A `vector` indicating subset of elements to operate over. If `NULL`, no subsetting is done.
`bx`	A `numeric` `vector` of B+1 ordered positions specifying the B > 0 bins `[bx[1],bx[2])`, `[bx[2],bx[3])`, ..., `[bx[B],bx[B+1])`.
`na.rm`	If `TRUE`, missing values in `y` are dropped before calculating the mean, otherwise not.
`count`	If `TRUE`, the number of data points in each bin is returned as attribute `count`, which is an `integer` `vector` of length B.
`right`	If `TRUE`, the bins are right-closed (left open), otherwise left-closed (right open).
`...`	Not used.
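
To make the bin convention concrete, the following base-R sketch uses `cut()` (not `binMeans()` itself) to show which bin each position falls into under both settings of `right`. The values of `bx` and `x` are made up for illustration.

```r
# Three bins defined by B+1 = 4 ordered boundaries (illustrative values)
bx <- c(0, 10, 20, 30)
x  <- c(0, 5, 10, 20, 30)

# right = FALSE: bins are [0,10), [10,20), [20,30)
left_closed <- cut(x, breaks = bx, right = FALSE, labels = FALSE)

# right = TRUE: bins are (0,10], (10,20], (20,30]
right_closed <- cut(x, breaks = bx, right = TRUE, labels = FALSE)

# One column per position; NA means cut() placed it in no bin
print(rbind(x, left_closed, right_closed))
```

A position that sits exactly on a boundary changes bin when `right` is flipped, and the outermost boundary on the open side belongs to no bin at all.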

Details

`binMeans(y, x, bx=bx, right=TRUE)` gives results equivalent to `rev(binMeans(y, -x, bx=sort(-bx), right=FALSE))`, but is faster.
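
The mirroring argument behind this equivalence can be checked with a small base-R stand-in. The helper `bin_means_ref()` below is hypothetical and written only for this illustration; it is not the package's implementation and makes no attempt to be fast.

```r
# Hypothetical reference helper: bin means via cut() and vapply()
bin_means_ref <- function(y, x, bx, right = FALSE) {
  bin <- cut(x, breaks = bx, right = right, labels = FALSE)
  vapply(seq_len(length(bx) - 1L),
         function(b) mean(y[which(bin == b)]),
         numeric(1))
}

set.seed(1)
x  <- 1:20
y  <- rnorm(20)
bx <- c(0, 5, 10, 15, 20)

# Right-closed bins computed directly ...
direct <- bin_means_ref(y, x, bx = bx, right = TRUE)

# ... and via negated positions, negated boundaries, and left-closed bins
mirrored <- rev(bin_means_ref(y, -x, bx = sort(-bx), right = FALSE))

stopifnot(isTRUE(all.equal(direct, mirrored)))
```

Negating the positions turns each right-closed bin `(a, b]` into the left-closed bin `[-b, -a)`, and `rev()` restores the original bin order.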

Value

Returns a numeric vector of length B.

Missing and non-finite values

Data points where either y or x is missing are dropped (and are therefore not counted). Non-finite values in y are not allowed and give an error. Missing values in bx are not allowed and give an error.

Empty bins

Empty bins will get value NaN.
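
The handling of missing pairs and empty bins described above can be mimicked in base R. This is a sketch of the documented semantics using `cut()`, `tabulate()` and `mean()`, with made-up data; it does not call `binMeans()`.

```r
y  <- c(1, 2, NA, 4, 5)
x  <- c(1, 2, 3, NA, 25)
bx <- c(0, 10, 20, 30)   # bins [0,10), [10,20), [20,30)

# Keep only the pairs where both y and x are present
ok  <- !is.na(y) & !is.na(x)
bin <- cut(x[ok], breaks = bx, right = FALSE, labels = FALSE)

# Number of retained data points per bin
counts <- tabulate(bin, nbins = length(bx) - 1L)

# Mean per bin; the mean of zero values is NaN
means <- vapply(seq_along(counts),
                function(b) mean(y[ok][bin == b]),
                numeric(1))

print(counts)
print(means)
```

Here the middle bin receives no data points, so its count is zero and its mean is `NaN`, while the two pairs containing a missing value contribute to neither the means nor the counts.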

Author(s)

Henrik Bengtsson with initial code contributions by Martin Morgan [1].

References

[1] R-devel thread "Fastest non-overlapping binning mean function out there?", Oct 3, 2012.

Examples

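# binMeans() is provided by the matrixStats package
library(matrixStats)

# Simulate a piecewise-constant signal with Gaussian noise
x <- 1:200
mu <- double(length(x))
mu[1:50] <- 5
mu[101:150] <- -5
y <- mu + rnorm(length(x))

# Binning: four bins of 50 positions each
bx <- c(0,50,100,150,200)+0.5
yS <- binMeans(y, x=x, bx=bx)

# Plot the data and overlay each bin mean as a horizontal segment
plot(x,y)
for (kk in seq_along(yS)) {
  lines(bx[c(kk,kk+1)], yS[c(kk,kk)], col="blue", lwd=2)
}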