Last data update: 2014.03.03

density1d    R Documentation

Find most likely separation between positive and negative populations in 1D

Description

The function tries to find a reasonable split point between the two hypothetical cell populations "positive" and "negative". This function is considered internal; please use the API provided by rangeGate instead.

Usage

density1d(x, stain, alpha = "min", sd = 2, plot = FALSE, borderQuant = 0.1,
          absolute = TRUE, inBetween = FALSE, refLine = NULL, rare = FALSE,
          bwFac = 1.2, sig = NULL, peakNr = NULL, ...)

Arguments

x

A flowSet or flowFrame.

stain

A character scalar giving the flow parameter for which to compute the separation.

alpha

A tuning parameter that controls the location of the split point between the two populations. This has to be a numeric in the range [0,1], where values closer to 0 shift the split point towards the negative population and values closer to 1 shift it towards the positive population. Alternatively, alpha can be "min", in which case the split point is placed at the point of lowest local density between the two populations.
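As a rough illustration of how a numeric alpha could position the split point, the sketch below interpolates linearly between two population locations. The helper name split_between and the linear interpolation itself are assumptions made for illustration; they are not taken from the density1d source.

```r
# Hypothetical helper (not part of flowStats): place a split point between
# a negative and a positive population by linear interpolation.
split_between <- function(neg, pos, alpha) {
  stopifnot(is.numeric(alpha), length(alpha) == 1, alpha >= 0, alpha <= 1)
  (1 - alpha) * neg + alpha * pos
}

split_between(neg = 2, pos = 6, alpha = 0.2)  # nearer the negative population
split_between(neg = 2, pos = 6, alpha = 0.8)  # nearer the positive population
```

Under this reading, alpha = 0.5 would put the separator halfway between the two populations.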

sd

For the case where there is only a single population, the algorithm falls back to estimating the mode of this population and a robust measure of the variance of its distribution. The sd tuning parameter controls how far away from the mode the split point is set.

plot

Create a plot of the results of the computation.

borderQuant

Usually the instrument is set up in a way that the positive population is somewhere on the high end of the measurement range and the negative population is on the low end. This parameter makes it possible to disregard populations with mean values in the extreme quantiles of the data range. Its value should be in the range [0,1].
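One plausible reading of this parameter is sketched below: peaks whose mean lies within a fraction borderQuant of either end of the range are dropped. The peak locations, the range of [0, 1024], and the exact cut-off rule are all assumptions for illustration and may differ from what the package actually does.

```r
# Hypothetical peak means on an assumed measurement range of [0, 1024]
range_min <- 0
range_max <- 1024
peak_means <- c(15, 300, 700, 1010)
borderQuant <- 0.1

# Border zones: the lowest and highest 10% of the range (assumed rule)
lower <- range_min + borderQuant * (range_max - range_min)
upper <- range_max - borderQuant * (range_max - range_min)

# Keep only peaks that fall strictly inside the border zones
kept_peaks <- peak_means[peak_means > lower & peak_means < upper]
kept_peaks
```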

absolute

Logical controlling whether to classify a population (positive or negative) relative to the theoretical measurement range of the instrument or the actual range of the data. This can be set to TRUE if the alignment of the measurement range is not optimal and the bulk of the data is on one end of the theoretical range.

inBetween

Force the algorithm to put the separator in between two peaks. If there are more than two peaks, this argument is ignored.

refLine

Either NULL or a numeric of length 1. If NULL, this parameter is ignored. When it is set to a numeric, the minor sub-population (if any) below this reference line will be ignored while determining the separator between positive and negative.

rare

Either TRUE or FALSE. If TRUE, the algorithm assumes that there is one major peak and that the rare positive population is to the right of it, and uses a robust estimate of mean and variance to gate the positive cells.

bwFac

The bandwidth factor used for smoothing the density estimate. User-tunable.

sig

One of NULL, "L" or "R". When sig is not NULL, only the left ("L") or right ("R") half of the signal is used to estimate the standard deviation and mean.

peakNr

When peakNr is not NULL, the less significant peaks are dropped, ranked by their heights.
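The idea of dropping less significant peaks by height can be sketched as follows. This assumes that peakNr is the number of peaks to keep, which the text above does not state explicitly; the peak table is made up for illustration.

```r
# Hypothetical peak table: location and height of each detected peak
peaks <- data.frame(location = c(1.2, 2.9, 4.8),
                    height   = c(0.40, 0.03, 0.22))
peakNr <- 2  # assumed meaning: keep at most this many peaks

# Rank peaks by height and keep the peakNr tallest ones
n_keep <- min(peakNr, nrow(peaks))
kept <- peaks[order(peaks$height, decreasing = TRUE)[seq_len(n_keep)], ]

# Restore left-to-right order along the measurement axis
kept <- kept[order(kept$location), ]
kept
```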

...

Further arguments.

Details

The algorithm first tries to identify high density regions in the data. If the input is a flowSet, density regions will be computed on the collapsed data, hence it should have been normalized before (see warpSet for one possible normalization technique). The high density regions are then classified as positive and negative populations, based on their mean value in the theoretical (or absolute if argument absolute=TRUE) measurement range. In case there are only two high-density regions, the lower one is usually classified as the negative population; however, the heuristics in the algorithm will force the classification towards a positive population if the mean value is already very high. The absolute and borderQuant arguments can be used to control this behaviour. The split point between populations will be drawn at the value of minimum local density between the two populations, or, if the alpha argument is used, somewhere between the two populations where the value of alpha forces the point to be closer to the negative (0 - 0.5) or closer to the positive population (0.5 - 1).
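The minimum-density rule for two populations can be illustrated with base R alone. The sketch below uses simulated data and stats::density rather than the peak detection used inside flowStats, so it is an approximation of the idea and not the package's implementation.

```r
set.seed(1)
# Simulated 1D signal: a "negative" population near 1, a "positive" one near 5
x <- c(rnorm(2000, mean = 1, sd = 0.5), rnorm(1000, mean = 5, sd = 0.5))

dens <- density(x)  # kernel density estimate on a regular grid

# Local maxima: grid points where the slope changes from positive to negative
peaks <- which(diff(sign(diff(dens$y))) == -2) + 1

# Take the two tallest peaks as the two populations, ordered left to right
top2 <- sort(peaks[order(dens$y[peaks], decreasing = TRUE)][1:2])

# Split point: grid location of the lowest density between the two peaks
valley <- top2[1]:top2[2]
split_point <- dens$x[valley][which.min(dens$y[valley])]
split_point
```

Here the lower peak plays the role of the negative population and the split falls in the sparse region between the two modes.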

If there is only a single high-density region, the algorithm will fall back to estimating the mode of the distribution (hubers) and a robust measure of its variance and, in combination with the sd argument, set the split point somewhere in the right or left tail, depending on the classification of the region.
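A minimal sketch of this fallback, assuming the split is placed a multiple of the robust scale away from the robust location. It uses hubers() from the MASS package on simulated data; the variable names and the exact formula are assumptions, not the density1d source.

```r
library(MASS)  # provides hubers(), a Huber M-estimator of location and scale

set.seed(2)
# Simulated single population with a handful of high outliers
x <- c(rnorm(2000, mean = 3, sd = 0.4), runif(20, min = 6, max = 8))

est <- hubers(x)   # list with robust location (mu) and robust scale (s)
sd_factor <- 2     # plays the role of the sd argument

# Region classified as negative: cut in the right tail.
# Region classified as positive: cut in the left tail.
split_if_negative <- est$mu + sd_factor * est$s
split_if_positive <- est$mu - sd_factor * est$s
c(negative = split_if_negative, positive = split_if_positive)
```

A larger sd_factor moves the separator further into the tail, away from the bulk of the single population.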

For more than two populations, the algorithm will still classify each population as positive or negative and compute the split point between those clusters, similar to the two-population case.

Value

A numeric indicating the split point between positive and negative populations.

Author(s)

Florian Hahne

See Also

warpSet, rangeGate

Examples


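library(flowStats)  # also attaches flowCore, which provides the GvHD data set

data(GvHD)
dat <- GvHD[pData(GvHD)$Patient == 10]
dat <- transform(dat, "FL4-H" = asinh(`FL4-H`), "FL3-H" = asinh(`FL3-H`))

## density1d is internal, hence the ::: accessor
d <- flowStats:::density1d(dat, "FL4-H", plot = TRUE)
if (require(flowViz))
    densityplot(~`FL4-H`, dat, refline = d)

## tweaking the location
flowStats:::density1d(dat, "FL4-H", plot = TRUE, alpha = 0.8)

## only a single population
flowStats:::density1d(dat, "FL3-H", plot = TRUE)
## sd = 2 is the default, so this call is equivalent to the previous one
flowStats:::density1d(dat, "FL3-H", plot = TRUE, sd = 2)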

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(flowStats)
Loading required package: flowCore
Loading required package: fda
Loading required package: splines
Loading required package: Matrix

Attaching package: 'Matrix'

The following object is masked from 'package:flowCore':

    %&%


Attaching package: 'fda'

The following object is masked from 'package:graphics':

    matplot

Loading required package: mvoutlier
Loading required package: sgeostat
sROC 0.1-2 loaded
Loading required package: cluster
Loading required package: flowWorkspace
Loading required package: flowViz
Loading required package: lattice
Loading required package: ncdfFlow
Loading required package: RcppArmadillo
Loading required package: BH
Loading required package: gridExtra
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/flowStats/density1d.Rd_%03d_medium.png", width=480, height=480)
> ### Name: density1d
> ### Title: Find most likely separation between positive and negative
> ###   populations in 1D
> ### Aliases: density1d
> 
> ### ** Examples
> 
> 
> data(GvHD)
> dat <- GvHD[pData(GvHD)$Patient==10]
> dat <- transform(dat, "FL4-H"=asinh(`FL4-H`), "FL3-H"=asinh(`FL3-H`))
> d <- flowStats:::density1d(dat, "FL4-H", plot=TRUE)
> if(require(flowViz))
+ densityplot(~`FL4-H`, dat, refline=d)
> 
> ## tweaking the location
> flowStats:::density1d(dat, "FL4-H", plot=TRUE, alpha=0.8)
[1] 3.705714
> 
> ## only a single population
> flowStats:::density1d(dat, "FL3-H", plot=TRUE)
[1] 4.53325
> flowStats:::density1d(dat, "FL3-H", plot=TRUE, sd=2)
[1] 4.53325
> 
> dev.off()
null device 
          1 
>