Last data update: 2014.03.03

LSCV.density    R Documentation

Leave-one-out least-squares cross-validation (LSCV) for bivariate KDE bandwidths

Description

Provides an isotropic LSCV bandwidth estimate for use in 2-dimensional kernel density estimation (see e.g. Bowman and Azzalini, 1997).

Usage

LSCV.density(data, hlim = NULL, res = 128, edge = TRUE,
	 WIN = NULL, quick = TRUE, comment = TRUE)

Arguments

data

An object of type data.frame, list, matrix, or ppp describing the observed data from which we wish to calculate the LSCV bandwidth. See ‘Details’ for further information.

hlim

A numeric vector of length 2 giving the interval over which to search for the bandwidth that minimises the selection criterion. If NULL (default), the function attempts to automatically select an appropriate range based on multiples of Stoyan and Stoyan's (1994) rule-of-thumb. The user is strongly recommended to supply their own hlim.

res

Single integer giving the square grid resolution over which evaluation of the selection criterion takes place. Defaults to a 128 by 128 grid.

edge

Boolean. Whether or not to employ edge-correction in the calculations. Defaults to TRUE.

WIN

A polygonal owin object giving the study region. Ignored if data is already an object of class ppp (see ppp.object).

quick

Intended for advanced use; users are recommended not to change the default TRUE. Setting quick = FALSE forces the function to individually evaluate the CV objective function at each of seq(hlim[1], hlim[2], length = 50) bandwidths, returning the corresponding values. Can be useful for diagnostic purposes.

comment

Boolean. Whether or not to print function progress during execution. Defaults to TRUE.

Details

This function calculates an LSCV smoothing bandwidth for kernel density estimates of 2-dimensional (bivariate) data. If data is a data.frame or a matrix, it must have exactly two columns containing the x ([,1]) and y ([,2]) values. If data is a list, it must have two vector components of equal length named x and y. Alternatively, data may be an object of class ppp (see ppp.object).
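The accepted non-ppp layouts can be illustrated with a minimal base R sketch. The coordinates are simulated stand-ins, and has_valid_layout is a hypothetical helper written only for this illustration; it is not part of the package and does not reproduce the function's own input checks.

```r
# Simulated coordinates standing in for real observations (assumption).
set.seed(42)
x <- runif(20)
y <- runif(20)

as_df     <- data.frame(x = x, y = y)   # two columns: x first, y second
as_matrix <- cbind(x, y)                # same layout as a matrix
as_list   <- list(x = x, y = y)         # components named x and y

# Hypothetical helper (not part of the package): checks the layout rules
# described above for the non-ppp input types.
has_valid_layout <- function(data) {
  if (is.data.frame(data) || is.matrix(data)) {
    return(ncol(data) == 2)
  }
  if (is.list(data)) {
    return(all(c("x", "y") %in% names(data)) &&
             length(data$x) == length(data$y))
  }
  FALSE
}
```

Each of the three objects follows the layout described for the data argument; a list whose x and y components differ in length, or a matrix with a third column, does not.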

Value

A single numeric value giving the estimated bandwidth. If quick = FALSE, the bandwidth is returned under the name hopt, together with the objective function values (lscv) and the index of the minimum value (ind). The user may need to experiment with adjusting hlim to find a suitable minimum.
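To show how the hopt, lscv and ind quantities relate when the criterion is evaluated over a grid of 50 bandwidths, here is a simplified sketch. It assumes an isotropic bivariate Gaussian kernel and ignores edge correction, so the criterion has a closed form. This is not the package's grid-based implementation: lscv_gauss, the simulated data and the chosen hlim are all illustrative assumptions.

```r
# Closed-form LSCV criterion for an isotropic bivariate Gaussian kernel,
# ignoring edge effects (an assumption; not the package's method).
lscv_gauss <- function(h, xy) {
  n  <- nrow(xy)
  d2 <- as.matrix(dist(xy))^2            # squared pairwise distances
  # Integral of the squared density estimate: the Gaussian kernel
  # convolved with itself has variance 2 * h^2 per coordinate.
  term1 <- sum(exp(-d2 / (4 * h^2))) / (4 * pi * h^2 * n^2)
  # Leave-one-out term: kernel values between distinct points only.
  k <- exp(-d2 / (2 * h^2)) / (2 * pi * h^2)
  diag(k) <- 0
  term2 <- 2 * sum(k) / (n * (n - 1))
  term1 - term2
}

set.seed(1)
xy   <- cbind(rnorm(100), rnorm(100))    # simulated stand-in data
hlim <- c(0.05, 2)                       # illustrative search interval

hgrid <- seq(hlim[1], hlim[2], length.out = 50)
lscv  <- sapply(hgrid, lscv_gauss, xy = xy)  # objective at each bandwidth
ind   <- which.min(lscv)                     # index of the minimum
hopt  <- hgrid[ind]                          # bandwidth at the minimum
```

If ind falls at 1 or 50, the minimum lies on the boundary of the search interval, which is the kind of situation where adjusting hlim is worthwhile. Plotting lscv against hgrid is a quick way to check this.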

Warning

Leave-one-out LSCV for bandwidth selection in kernel density estimation is notoriously unstable in practice and has a tendency to produce rather small bandwidths. Satisfactory bandwidths are not guaranteed for every application. This method can also be computationally expensive for large data sets and fine evaluation grid resolutions.

Author(s)

T.M. Davies

References

Bowman, A.W. and Azzalini, A. (1997), Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations. Oxford University Press Inc., New York. ISBN 0-19-852396-3.

Stoyan, D. and Stoyan, H. (1994), Fractals, Random Shapes and Point Fields. Wiley, Great Britain. ISBN 0-471-93757-6.

See Also

spatstat's function bw.relrisk

Examples

## Not run: 
data(PBC)

## PBC cases
LSCV.density(split(PBC)[[1]], hlim = c(10, 400))

## PBC controls
LSCV.density(split(PBC)[[2]], hlim = c(10, 400))

## End(Not run)
