Last data update: 2014.03.03

R: The sparr Package: SPAtial Relative Risk
sparr-package    R Documentation

The sparr Package: SPAtial Relative Risk

Description

Provides functions to estimate fixed and adaptive kernel-smoothed relative risk surfaces via the density-ratio method and perform subsequent inference.

Details

Package: sparr
Version: 0.3-8
Date: 2016-03-11
License: GPL (>= 2)

Kernel smoothing, and the flexibility afforded by this methodology, provides an attractive approach to estimating complex probability density functions. This is of particular interest in geographical epidemiology, the study of disease dispersion throughout some spatial region, given a population. The so-called ‘relative risk surface’, constructed as a ratio of estimated case to control densities (Bithell, 1990; 1991), describes the variation in the ‘risk’ of the disease, given the underlying at-risk population. The technique has been applied successfully, mainly for exploratory purposes, in a number of studies (see, for example, Sabel et al., 2000; Prince et al., 2001; Wheeler, 2007).
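The density-ratio idea can be sketched in a few lines of base R. This is a minimal illustration only, not the code used by sparr: the helper kde2, the simulated data and the hand-picked bandwidth h are all assumptions, and no edge correction is applied.

```r
# Minimal base-R sketch of the density-ratio idea (not sparr's own code).
set.seed(1)

# Fixed-bandwidth bivariate Gaussian KDE evaluated at the rows of `grid`.
# `kde2` is a hypothetical helper written for this example.
kde2 <- function(pts, grid, h) {
  apply(grid, 1, function(g) {
    d2 <- (pts[, 1] - g[1])^2 + (pts[, 2] - g[2])^2
    mean(exp(-d2 / (2 * h^2)) / (2 * pi * h^2))
  })
}

# Simulated data: controls follow the background population,
# cases follow the same background plus an extra cluster near (1.5, 1.5).
controls <- cbind(rnorm(300, 0, 1), rnorm(300, 0, 1))
cases <- rbind(cbind(rnorm(100, 0, 1), rnorm(100, 0, 1)),
               cbind(rnorm(50, 1.5, 0.3), rnorm(50, 1.5, 0.3)))

# Evaluation grid over the (assumed) study region.
gx <- seq(-2, 2, length.out = 25)
gy <- seq(-2, 2, length.out = 25)
grid <- as.matrix(expand.grid(x = gx, y = gy))

h <- 0.5  # common case/control bandwidth, chosen by hand for illustration
f_hat <- kde2(cases, grid, h)     # estimated case density
g_hat <- kde2(controls, grid, h)  # estimated control density

# Log relative risk surface: rows index gx, columns index gy.
log_rr <- matrix(log(f_hat) - log(g_hat), nrow = length(gx))
```

Values of log_rr well above zero indicate locations where cases are over-represented relative to controls; here they concentrate around the simulated cluster.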

This package provides functions for bivariate kernel density estimation (KDE), implementing both fixed and ‘variable’ or ‘adaptive’ (Abramson, 1982) smoothing parameter options (see the function documentation for more information). A selection of bandwidth calculators for bivariate KDE and the relative risk function are provided, including one based on the maximal smoothing principle (Terrell, 1990), and others involving a leave-one-out least-squares cross-validation (see below). In addition, the ability to construct asymptotically derived p-value surfaces (‘tolerance’ contours of which signal statistically significant sub-regions of ‘extremity’ in a risk surface - Hazelton and Davies, 2009; Davies and Hazelton, 2010), as well as some flexible visualisation tools, are provided.
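As a rough illustration of adaptive smoothing, the sketch below computes observation-specific bandwidths that follow the square-root law of Abramson (1982): each bandwidth is proportional to the inverse square root of a pilot density at that observation. The rescaling by the geometric mean is a common convention assumed here, and the data, the pilot bandwidth h_pilot and the global bandwidth h0 are made up for the example; sparr's own implementation may differ in detail.

```r
# Sketch of Abramson-type adaptive bandwidths in base R (assumptions noted inline).
set.seed(2)
pts <- cbind(rnorm(200), rnorm(200))  # simulated observations

# Fixed-bandwidth Gaussian pilot density evaluated at the data points themselves.
# `pilot_at_data` is a hypothetical helper written for this example.
pilot_at_data <- function(pts, h) {
  d2 <- as.matrix(dist(pts))^2
  rowMeans(exp(-d2 / (2 * h^2)) / (2 * pi * h^2))
}

h_pilot <- 0.6  # pilot bandwidth (assumed value)
h0 <- 0.5       # global bandwidth (assumed value)

f_tilde <- pilot_at_data(pts, h_pilot)

# Geometric mean of f_tilde^(-1/2); dividing by it keeps the bandwidths
# on the same overall scale as h0.
gamma <- exp(mean(log(f_tilde^(-1 / 2))))

# One bandwidth per observation: larger where the pilot density is low.
h_i <- h0 * f_tilde^(-1 / 2) / gamma
```

With this scaling the geometric mean of h_i equals h0, so h0 keeps its interpretation as an overall amount of smoothing.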

The content of sparr can be broken up as follows:

Datasets
PBC - a case/control planar point pattern (ppp) concerning liver disease in northern England. Also available is the case/control dataset chorley of the spatstat package, which concerns the distribution of laryngeal cancer in an area of Lancashire, England.

Bandwidth calculators
OS - estimation of an isotropic smoothing parameter for bivariate KDE, based on the oversmoothing principle introduced by Terrell (1990).
NS - estimation of an isotropic smoothing parameter for bivariate KDE, based on the optimal value for a normal density (bivariate normal scale rule - see e.g. Wand and Jones, 1995).
LSCV.density - a least-squares cross-validated (LSCV) estimate of an isotropic bandwidth for bivariate KDE (see e.g. Bowman and Azzalini, 1997).
LSCV.risk - a least-squares cross-validated (LSCV) estimate of a jointly optimal, common isotropic case-control bandwidth for the kernel-smoothed risk function (see Kelsall and Diggle, 1995a;b and Hazelton, 2008).
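To give a feel for two of these selectors, the following base-R sketch computes a normal scale bandwidth and an LSCV bandwidth for a bivariate Gaussian kernel. It is an illustration under stated assumptions rather than the code behind NS or LSCV.density: the scale estimate sigma is taken as the mean of the two marginal standard deviations, no edge correction is used, and the search interval passed to optimize is arbitrary.

```r
# Sketch of normal scale and LSCV bandwidth selection (Gaussian kernel, d = 2).
set.seed(3)
pts <- cbind(rnorm(150, 0, 1), rnorm(150, 0, 1))  # simulated data
n <- nrow(pts)

# Normal scale rule: for d = 2 and a Gaussian kernel, h = sigma * n^(-1/6).
# Using the mean marginal standard deviation for sigma is an assumption.
sigma <- mean(apply(pts, 2, sd))
h_ns <- sigma * n^(-1 / 6)

# LSCV criterion: integral of fhat^2 minus twice the mean leave-one-out density.
d2 <- as.matrix(dist(pts))^2  # squared pairwise distances
lscv <- function(h) {
  # Closed form of the integrated squared estimate for Gaussian kernels.
  int_f2 <- sum(exp(-d2 / (4 * h^2))) / (4 * pi * h^2 * n^2)
  # Leave-one-out term: kernel contributions with the diagonal removed.
  off <- exp(-d2 / (2 * h^2)) / (2 * pi * h^2)
  diag(off) <- 0
  loo <- sum(off) / (n * (n - 1))
  int_f2 - 2 * loo
}

# Numerical minimisation over an arbitrary search interval.
h_lscv <- optimize(lscv, interval = c(0.05, 2))$minimum
```

LSCV bandwidths are known to be variable from sample to sample, so comparing h_lscv against h_ns is a sensible sanity check before using either.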

Bivariate functions
KBivN - bivariate normal (Gaussian) kernel
KBivQ - bivariate quartic (biweight) kernel
bivariate.density - kernel density estimate of bivariate data; fixed or adaptive smoothing
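For reference, the two kernel shapes named above can be written down directly. The functions k_gauss and k_quartic below are hypothetical stand-ins defined for this example; they use the standard unscaled forms, and the scaling conventions of KBivN and KBivQ themselves may differ.

```r
# Standard forms of the bivariate Gaussian and quartic (biweight) kernels.
# Both take an n x 2 matrix `u` and return one kernel value per row.

# Bivariate standard Gaussian kernel: exp(-|u|^2 / 2) / (2 * pi).
k_gauss <- function(u) exp(-rowSums(u^2) / 2) / (2 * pi)

# Bivariate quartic kernel on the unit disc: (3 / pi) * (1 - |u|^2)^2.
k_quartic <- function(u) {
  r2 <- rowSums(u^2)
  ifelse(r2 <= 1, (3 / pi) * (1 - r2)^2, 0)
}

# Riemann-sum check that each kernel integrates to (approximately) one.
s <- seq(-6, 6, by = 0.02)
u <- as.matrix(expand.grid(s, s))
cell <- 0.02^2
int_gauss <- sum(k_gauss(u)) * cell
int_quartic <- sum(k_quartic(u)) * cell
```

The quartic kernel has compact support, so only observations within one bandwidth of a location contribute to the estimate there, whereas the Gaussian kernel gives every observation some weight.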

Relative risk and p-value surfaces
risk - estimation of a (log) relative risk function
tolerance - calculation of asymptotic p-value surface
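The idea behind an asymptotic p-value surface can be sketched for the fixed-bandwidth case. The code below is a delta-method approximation written for illustration and is not claimed to reproduce tolerance: it assumes a Gaussian kernel, a common bandwidth, no edge correction, and a pooled density estimate in the variance under the null hypothesis of equal case and control densities. The helper kde2 and the simulated data are assumptions.

```r
# Sketch of an upper-tailed asymptotic p-value surface for the log relative risk.
set.seed(4)

# Hypothetical fixed-bandwidth Gaussian KDE helper (rows of `grid` are locations).
kde2 <- function(pts, grid, h) {
  apply(grid, 1, function(g) {
    d2 <- (pts[, 1] - g[1])^2 + (pts[, 2] - g[2])^2
    mean(exp(-d2 / (2 * h^2)) / (2 * pi * h^2))
  })
}

# Simulated data with an excess of cases near (1.5, 1.5).
controls <- cbind(rnorm(300), rnorm(300))
cases <- rbind(cbind(rnorm(100), rnorm(100)),
               cbind(rnorm(50, 1.5, 0.3), rnorm(50, 1.5, 0.3)))
n1 <- nrow(cases)
n2 <- nrow(controls)

gx <- seq(-2, 2, length.out = 25)
gy <- seq(-2, 2, length.out = 25)
grid <- as.matrix(expand.grid(x = gx, y = gy))
h <- 0.5  # common bandwidth (assumed value)

rho_hat <- log(kde2(cases, grid, h)) - log(kde2(controls, grid, h))

# Approximate variance: R(K) / h^2 * (1/n1 + 1/n2) / pooled density,
# where R(K) = 1 / (4 * pi) is the integrated squared Gaussian kernel.
pooled <- kde2(rbind(cases, controls), grid, h)
RK <- 1 / (4 * pi)
var_rho <- RK / h^2 * (1 / n1 + 1 / n2) / pooled

# Test statistic and upper-tailed p-values (small values flag elevated risk).
z <- rho_hat / sqrt(var_rho)
p_upper <- pnorm(z, lower.tail = FALSE)
```

Contours of p_upper at a chosen significance level, for example 0.05, then play the role of the tolerance contours described above.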

Printing and summarising objects
S3 methods (print.bivden, print.rrs, summary.bivden and summary.rrs) are available for the bivariate density and risk function objects.

Visualisation
Most applications of the relative risk function in practice require plotting the relative risk within the study region (especially for an inspection of tolerance contours). To this end, sparr provides a number of different ways to achieve attractive and flexible visualisation. The user may produce a heat plot, a perspective plot, a contour plot, or an interactive 3D perspective plot (that the user can pan around and zoom - courtesy of the powerful rgl package; see below) for either an estimated relative risk function or a bivariate density estimate. These capabilities are available through S3 support of the plot function; see
plot.bivden for visualising a single bivariate density estimate from bivariate.density, and
plot.rrs for visualisation of an estimated relative risk function from risk.
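The kinds of display described here can be imitated with base R graphics for any surface stored as a matrix. The sketch below does not use plot.bivden or plot.rrs; the surface z is a made-up stand-in for an estimated log relative risk, and the contour levels are arbitrary.

```r
# Base-graphics sketch of heat, contour and perspective plots of a surface.
gx <- seq(-2, 2, length.out = 40)
gy <- seq(-2, 2, length.out = 40)

# Made-up surface with a single peak near (1, 1).
z <- outer(gx, gy, function(x, y) 2 * exp(-((x - 1)^2 + (y - 1)^2) / 0.5) - 0.3)

pdf(NULL)  # null device so nothing is written; omit to draw on screen
image(gx, gy, z, xlab = "x", ylab = "y", main = "Heat plot with contours")
contour(gx, gy, z, levels = c(0.5, 1), add = TRUE)
persp(gx, gy, z, theta = 30, phi = 30, zlab = "log risk",
      main = "Perspective plot")
invisible(dev.off())

# Contour coordinates can also be extracted without plotting.
cl <- contourLines(gx, gy, z, levels = 1)
```

Overlaying contours on a heat plot in this way mirrors the usual presentation of tolerance contours on a risk surface.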

Dependencies

The sparr package depends upon/imports some other important contributions to CRAN in order to operate; their uses here are indicated:

spatstat - Fast Fourier transform assistance with fixed and adaptive density estimation, as well as region handling; see Baddeley and Turner (2005).
rgl - Interactive 3D plotting of densities and surfaces; see Adler and Murdoch (2009).
MASS - Utility support for internal functions; see Venables and Ripley (2002).

Citation

To cite use of sparr in publications, the user may refer to the following work:
Davies, T.M., Hazelton, M.L. and Marshall, J.C. (2011), sparr: Analyzing spatial relative risk using fixed and adaptive kernel density estimation in R, Journal of Statistical Software 39(1), 1-14.

Author(s)

T.M. Davies
Dept. of Mathematics & Statistics, University of Otago, Dunedin, New Zealand;
M.L. Hazelton and J.C. Marshall
Institute of Fundamental Sciences - Statistics, Massey University, Palmerston North, New Zealand.

Maintainer: T.M.D. tdavies@maths.otago.ac.nz
Feedback welcomed.

References

Abramson, I. (1982), On bandwidth variation in kernel estimates — a square root law, Annals of Statistics, 10(4), 1217-1223.
Adler, D. and Murdoch, D. (2009), rgl: 3D visualization device system (OpenGL). R package version 0.87; URL: http://CRAN.R-project.org/package=rgl
Baddeley, A. and Turner, R. (2005), Spatstat: an R package for analyzing spatial point patterns, Journal of Statistical Software, 12(6), 1-42.
Bithell, J.F. (1990), An application of density estimation to geographical epidemiology, Statistics in Medicine, 9, 691-701.
Bithell, J.F. (1991), Estimation of relative risk function, Statistics in Medicine, 10, 1745-1751.
Bowman, A.W. and Azzalini, A. (1997), Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations. Oxford University Press Inc., New York. ISBN 0-19-852396-3.
Davies, T.M. and Hazelton, M.L. (2010), Adaptive kernel estimation of spatial relative risk, Statistics in Medicine, 29(23), 2423-2437.
Davies, T.M., Jones, K. and Hazelton, M.L. (2015), Symmetric adaptive smoothing regimens for estimation of the spatial relative risk function, Submitted for publication.
Hazelton, M. L. (2008), Letter to the editor: Kernel estimation of risk surfaces without the need for edge correction, Statistics in Medicine, 27, 2269-2272.
Hazelton, M.L. and Davies, T.M. (2009), Inference based on kernel estimates of the relative risk function in geographical epidemiology, Biometrical Journal, 51(1), 98-109.
Kelsall, J.E. and Diggle, P.J. (1995a), Kernel estimation of relative risk, Bernoulli, 1, 3-16.
Kelsall, J.E. and Diggle, P.J. (1995b), Non-parametric estimation of spatial variation in relative risk, Statistics in Medicine, 14, 2335-2342.
Prince, M. I., Chetwynd, A., Diggle, P. J., Jarner, M., Metcalf, J. V. and James, O. F. W. (2001), The geographical distribution of primary biliary cirrhosis in a well-defined cohort, Hepatology, 34, 1083-1088.
Sabel, C. E., Gatrell, A. C., Löytönen, M., Maasilta, P. and Jokelainen, M. (2000), Modelling exposure opportunities: estimating relative risk for motor neurone disease in Finland, Social Science & Medicine, 50, 1121-1137.
Terrell, G.R. (1990), The maximal smoothing principle in density estimation, Journal of the American Statistical Association, 85, 470-477.
Venables, W. N. and Ripley, B. D. (2002), Modern Applied Statistics with S, Fourth Edition, Springer, New York.
Wand, M.P. and Jones, M.C. (1995), Kernel Smoothing, Chapman & Hall, London.
Wheeler, D. C. (2007), A comparison of spatial clustering and cluster detection techniques for childhood leukemia incidence in Ohio, 1996-2003, International Journal of Health Geographics, 6(13).
