Provides functions to estimate fixed and adaptive kernel-smoothed relative risk surfaces via the density-ratio method and perform subsequent inference.
Details
Package: sparr
Version: 0.3-8
Date: 2016-03-11
License: GPL (>= 2)
Kernel smoothing, and the flexibility afforded by this methodology, provides an attractive approach to estimating complex probability density functions. This is particularly of interest when exploring problems in geographical epidemiology, the study of disease dispersion throughout some spatial region, given a population. The so-called ‘relative risk surface’, constructed as a ratio of estimated case to control densities (Bithell, 1990; 1991), describes the variation in the ‘risk’ of the disease, given the underlying at-risk population. This is a technique that has been applied successfully for mainly exploratory purposes in a number of different examples (see for example Sabel et al., 2000; Prince et al., 2001; Wheeler, 2007).
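The density-ratio idea can be illustrated with a minimal, language-agnostic sketch in Python. It is not the package's implementation: it assumes a Gaussian kernel, a common fixed bandwidth for cases and controls, and no edge correction. The function names and the toy coordinates are hypothetical.

```python
import math


def gaussian_kde_2d(points, h, x, y):
    """Fixed-bandwidth bivariate Gaussian KDE at (x, y); no edge correction."""
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2.0 * h * h))
    return total / (len(points) * 2.0 * math.pi * h * h)


def log_relative_risk(cases, controls, h, x, y):
    """Log density ratio log f(x, y) - log g(x, y) using a common bandwidth h."""
    f = gaussian_kde_2d(cases, h, x, y)      # estimated case density
    g = gaussian_kde_2d(controls, h, x, y)   # estimated control density
    return math.log(f) - math.log(g)


# Hypothetical toy data on the unit square (illustrative values only).
cases = [(0.2, 0.2), (0.25, 0.3), (0.3, 0.2), (0.8, 0.7)]
controls = [(0.2, 0.8), (0.5, 0.5), (0.8, 0.2), (0.3, 0.3), (0.7, 0.7)]

# Positive where cases are over-represented relative to controls,
# negative where they are under-represented.
rho_cluster = log_relative_risk(cases, controls, 0.2, 0.25, 0.25)
rho_far = log_relative_risk(cases, controls, 0.2, 0.2, 0.8)
```

Working on the log scale makes the surface symmetric about zero: a value of zero means the case and control densities agree at that location.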
This package provides functions for bivariate kernel density estimation (KDE), implementing both fixed and ‘variable’ or ‘adaptive’ (Abramson, 1982) smoothing parameter options (see the function documentation for more information). A selection of bandwidth calculators for bivariate KDE and the relative risk function is provided, including one based on the maximal smoothing principle (Terrell, 1990), and others involving leave-one-out least-squares cross-validation (see below). In addition, the package can construct asymptotically derived p-value surfaces (‘tolerance’ contours of which signal statistically significant sub-regions of ‘extremity’ in a risk surface; see Hazelton and Davies, 2009; Davies and Hazelton, 2010), and it provides some flexible visualisation tools.
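As a rough illustration of adaptive smoothing, the sketch below computes observation-specific bandwidths following Abramson's square-root law: each bandwidth is inversely proportional to the square root of a pilot density estimate at that observation, rescaled by the geometric mean of those factors so that the bandwidths stay on the scale of the global bandwidth. It is a sketch under stated assumptions (Gaussian kernel, no edge correction, no trimming of extreme bandwidths), not the package's code; the names `pilot_h` and `global_h` are hypothetical.

```python
import math


def pilot_density(points, h, x, y):
    """Fixed-bandwidth bivariate Gaussian KDE at (x, y), used as the pilot."""
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2.0 * h * h))
    return total / (len(points) * 2.0 * math.pi * h * h)


def abramson_bandwidths(points, pilot_h, global_h):
    """One bandwidth per observation: global_h * pilot^(-1/2) / gamma."""
    pilot = [pilot_density(points, pilot_h, px, py) for px, py in points]
    inv_sqrt = [p ** -0.5 for p in pilot]
    # gamma is the geometric mean of the inverse-square-root pilot values.
    gamma = math.exp(sum(math.log(v) for v in inv_sqrt) / len(inv_sqrt))
    return [global_h * v / gamma for v in inv_sqrt]


def adaptive_density(points, bandwidths, x, y):
    """Adaptive KDE: each observation contributes a kernel with its own bandwidth."""
    total = 0.0
    for (px, py), h in zip(points, bandwidths):
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2.0 * h * h)) / (2.0 * math.pi * h * h)
    return total / len(points)


# Hypothetical data: a tight cluster of four points plus one isolated point.
points = [(0.1, 0.1), (0.12, 0.1), (0.1, 0.12), (0.12, 0.12), (0.9, 0.9)]
bw = abramson_bandwidths(points, pilot_h=0.1, global_h=0.1)
```

In this toy example the isolated observation receives a larger bandwidth than the clustered ones, which is the intended behaviour: less smoothing where data are dense, more where they are sparse.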
The content of sparr can be broken up as follows:
Datasets
PBC - a case/control planar point pattern (ppp) concerning liver disease in northern England.
chorley - a case/control dataset from the spatstat package, concerning the distribution of laryngeal cancer in an area of Lancashire, England.
Bandwidth calculators
OS - estimation of an isotropic smoothing parameter for bivariate KDE, based on the oversmoothing principle introduced by Terrell (1990).
NS - estimation of an isotropic smoothing parameter for bivariate KDE, based on the optimal value for a normal density (bivariate normal scale rule; see e.g. Wand and Jones, 1995).
LSCV.density - a least-squares cross-validated (LSCV) estimate of an isotropic bandwidth for bivariate KDE (see e.g. Bowman and Azzalini, 1997).
LSCV.risk - a least-squares cross-validated (LSCV) estimate of a jointly optimal, common isotropic case-control bandwidth for the kernel-smoothed risk function (see Kelsall and Diggle, 1995a;b and Hazelton, 2008).
Bivariate functions
KBivN - bivariate normal (Gaussian) kernel.
KBivQ - bivariate quartic (biweight) kernel.
bivariate.density - kernel density estimate of bivariate data; fixed or adaptive smoothing.
Relative risk and p-value surfaces
risk - estimation of a (log) relative risk function.
tolerance - calculation of asymptotic p-value surface.
Printing and summarising objects
S3 methods (print.bivden, print.rrs, summary.bivden and summary.rrs) are available for the bivariate density and risk function objects.
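The two kernels and the normal scale rule listed above can be written out in a few lines. The following Python sketch is illustrative only and does not reproduce the package's functions: the function names are hypothetical, the scalar spread is taken as the average of the two marginal standard deviations (an assumption of this sketch), and the n^(-1/6) rule shown is the one that arises for a Gaussian kernel in two dimensions.

```python
import math
import statistics


def gaussian_kernel_2d(u, v):
    """Bivariate standard normal kernel: exp(-(u^2 + v^2) / 2) / (2 * pi)."""
    return math.exp(-0.5 * (u * u + v * v)) / (2.0 * math.pi)


def quartic_kernel_2d(u, v):
    """Bivariate quartic (biweight) kernel: (3 / pi) * (1 - r^2)^2 on the unit disc."""
    r2 = u * u + v * v
    if r2 >= 1.0:
        return 0.0
    return 3.0 / math.pi * (1.0 - r2) ** 2


def normal_scale_bandwidth(points):
    """Normal scale rule for a bivariate Gaussian kernel: h = sigma * n^(-1/6).

    Assumption: sigma is the mean of the marginal sample standard deviations.
    """
    n = len(points)
    sd_x = statistics.stdev([p[0] for p in points])
    sd_y = statistics.stdev([p[1] for p in points])
    sigma = 0.5 * (sd_x + sd_y)
    return sigma * n ** (-1.0 / 6.0)


def fixed_density(points, h, x, y, kernel=gaussian_kernel_2d):
    """Fixed-bandwidth isotropic KDE at (x, y); no edge correction."""
    total = sum(kernel((x - px) / h, (y - py) / h) for px, py in points)
    return total / (len(points) * h * h)


# Hypothetical usage with four points on the corners of a square.
pts = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
h_ns = normal_scale_bandwidth(pts)
density_at_centre = fixed_density(pts, h_ns, 1.0, 1.0)
```

Both kernels integrate to one over the plane; the quartic kernel has compact support, so observations further than one bandwidth from the evaluation point contribute nothing.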
Visualisation
Most applications of the relative risk function in practice require plotting the relative risk within the study region (especially for an inspection of tolerance contours). To this end, sparr provides a number of different ways to achieve attractive and flexible visualisation. The user may produce a heat plot, a perspective plot, a contour plot, or an interactive 3D perspective plot (that the user can pan around and zoom - courtesy of the powerful rgl package; see below) for either an estimated relative risk function or a bivariate density estimate. These capabilities are available through S3 support of the plot function; see plot.bivden for visualising a single bivariate density estimate from bivariate.density, and plot.rrs for visualisation of an estimated relative risk function from risk.
Dependencies
The sparr package depends upon/imports some other important contributions to CRAN in order to operate; their uses here are indicated:
spatstat - Fast Fourier transform assistance with fixed and adaptive density estimation, as well as region handling; see Baddeley and Turner (2005).
rgl - Interactive 3D plotting of densities and surfaces; see Adler and Murdoch (2009).
MASS - Utility support for internal functions; see Venables and Ripley (2002).
Citation
To cite use of sparr in publications, the user may refer to the following work:
Davies, T.M., Hazelton, M.L. and Marshall, J.C. (2011), sparr: Analyzing spatial relative risk using fixed and adaptive kernel density estimation in R, Journal of Statistical Software, 39(1), 1-14.
Author(s)
T.M. Davies
Dept. of Mathematics & Statistics, University of Otago, Dunedin, New Zealand;
M.L. Hazelton and J.C. Marshall
Institute of Fundamental Sciences - Statistics, Massey University, Palmerston North, New Zealand.
References
Abramson, I. (1982), On bandwidth variation in kernel estimates - a square root law, Annals of Statistics, 10(4), 1217-1223.
Adler, D. and Murdoch, D. (2009), rgl: 3D visualization device system (OpenGL). R package version 0.87; URL: http://CRAN.R-project.org/package=rgl
Baddeley, A. and Turner, R. (2005), Spatstat: an R package for analyzing spatial point patterns, Journal of Statistical Software, 12(6), 1-42.
Bithell, J.F. (1990), An application of density estimation to geographical epidemiology, Statistics in Medicine, 9, 691-701.
Bithell, J.F. (1991), Estimation of relative risk function, Statistics in Medicine, 10, 1745-1751.
Bowman, A.W. and Azzalini, A. (1997), Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations. Oxford University Press Inc., New York. ISBN 0-19-852396-3.
Davies, T.M. and Hazelton, M.L. (2010), Adaptive kernel estimation of spatial relative risk, Statistics in Medicine, 29(23), 2423-2437.
Davies, T.M., Jones, K. and Hazelton, M.L. (2015), Symmetric adaptive smoothing regimens for estimation of the spatial relative risk function, Submitted for publication.
Hazelton, M. L. (2008), Letter to the editor: Kernel estimation of risk surfaces without the need for edge correction, Statistics in Medicine, 27, 2269-2272.
Hazelton, M.L. and Davies, T.M. (2009), Inference based on kernel estimates of the relative risk function in geographical epidemiology, Biometrical Journal, 51(1), 98-109.
Kelsall, J.E. and Diggle, P.J. (1995a), Kernel estimation of relative risk, Bernoulli, 1, 3-16.
Kelsall, J.E. and Diggle, P.J. (1995b), Non-parametric estimation of spatial variation in relative risk, Statistics in Medicine, 14, 2335-2342.
Prince, M. I., Chetwynd, A., Diggle, P. J., Jarner, M., Metcalf, J. V. and James, O. F. W. (2001), The geographical distribution of primary biliary cirrhosis in a well-defined cohort, Hepatology, 34, 1083-1088.
Sabel, C. E., Gatrell, A. C., Loytonen, M., Maasilta, P. and Jokelainen, M. (2000), Modelling exposure opportunities: estimating relative risk for motor neurone disease in Finland, Social Science & Medicine, 50, 1121-1137.
Terrell, G.R. (1990), The maximal smoothing principle in density estimation, Journal of the American Statistical Association, 85, 470-477.
Venables, W. N. and Ripley, B. D. (2002), Modern Applied Statistics with S, Fourth Edition, Springer, New York.
Wand, M.P. and Jones, M.C. (1995), Kernel Smoothing, Chapman & Hall, London.
Wheeler, D. C. (2007), A comparison of spatial clustering and cluster detection techniques for childhood leukemia incidence in Ohio, 1996-2003, International Journal of Health Geographics, 6(13).