This is a summary of the features and functionality of
georob, a package in R for robust geostatistical analyses.
Details
georob is a package for robust analyses of geostatistical data.
Such data, say y(s_i), are
recorded at a set of locations,
s_i, i=1,2, …, n, in a
domain G in R^d, d in (1,2,3), along
with covariate information
x_j(s_i), j=1,2,
…, p.
Model
We use the following model for the data
y_i=y(s_i):
Y(s_i) = Z(s_i) + ε = x(s_i)^T β + B(s_i) + ε_i,
where
Z(s_i) = x(s_i)^T β + B(s_i) is the so-called signal,
x(s_i)^Tβ
is the external drift,
{B(s)} is an unobserved stationary or
intrinsic spatial Gaussian random field with zero mean, and
ε_i is an
i.i.d error from a possibly long-tailed distribution with scale parameter
τ (τ^2 is usually called nugget effect).
In vector form the model is written as
Y = X β + B + ε,
where X is the model matrix with the
rows
x^T(s_i).
The (generalized) covariance matrix of the vector of
spatial Gaussian random effects
B
is denoted by
where
σ_n^2
is the variance of seemingly uncorrelated micro-scale variation in
B(s)
that cannot be resolved with the chosen sampling design,
σ^2 is the variance of the captured auto-correlated variation in
B(s),
σ_Z^2=σ_n^2+σ^2
is the signal variance, and
ξ=σ_n^2/σ^_Z^2.
To estimate both
σ_n^2 and τ^2 (and not only their sum), one needs
replicated measurements for some of the
s_i.
We define
V_{α}
to be the matrix with elements
(V_{α})_ij = γ_0 - γ(|A (s_i - s_j )|),
where the constant γ_0 is chosen large enough so that
V_{α}
is positive definite,
v() is a valid stationary or intrinsic variogram, and
A = A(α, f_1, f_2; ω, φ, ζ)
is a matrix that is used to model geometrically anisotropic auto-correlation.
In more detail,
A
maps an arbitrary point on an ellipsoidal surface with constant semivariance in
R^d,
centred on the origin, and having lengths of semi-principal axes,
p_1,
p_2,
p_3,
equal to
|p_1|=α,
|p_2|=f_1 α and
|p_3|=f_2 α,
0 < f_2 <= f_1 <= 1,
respectively, onto the surface of the unit ball centred on the origin.
The orientation of the ellipsoid is defined by the three angles
ω, φ and ζ:
ω
is the azimuth of p_1
(= angle between north and the projection
of
p_1
onto the x-y-plane,
measured from north to south positive clockwise in degrees),
φ
is 90 degrees minus the altitude of
p_1
(= angle between the zenith and
p_1,
measured from zenith to nadir positive clockwise in degrees), and
ζ
is the angle between
p_2
and the direction of the line, say y',
defined by the intersection between the
x-y-plane and the plane orthogonal to
p_1 running through the origin
(ζ is measured from y' positive counterclockwise in degrees).
To model geometrically anisotropic variograms in
R^2
one has to set φ=90 and f_2 = 1,
and for f_1 = f_2 = 1
one obtains the model for isotropic auto-correlation
with range parameter α.
Note that for isotropic auto-correlation the software processes data for
which d may exceed 3.
Two remarks are in order:
Clearly, the (generalized) covariance matrix of the observations
Y is given by
Cov[Y, Y^T] = τ^2 I + Γ_θ.
Depending on the context, the term “variogram parameters”
denotes sometimes all parameters of a geometrically anisotropic
variogram model, but in places only the parameters of an isotropic
variogram model, i.e. σ^2, …, α, … and
f_1, …, ζ are denoted by the term “anisotropy
parameters”.
Estimation
The unobserved spatial random effects
B at the data locations
s_i
and the model parameters
β and
θ^T =
(σ^2, σ^2_n, τ^2, α, f_1, f_2,
ω, φ, ζ, …)
are unknown and are estimated in georob either by Gaussian or
robust restricted maximum likelihood (REML) or
Gaussian maximum likelihood (ML). Here ...
denote further parameters of the variogram such as the smoothness parameter
of the Whittle-MatŽrn model.
In brief, the robust REML method is based on the insight that for
given θ the
kriging predictions (= BLUP) of
B and the generalized least
squares (GLS = ML) estimates of
β can be obtained
simultaneously by maximizing
The respective estimating equations can then by robustified by
replacing the standardized residuals, say
ε_i/τ,
by a bounded or redescending ψ-function,
ψ_c(ε_i/τ),
of them (e.g. Marona et al, 2006, chap. 2) and by
introducting suitable bias correction terms for Fisher consistency at
the Gaussian model,
see KŸnsch et al. (2011) for details.
The robustified estimating equations
are solved numerically by a combination of iterated re-weighted least squares
(IRWLS) to estimate B and
β for given
θ
and nonlinear root finding by the function
nleqslv of the R package nleqslv
to get θ.
The robustness of the procedure is controlled by the tuning parameter c
of the ψ_c-function. For c>=1000 the algorithm computes
Gaussian (RE)ML estimates and customary plug-in kriging predictions.
Instead of solving the Gaussian (RE)ML estimating equations, our software then
maximizes the Gaussian (restricted) loglikelihood using nlminb or
optim.
georob uses variogram models implemented in the R package
RandomFields (see RMmodel).
Currently, estimation of the parameters of the following models is
implemented:
For most variogram parameters, closed-form
expressions of dγ/dθ_i are used in the computations. However,
for the parameter ν of the models "RMbessel",
"RMmatern" and "RMwhittle"dγ/dν is evaluated numerically by the function
numericDeriv, and this results in an increase in
computing time when ν is estimated.
Prediction
Robust plug-in external-drift point kriging predictions
can be computed for an unsampled location
s_0
from the covariates
x(s_0),
the estimated parameters
hatβ,
hatθ
and the predicted random effects
hatB
by
where
Γ_hatθ
is the estimated (generalized) covariance matrix of
B and
γ^T_hatθ(s_0)
is the vector with the estimated (generalized) covariances between
B and
B(s_0).
Kriging variances can be computed as well, based on approximated covariances of
hatB and
hatβ
(see KŸnsch et al., 2011, and appendices of
Nussbaum et al., 2012, for details).
The package georob provides in addition software for computing
robust external-drift block kriging predictions. The required
integrals of the generalized covariance function are computed by
functions of the R package constrainedKriging.
Functionality
For the time being, the functionality of georob is limited
to robust geostatistical analyses of single response variables.
No software is currently available for robust multivariate geostatistical
analyses.
georob offers functions for:
Robustly fitting a spatial linear model to data that are
possibly contaminated by independent errors from a long-tailed
distribution by robust REML (see functions georob — which also
fits such models efficiently by Gaussian (RE)ML — profilelogLik and
control.georob).
Extracting estimated model components (see residuals.georob,
rstandard.georob, rstudent.georob,
ranef.georob).
Robustly estimating sample variograms and for fitting
variogram model functions to them (see
sample.variogram and fit.variogram.model).
Model building by forward and backward selection of
covariates for the external drift
(see waldtest.georob, step.georob,
add1.georob, drop1.georob,
extractAIC.georob, logLik.georob,
deviance.georob).
For a robust fit, the loglikelihood is not defined. The function then
computes the (restricted) loglikelihood of an equivalent Gaussian model with
heteroscedastic nugget (see deviance.georob for details).
Assessing the goodness-of-fit and predictive power of the model by
K-fold cross-validation (see cv.georob and
validate.predictions).
Computing robust external-drift point and block kriging
predictions (see predict.georob, control.predict.georob).
Unbiased back-transformation of both point and block
kriging predictions of log-transformed data to the original scale of
the measurements (see lgnpp).
Nussbaum, M., Papritz, A., Baltensweiler, A. and Walthert, L. (2012)
Organic Carbon Stocks of Swiss Forest Soils,
Institute of Terrestrial Ecosystems, ETH Zurich and
Swiss Federal Institute for Forest, Snow and Landscape Research
(WSL), pp. 51.
http://e-collection.library.ethz.ch/eserv/eth:6027/eth-6027-01.pdf
KŸnsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (in
preparation) Robust Geostatistics.
KŸnsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (2011) Robust
estimation of the external drift and the variogram of spatial data.
Proceedings of the ISI 58th World Statistics Congress of the International
Statistical Institute.
http://e-collection.library.ethz.ch/eserv/eth:7080/eth-7080-01.pdf
Maronna, R. A., Martin, R. D. and Yohai, V. J. (2006) Robust Statistics Theory and
Methods, John Wiley & Sons.
See Also
georob for (robust) fitting of spatial linear models;
georobObject for a description of the class georob;
plot.georob for display of RE(ML) variogram estimates;
control.georob for controlling the behaviour of georob;
cv.georob for assessing the goodness of a fit by georob;
predict.georob for computing robust kriging predictions; and finally
georobModelBuilding for stepwise building models of class georob;
georobMethods for further methods for the class georob,
sample.variogram and fit.variogram.model
for robust estimation and modelling of sample variograms.