integer; if larger than 1, then apply n-fold cross validation;
if nfold equals nrow(data) (the default), apply leave-one-out cross
validation; if set to e.g. 5, five-fold cross validation is done. To specify the
folds, pass an integer vector of length nrow(data) with fold indexes.
remove.all
logical; if TRUE, remove observations at cross validation
locations not only for the first, but for all subsequent variables as well
verbose
logical; if FALSE, progress bar is suppressed
all.residuals
logical; if TRUE, residuals for all variables are
returned instead of for the first variable only
...
other arguments that will be passed to predict
in case of gstat.cv, or to gstat in case of krige.cv
formula
formula that defines the dependent variable as a linear
model of independent variables; suppose the dependent variable has name
z, for ordinary and simple kriging use the formula z~1;
for simple kriging also define beta (see below); for universal
kriging, suppose z is linearly dependent on x and y,
use the formula z~x+y
locations
formula with only independent variables that define the
spatial data locations (coordinates), e.g. ~x+y, OR data object
deriving from class Spatial, which has a
coordinates method to extract its coordinates.
data
data frame; should contain the dependent variable, independent
variables, and coordinates; only to be provided if locations is a formula
model
variogram model of dependent variable (or its residuals),
defined by a call to vgm or fit.variogram
beta
only for simple kriging (and simulation based on simple
kriging); vector with the trend coefficients (including intercept);
if no independent variables are defined the model only contains an
intercept and this should be the simple kriging mean
nmax
for local kriging: the number of nearest observations that
should be used for a kriging prediction or simulation, where nearest
is defined in terms of the space of the spatial locations. By default,
all observations are used
nmin
for local kriging: if the number of nearest observations
within distance maxdist is less than nmin, a missing
value will be generated; see maxdist
maxdist
for local kriging: only observations within a distance
of maxdist from the prediction location are used for prediction
or simulation; if combined with nmax, both criteria apply
debug.level
print debugging information; 0 suppresses
debug information
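
The arguments above can be combined as in the following minimal sketch. It assumes the sp and gstat packages are installed and uses the meuse data set shipped with sp. The variogram parameters are illustrative values, not a fitted model, and the names n, m, cv5, folds and cv_user are chosen here for illustration only.

```r
library(sp)     # meuse data set and the Spatial classes
library(gstat)  # vgm() and krige.cv()

data(meuse)
n <- nrow(meuse)              # number of observations, taken before conversion
coordinates(meuse) <- ~x+y    # locations now derive from class Spatial

# Variogram model for log(zinc); illustrative parameter values (assumption)
m <- vgm(0.59, "Sph", 874, 0.04)

# Five-fold cross validation with local ordinary kriging (40 nearest observations)
cv5 <- krige.cv(log(zinc) ~ 1, meuse, m, nmax = 40, nfold = 5)

# User-specified folds: an integer vector of length n with fold indexes
set.seed(1)
folds <- sample(rep(1:5, length.out = n))
cv_user <- krige.cv(log(zinc) ~ 1, meuse, m, nmax = 40, nfold = folds)
```

Omitting nfold would give leave-one-out cross validation, which is the default.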
Details
Leave-one-out cross validation (LOOCV) visits a data point,
predicts the value at that location with the observed value left out,
and then proceeds with the next data point. (The observed value is left
out because kriging would otherwise predict the value itself.) N-fold
cross validation partitions the data set into N parts. For all
observations in a part, predictions are made based on the remaining N-1
parts; this is repeated for each of the N parts. N-fold cross validation
may be faster than LOOCV, since predictions are made in N rounds rather
than in one round per observation.
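
The partitioning logic described above can be sketched in base R as follows. This is not how gstat implements it: the predictor is a plain linear model standing in for kriging, and the toy data and all names (d, N, fold, pred) are hypothetical.

```r
# Toy data with hypothetical values, for illustration only
set.seed(42)
d <- data.frame(x = runif(20), y = runif(20))
d$z <- 1 + 2 * d$x + rnorm(20, sd = 0.1)

N <- 5
# Assign each observation to one of N parts of equal size
fold <- sample(rep(seq_len(N), length.out = nrow(d)))

pred <- rep(NA_real_, nrow(d))
for (k in seq_len(N)) {
  test  <- fold == k     # observations in the current part
  train <- !test         # observations in the remaining N-1 parts
  # Stand-in predictor (NOT kriging): linear model fitted on the training parts
  fit <- lm(z ~ x, data = d[train, ])
  pred[test] <- predict(fit, newdata = d[test, ])
}

# Residual: observed value minus cross validated prediction
residual <- d$z - pred
```

Setting N to nrow(d) turns the same loop into leave-one-out cross validation, with one observation per part.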
Value
data frame containing the coordinates of data or those
of the first variable in object, and columns of prediction and
prediction variance of cross validated data points, observed values,
residuals, zscore (residual divided by kriging standard error), and fold.
If all.residuals is TRUE, a data frame with residuals for all
variables is returned, without coordinates.
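
As a hedged illustration of how the returned columns might be summarised, the sketch below defines a hypothetical helper, cv_summary, that relies only on the columns observed, residual and zscore described above. The toy input uses made-up values, and its prediction and variance column names (var1.pred, var1.var) are an assumption about how the output is labelled.

```r
# Hypothetical helper: summary statistics for cross validation output
cv_summary <- function(cv) {
  cv <- as.data.frame(cv)   # also accepts Spatial objects that convert to a data frame
  c(mean_error   = mean(cv$residual),            # ideally close to 0
    rmse         = sqrt(mean(cv$residual^2)),    # ideally small
    msdr         = mean(cv$zscore^2),            # mean squared zscore, ideally close to 1
    cor_obs_pred = cor(cv$observed, cv$observed - cv$residual))
}

# Toy input with hypothetical values, only to show the expected structure
toy <- data.frame(var1.pred = c(1.0, 2.0, 3.0, 4.0),
                  var1.var  = c(0.25, 0.25, 1.0, 1.0),
                  observed  = c(1.5, 1.5, 3.0, 5.0))
toy$residual <- toy$observed - toy$var1.pred        # observed minus prediction
toy$zscore   <- toy$residual / sqrt(toy$var1.var)   # residual / kriging standard error

s <- cv_summary(toy)
```

The same helper could be applied to the object returned by krige.cv or gstat.cv, provided all.residuals is FALSE so that the columns listed above are present.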
Methods
formula = "formula", locations = "formula"
locations specifies which coordinates in data refer to spatial coordinates
formula = "formula", locations = "Spatial"
Object locations knows about its own spatial locations
Note
Leave-one-out cross validation seems to be much faster in plain
(stand-alone) gstat; apparently, quite a bit of the effort is spent moving
data around between R and gstat.