R: Summary Statistics of (Cross-)Validation Prediction Errors
validate.predictions
R Documentation
Summary Statistics of (Cross-)Validation Prediction Errors
Description
Functions to compute and plot summary statistics of prediction errors to
(cross-)validate fitted spatial linear models by the criteria proposed by
Gneiting et al. (2007) for assessing probabilistic forecasts.
Usage
validate.predictions(data, pred, se.pred,
statistic = c("crps", "pit", "mc", "bs", "st"), ncutoff = NULL)
## S3 method for class 'cv.georob'
plot(x,
type = c("sc", "lgn.sc", "ta", "qq", "hist.pit", "ecdf.pit", "mc", "bs"),
smooth = TRUE, span = 2/3, ncutoff = NULL, add = FALSE,
col, pch, lty, main, xlab, ylab, ...)
## S3 method for class 'cv.georob'
print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'cv.georob'
rstudent(model, ...)
## S3 method for class 'cv.georob'
summary(object, se = FALSE, ...)
Arguments
data
a numeric vector with observations about a response
(mandatory argument).
pred
a numeric vector with predictions for the response (mandatory
argument).
se.pred
a numeric vector with prediction standard errors
(mandatory argument).
statistic
character keyword defining what statistic of the
prediction errors should be computed.
Possible values are (see Details):
"crps": continuous ranked probability score (default),
"pit": probability integral transform,
"mc": average predictive distribution (marginal
calibration),
"bs": Brier score,
"st": mean and dispersion statistics of (standardized)
prediction errors.
ncutoff
positive integer (N) giving the number of quantiles,
for which CDFs are evaluated (type = "mc"), or the number of
thresholds for which the Brier score is computed (type = "bs"),
see Details (default: min(500, length(data))).
x, model, object
objects of class cv.georob.
digits
positive integer indicating the number of decimal digits to print.
type
character keyword defining what type of plot is created by the
plot.cv.georob. Possible values are:
"sc": a scatterplot of the (possibly log-transformed) response
vs. the respective predictions (default).
"lgn.sc": a scatterplot of the untransformed response
against back- transformed predictions of the log-transformed response.
"ta": Tukey-Anscombe plot (plot of standardized prediction
errors vs. predictions).
"qq": normal QQ plot of standardized prediction errors.
"hist.pit": histogram of probability integral transform, see
Details.
"ecdf.pit": empirical cdf of probability integral
transform, see Details.
"mc": a marginal calibration plot, see Details,
"bs": a plot of Brier score vs. threshold, see
Details.
smooth
control whether scatter plots of data vs. predictions
should be smoothed by loess.smooth.
span
smoothness parameter for loess (see loess.smooth).
add
logical controlling wether the current high-level plot is
added to an existing graphics without cleaning the frame before (default:
FALSE).
main, xlab, ylab
title and axes labels of plot.
col, pch, lty
color, symbol and line type.
se
logical controlling if the standard errors of the averaged
continuous ranked probability score and of the mean and dispersion
statistics of the prediction errors (see Details) are computed
from the respective values computed for the K cross-validation
subsets (default: FALSE).
...
additional arguments passed to the methods.
Details
validate.predictions computes the items required to evaluate (and
plot) the diagnostic criteria proposed by Gneiting et al. (2007) for
assessing the calibration and the sharpness of
probabilistic predictions. To this aim, validate.predictions uses
the assumption that the prediction errors
Y(s)-hatY(s)
follow normal distributions with zero mean and standard deviations equal
to the kriging standard errors. This assumption is used to compute
the probability integral transform (PIT),
PIT_i = F_i(y_i),
where F_i(y_i) denotes the predictive CDF evaluted at y_i,
the value of the ith (cross-)validation datum,
the average predictive CDF
barF_n(y)=1/n ∑_{i=1}^n F_i(y),
where n is the number of (cross-)validation observations and the
F_i are evaluated at N quantiles equal to the set of
distinct y_i (or a subset of size N of them),
the Brier Score
BS(y) = 1/n ∑_{i=1}^n (F_i(y) - I(y_i ≤q y) )^2,
where I(x) is the indicator function for the event x, and
the Brier score is again evaluated at the unique values of the (cross-)validation
observations (or a subset of size N of them),
the averaged continuous ranked probability score, CRPS, a
strictly proper scoring criterion to rank predictions, which is related
to the Brier score by
CRPS = int_{-∞}^∞ BS(y) dy.
Gneiting et al. (2007) proposed the following plots to validate
probabilistic predictions:
A histogram of the PIT values. For ideal predictions, with
observed coverages of prediction intervals matching nominal
coverages, the PIT values have an uniform distribution.
Plots of barF_n(y) and of the empirical CDF of
the data, say hat{G}_n(y), and of their
difference, barF_n(y)-hat{G}_n(y)
vs y. The forecasts are said to be marginally calibrated
if barF_n(y) and hat{G}_n(y)
match.
A plot of BS(y) vs. y. Probabilistic
predictions are said to be sharp if the area under this curve,
which equals CRPS, is minimized.
The plot method for class cv.georob allows to create
these plots, along with scatterplots of observations and predictions,
Tukey-Anscombe and normal QQ plots of the standardized prediction
errors.
summary.cv.georob computes the mean and dispersion statistics
of the (standardized) prediction errors (by a call to
validate.prediction with argument statistic = "st", see
Value) and the averaged continuous ranked probability score
(crps). If present in the cv.georob object, the error
statistics are also computed for the errors of the unbiasedly
back-transformed predictions of a log-transformed response. If se
is TRUE then these statistics are evaluated separately for the
K cross-validation subsets and the standard errors of the means of
these statistics are returned as well.
The print method for class cv.georob returns the mean
and dispersion statistics of the (standardized) prediction errors.
The method rstudent returns for class cv.georob the
standardized prediction errors.
Value
Depending on the argument statistic, the function
validate.predictions returns
a numeric vector of PIT values if statistic is equal to "pit",
a named numeric vector with summary statistics of the
(standardized) prediction errors if statistic is equal to "st". The
following statistics are computed:
me
mean prediction error
mede
median prediction error
rmse
root mean squared prediction error
made
median absolute prediction error
qne
Qn dispersion measure of prediction errors
(see Qn)
msse
mean squared standardized prediction error
medsse
median squared standardized prediction error
a data frame if statistic is equal to "mc" or
"bs" with the components (see Details):
z
the sorted unique (cross-)validation
observations (or a subset of size
ncutoff of them)
ghat
the empirical CDF of the (cross-)validation
observations hat{G}_n(y)
fbar
the average predictive distribution barF_n(y)
bs
the Brier score BS(y)
The function rstudent.cv.georob returns a numeric vector with
the standardized cross-validation prediction errors.
Gneiting, T., Balabdaoui, F. and Raftery, A. E.(2007) Probabilistic
forecasts, calibration and sharpness. Journal of the Royal Statistical
Society Series B69, 243–268.
See Also
georob for (robust) fitting of spatial linear models;
cv.georob for assessing the goodness of a fit by georob.
Examples
## Not run:
data(meuse)
r.logzn <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y,
variogram.model = "RMexp",
param = c(variance = 0.15, nugget = 0.05, scale = 200),
tuning.psi = 1)
r.logzn.cv.1 <- cv(r.logzn, seed = 1, lgn = TRUE)
r.logzn.cv.2 <- cv(r.logzn, formula = .~. + ffreq, seed = 1, lgn = TRUE)
summary(r.logzn.cv.1, se = TRUE)
summary(r.logzn.cv.2, se = TRUE)
op <- par(mfrow = c(2,2))
plot(r.logzn.cv.1, type = "lgn.sc")
plot(r.logzn.cv.2, type = "lgn.sc", add = TRUE, col = "red")
abline(0, 1, lty= "dotted")
plot(r.logzn.cv.1, type = "ta")
plot(r.logzn.cv.2, type = "ta", add = TRUE, col = "red")
abline(h=0, lty= "dotted")
plot(r.logzn.cv.2, type = "mc", add = TRUE, col = "red")
plot(r.logzn.cv.1, type = "bs")
plot(r.logzn.cv.2, type = "bs", add = TRUE, col = "red")
legend("topright", lty = 1, col = c("black", "red"), bty = "n",
legend = c("log(Zn) ~ sqrt(dist)", "log(Zn) ~ sqrt(dist) + ffreq"))
par(op)
## End(Not run)