R: Predict method for Treed Gaussian process models
predict.tgp
R Documentation
Predict method for Treed Gaussian process models
Description
This generic prediction method was designed to obtain samples
from the posterior predictive distribution after the b*
functions have finished. Samples, or kriging mean and variance
estimates, can be obtained from the MAP model encoded in the
"tgp"-class object, or this parameterization can be used
as a jumping-off point in obtaining further samples from the
joint posterior and posterior predictive distributions
Usage
## S3 method for class 'tgp'
predict(object, XX = NULL, BTE = c(0, 1, 1), R = 1,
MAP = TRUE, pred.n = TRUE, krige = TRUE, zcov = FALSE,
Ds2x = FALSE, improv = FALSE, sens.p = NULL, trace = FALSE,
verb = 0, ...)
Arguments
object
"tgp"-class object that is the output of one of
the b* functions: blm, btlm, bgp, bgpllm, btgp, or
btgpllm
XX
Optional data.frame, matrix,
or vector of predictive input locations
with ncol(XX) == ncol(object$X)
BTE
3-vector of Monte Carlo parameters (B)urn in, (T)otal, and
(E)very. Predictive samples are saved every E MCMC rounds starting
at round B, stopping at T. The default BTE=c(0,1,1) is
specified to give the kriging means and variances as outputs, plus
one sample from the posterior predictive distribution
R
Number of repeats or restarts of BTE MCMC rounds,
default R=1 is no restarts
MAP
When TRUE (default) predictive data (i.e.,
kriging mean and variance estimates, and samples from the
posterior predictive distribution) are obtained for the
fixed MAP model encoded in object. Otherwise,
when MAP=FALSE, samples from the joint posterior
of the model parameters (i.e., tree and GPs) and from the posterior
predictive distribution are obtained, starting from the MAP model and
proceeding just as the b* functions do
pred.n
TRUE (default) value results in prediction at
the inputs X; FALSE
skips prediction at X resulting in a faster
implementation
krige
TRUE (default) value results in collection of
kriging means and variances at predictive (and/or data)
locations; FALSE skips the gathering of kriging statistics
giving a savings in storage
zcov
If TRUE then the predictive covariance matrix is
calculated; this can be computationally (and memory) intensive if
X or XX is large. Otherwise only the variances
(diagonal of covariance matrices) are calculated (default). See
outputs Zp.s2, ZZ.s2, etc., below
Ds2x
TRUE results in ALC (Active Learning–Cohn)
computation of expected reduction in uncertainty calculations at the
X locations, which can be used for adaptive sampling;
FALSE (default) skips this computation, resulting in
a faster implementation
improv
TRUE results in samples from the
improvement at locations XX with respect to the observed
data minimum. These samples are used to calculate the expected
improvement over XX, as well as to rank all of the points in
XX in the order that they should be sampled to minimize the
expected multivariate improvement (refer to Schonlau et al., 1998).
Alternatively, improv can be set to any positive integer 'g',
in which case the ranking is performed with respect to the expectation
for improvement raised to the power 'g'. Increasing 'g' leads to
rankings that are more oriented towards a global optimization.
The option FALSE (default) skips these computations,
resulting in a faster implementation. Optionally, a two-vector
can be supplied where improv[2] is interpreted as the
(maximum) number of points to rank by improvement.
See the note in btgp documentation.
If not specified, then the larger of 10% of nn = nrow(XX)
and min(10, nn) is taken by default
sens.p
Either NULL or a vector of parameters for
sensitivity analysis, built by the function sens.
Refer there for details
trace
TRUE results in a saving of samples from the
posterior distribution for most of the parameters in the model. The
default is FALSE for speed/storage reasons. See note below
verb
Level of verbosity of R-console print statements: from 0
(default: none); 1 which shows the “progress meter”; 2
includes an echo of initialization parameters; up to 3 and 4 (max)
with more info about successful tree operations
...
Ellipses are not used in the current version
of predict.tgp. They are only included in order to
maintain S3 generic/method consistency
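The arithmetic implied by the BTE, R, and improv descriptions above can be sketched in base R. Both helpers below (n_saved and default_n_rank) are hypothetical illustrations, not tgp functions. The sample count assumes one predictive sample per E rounds between B and T, repeated R times, and the ceiling applied to "10% of nn" is likewise an assumption; the exact conventions used inside tgp may differ.

```r
## Hypothetical helper (not part of tgp): approximate number of
## predictive samples saved for a given BTE and R, assuming one
## sample every E rounds between B and T, repeated R times.
n_saved <- function(BTE, R = 1) {
  stopifnot(length(BTE) == 3, BTE[2] > BTE[1], BTE[3] >= 1)
  R * floor((BTE[2] - BTE[1]) / BTE[3])
}

n_saved(c(0, 1, 1))       # default: a single sample
n_saved(c(0, 1000, 1))    # 1000 samples
n_saved(c(0, 2000, 2))    # also 1000 samples, thinned by 2

## Hypothetical helper (not part of tgp): default number of points
## ranked by improvement when improv[2] is not given, i.e. the larger
## of 10% of nn and min(10, nn); rounding up is an assumption.
default_n_rank <- function(nn) max(ceiling(nn / 10), min(10, nn))

default_n_rank(200)       # 20
default_n_rank(50)        # 10
default_n_rank(4)         # 4
```

Under these assumptions, BTE=c(0,1000,1) and BTE=c(0,2000,2) yield the same number of samples, but the second runs twice as many MCMC rounds and keeps every other one.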
Details
While this function was designed with prediction in mind, it is
actually far more general. It allows a continuation of
MCMC sampling where the b* function left off (when
MAP=FALSE) with a possibly new set of predictive locations
XX. The intended use of this function is to obtain quick
kriging-style predictions for a previously-fit MAP estimate
(contained in a "tgp"-class object)
on a new set of predictive locations XX. However,
it can also be used simply to extend the search for an MAP model
when MAP=FALSE, pred.n=FALSE, and XX=NULL
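The two uses described above can be sketched as follows, reusing the motorcycle data from MASS. The BTE values are arbitrary choices for illustration, and the object names out.more and out.kp are made up here; this is a sketch under those assumptions rather than a recommended setup.

```r
library(tgp)
library(MASS)  # provides the mcycle data

## initial fit without predictive sampling
out <- btgpllm(X = mcycle[, 1], Z = mcycle[, 2], bprior = "b0",
               pred.n = FALSE, verb = 0)

## extend the search for a MAP model: continue MCMC from the stored
## MAP with no predictive locations at all
out.more <- predict(out, XX = NULL, MAP = FALSE, pred.n = FALSE,
                    BTE = c(0, 2000, 2))

## the result is again a "tgp"-class object, so it can be passed back
## to predict() for quick kriging-style predictions at new locations
XX <- seq(2.4, 56.7, length = 200)
out.kp <- predict(out.more, XX = XX, pred.n = FALSE)
plot(out.kp, center = "km", as = "ks2")
```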
Value
The output is the same as, or a subset of, the output produced
by the b* functions; see, for example, btgp
Note
It is important to note that this function is not a replacement
for supplying XX to the b* functions, which is the only
way to get fully Bayesian samples from the posterior prediction
at new inputs. It is only intended as a post-analysis (diagnostic)
tool.
Inputs XX containing NaN, NA, or Inf are
discarded with non-fatal warnings. Upon execution, MCMC reports are
made every 1,000 rounds to indicate progress.
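To see in advance which rows of XX would be discarded, the inputs can be screened before calling predict. The helper screen_XX below is hypothetical (it is not part of tgp) and only mirrors the rule stated above, together with the ncol(XX) == ncol(object$X) requirement from the Arguments section; it is a base R sketch, not a description of how tgp filters inputs internally.

```r
## Hypothetical helper (not part of tgp): check the column count and
## drop rows of XX that contain NaN, NA, or Inf.
screen_XX <- function(XX, X) {
  XX <- as.matrix(XX)
  X <- as.matrix(X)
  stopifnot(ncol(XX) == ncol(X))
  keep <- apply(XX, 1, function(row) all(is.finite(row)))
  list(XX = XX[keep, , drop = FALSE], dropped = which(!keep))
}

## toy two-column inputs for illustration
X <- cbind(x1 = 1:5, x2 = c(0.1, 0.2, 0.3, 0.4, 0.5))
XX <- rbind(c(1.5, 0.15), c(NA, 0.2), c(3, Inf), c(4.5, NaN), c(2, 0.25))

res <- screen_XX(XX, X)
res$dropped    # rows 2, 3 and 4 contain NA, Inf and NaN
nrow(res$XX)   # 2 rows remain
```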
If XX contains locations that fall outside the range of the X
inputs provided to the original b* function, then predictions at those
locations will not be extrapolated properly, due to the way that
bounding rectangles are defined in the original run. For a workaround,
set out$Xsplit <- rbind(X, XX) before running predict on
out.
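A minimal sketch of this workaround on the one-dimensional motorcycle data follows. The range of XX is chosen arbitrarily so that it extends beyond the observed times, and wrapping the vectors in as.matrix so that rbind stacks them as rows of a one-column matrix is an assumption made here for the one-dimensional case.

```r
library(tgp)
library(MASS)  # provides the mcycle data

X <- mcycle[, 1]
Z <- mcycle[, 2]
out <- btgpllm(X = X, Z = Z, bprior = "b0", pred.n = FALSE, verb = 0)

## predictive locations extending beyond the range of X
XX <- seq(0, 70, length = 100)

## widen the bounding rectangle before predicting; as.matrix() turns
## each vector into a one-column matrix so rbind() stacks rows
out$Xsplit <- rbind(as.matrix(X), as.matrix(XX))

out.kp <- predict(out, XX = XX, pred.n = FALSE)
plot(out.kp, center = "km", as = "ks2")
```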
See note for btgp or another b* function
regarding the handling and appropriate specification of traces.
The "tgp" class output produced by predict.tgp can
also be used as input to predict.tgp, as well as others (e.g.,
plot.tgp).
Examples
## revisit the Motorcycle data
library(tgp)
require(MASS)
## fit a btgpllm without predictive sampling (for speed)
out <- btgpllm(X=mcycle[,1], Z=mcycle[,2], bprior="b0",
pred.n=FALSE)
## nothing to plot here because there is no predictive data
## save the "tgp" class output object for use later and
save(out, file="out.Rsave")
## then remove it (for illustrative purposes)
out <- NULL
## (now imagine emailing the out.Rsave file to a friend who
## then performs the following in order to use your fitted
## tgp model on his/her own predictive locations)
## load in the "tgp" class object we just saved
load("out.Rsave")
## new predictive locations
XX <- seq(2.4, 56.7, length=200)
## now obtain kriging estimates from the MAP model
out.kp <- predict(out, XX=XX, pred.n=FALSE)
plot(out.kp, center="km", as="ks2")
## actually obtain predictive samples from the MAP
out.p <- predict(out, XX=XX, pred.n=FALSE, BTE=c(0,1000,1))
plot(out.p)
## use the MAP as a jumping-off point for more sampling
out2 <- predict(out, XX, pred.n=FALSE, BTE=c(0,2000,2),
MAP=FALSE, verb=1)
plot(out2)
## (generally you would not want to remove the file)
unlink("out.Rsave")