Last data update: 2014.03.03

predict.tgp

R Documentation

Predict method for Treed Gaussian process models

Description

This generic prediction method was designed to obtain samples from the posterior predictive distribution after the b* functions have finished. Samples, or kriging mean and variance estimates, can be obtained from the MAP model encoded in the "tgp"-class object. Alternatively, this parameterization can be used as a jumping-off point for obtaining further samples from the joint posterior and posterior predictive distributions.

Usage

## S3 method for class 'tgp'
predict(object, XX = NULL, BTE = c(0, 1, 1), R = 1,
            MAP = TRUE, pred.n = TRUE, krige = TRUE, zcov = FALSE,
            Ds2x = FALSE, improv = FALSE, sens.p = NULL, trace = FALSE,
            verb = 0, ...)

Arguments

`object`	`"tgp"`-class object that is the output of one of the `b*` functions: `blm`, `btlm`, `bgp`, `bgpllm`, `btgp`, or `btgpllm`
`XX`	Optional `data.frame`, `matrix`, or vector of predictive input locations with `ncol(XX) == ncol(object$X)`
`BTE`	3-vector of Monte Carlo parameters (B)urn in, (T)otal, and (E)very. Predictive samples are saved every E MCMC rounds starting at round B, stopping at T. The default `BTE=c(0,1,1)` is specified to give the kriging means and variances as outputs, plus one sample from the posterior predictive distribution
`R`	Number of repeats or restarts of `BTE` MCMC rounds, default `R=1` is no restarts
`MAP`	When `TRUE` (default), predictive data (i.e., kriging mean and variance estimates, and samples from the posterior predictive distribution) are obtained for the fixed MAP model encoded in `object`. Otherwise, when `MAP=FALSE`, samples from the joint posterior of the model parameters (i.e., tree and GPs) and from the posterior predictive distribution are obtained, starting from the MAP model and proceeding just as the `b*` functions do
`pred.n`	`TRUE` (default) value results in prediction at the inputs `X`; `FALSE` skips prediction at `X` resulting in a faster implementation
`krige`	`TRUE` (default) value results in collection of kriging means and variances at predictive (and/or data) locations; `FALSE` skips the gathering of kriging statistics giving a savings in storage
`zcov`	If `TRUE` then the predictive covariance matrix is calculated, which can be computationally (and memory) intensive if `X` or `XX` is large. Otherwise only the variances (diagonal of covariance matrices) are calculated (default). See outputs `Zp.s2`, `ZZ.s2`, etc., below
`Ds2x`	`TRUE` results in ALC (Active Learning–Cohn) computation of expected reduction in uncertainty calculations at the `X` locations, which can be used for adaptive sampling; `FALSE` (default) skips this computation, resulting in a faster implementation
`improv`	`TRUE` results in samples from the improvement at locations `XX` with respect to the observed data minimum. These samples are used to calculate the expected improvement over `XX`, as well as to rank all of the points in `XX` in the order that they should be sampled to minimize the expected multivariate improvement (refer to Schonlau et al., 1998). Alternatively, `improv` can be set to any positive integer 'g', in which case the ranking is performed with respect to the expectation for improvement raised to the power 'g'. Increasing 'g' leads to rankings that are more oriented towards a global optimization. The option `FALSE` (default) skips these computations, resulting in a faster implementation. Optionally, a two-vector can be supplied where `improv[2]` is interpreted as the (maximum) number of points to rank by improvement; if it is not specified, the larger of 10% of `nn = nrow(XX)` and `min(10, nn)` is taken by default. See the note in the `btgp` documentation
`sens.p`	Either `NULL` or a vector of parameters for sensitivity analysis, built by the function `sens`. Refer there for details
`trace`	`TRUE` results in a saving of samples from the posterior distribution for most of the parameters in the model. The default is `FALSE` for speed/storage reasons. See note below
`verb`	Level of verbosity of R-console print statements: from 0 (default: none); 1 which shows the “progress meter”; 2 includes an echo of initialization parameters; up to 3 and 4 (max) with more info about successful tree operations
`...`	Ellipses are not used in the current version of `predict.tgp`. They are only included in order to maintain S3 generic/method consistency
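
Two of the defaults described above reduce to simple arithmetic. The sketch below uses two hypothetical helpers, `n_saved` and `n_ranked`, neither of which is part of tgp. It assumes that samples are counted as (T - B)/E per repeat, and that the default ranking size is exactly the larger of 10% of `nn` and `min(10, nn)`. How tgp rounds fractional values is not stated here, so the examples avoid them.

```r
# Hypothetical helper (not part of tgp): number of predictive samples
# saved, assuming one sample every E rounds between B and T, per repeat.
n_saved <- function(BTE, R = 1) {
  stopifnot(length(BTE) == 3, BTE[2] > BTE[1], BTE[3] >= 1)
  R * floor((BTE[2] - BTE[1]) / BTE[3])
}

# Hypothetical helper (not part of tgp): default maximum number of points
# ranked by improvement when improv[2] is not supplied.
n_ranked <- function(nn) max(0.1 * nn, min(10, nn))

n_saved(c(0, 1, 1))      # the default BTE: a single sample
n_saved(c(0, 2000, 2))   # 1000 samples
n_ranked(200)            # 20, since 10% of nn exceeds min(10, nn)
n_ranked(50)             # 10, since min(10, nn) exceeds 10% of nn
```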

Details

While this function was designed with prediction in mind, it is actually far more general. It allows a continuation of MCMC sampling where the b* function left off (when MAP=FALSE) with a possibly new set of predictive locations XX. The intended use of this function is to obtain quick kriging-style predictions for a previously-fit MAP estimate (contained in a "tgp"-class object) on a new set of predictive locations XX. However, it can also be used simply to extend the search for an MAP model when MAP=FALSE, pred.n=FALSE, and XX=NULL.
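
A minimal sketch of the last use case, continuing the search for a MAP model without any prediction. It refits the Motorcycle data used in the Examples section; the seed, the object names `fit` and `fit2`, and the `BTE` values are arbitrary choices, and the `requireNamespace` guard is only there so that the snippet is skipped where tgp or MASS is unavailable.

```r
if (requireNamespace("tgp", quietly = TRUE) &&
    requireNamespace("MASS", quietly = TRUE)) {
  library(tgp)
  set.seed(1)  # arbitrary seed, for repeatability only
  mcycle <- MASS::mcycle

  # initial fit without predictive sampling
  fit <- btgpllm(X = mcycle[, 1], Z = mcycle[, 2], bprior = "b0",
                 pred.n = FALSE, verb = 0)

  # continue sampling from the MAP model, with no predictive locations
  fit2 <- predict(fit, XX = NULL, pred.n = FALSE, MAP = FALSE,
                  BTE = c(0, 2000, 2))
  class(fit2)
}
```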

Value

The output is the same as, or a subset of, the output produced by the b* functions; see btgp for an example.

Note

It is important to note that this function is not a replacement for supplying XX to the b* functions, which is the only way to get fully Bayesian samples from the posterior prediction at new inputs. It is only intended as a post-analysis (diagnostic) tool.

Inputs XX containing NaN, NA, or Inf are discarded with non-fatal warnings. Upon execution, MCMC reports are made every 1,000 rounds to indicate progress.
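
Rather than relying on the warnings, non-finite rows can be dropped before calling predict. A base-R sketch, where `drop_nonfinite` is a hypothetical helper and not part of tgp:

```r
# Hypothetical helper (not part of tgp): keep only the rows of XX whose
# entries are all finite, i.e. free of NA, NaN, Inf and -Inf.
drop_nonfinite <- function(XX) {
  XX <- as.matrix(XX)
  keep <- apply(XX, 1, function(row) all(is.finite(row)))
  XX[keep, , drop = FALSE]
}

XX <- cbind(c(1, 2, NA, 4), c(0.5, Inf, 0.1, 0.2))
drop_nonfinite(XX)   # keeps rows 1 and 4
```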

If any XX locations fall outside the range of the X inputs provided to the original b* function, then predictions at those locations will not be extrapolated properly, due to the way that bounding rectangles are defined in the original run. For a workaround, supply out$Xsplit <- rbind(X, XX) before running predict on out.
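
To see whether the workaround is needed, the predictive locations can be compared against the bounding rectangle of the original inputs. A base-R sketch, where `outside_range` is a hypothetical helper and not part of tgp:

```r
# Hypothetical helper (not part of tgp): flag rows of XX that fall outside
# the per-column [min, max] range of the original inputs X.
outside_range <- function(X, XX) {
  X <- as.matrix(X)
  XX <- as.matrix(XX)
  stopifnot(ncol(X) == ncol(XX))
  lo <- apply(X, 2, min)
  hi <- apply(X, 2, max)
  below <- sweep(XX, 2, lo, "<")   # TRUE where an entry is under the column min
  above <- sweep(XX, 2, hi, ">")   # TRUE where an entry is over the column max
  apply(below | above, 1, any)
}

X  <- matrix(c(0, 1, 2, 0, 5, 10), ncol = 2)
XX <- matrix(c(1, 3, 5, 11), ncol = 2)
outside_range(X, XX)   # FALSE TRUE
```

If any entry is `TRUE`, the workaround above applies: set out$Xsplit <- rbind(X, XX) on the fitted object before calling predict.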

See note for btgp or another b* function regarding the handling and appropriate specification of traces.

The "tgp" class output produced by predict.tgp can also be used as input to predict.tgp, as well as others (e.g., plot.tgp).

Author(s)

Robert B. Gramacy, rbgramacy@chicagobooth.edu, and Matt Taddy, taddy@chicagobooth.edu

References

http://bobby.gramacy.com/r_packages/tgp

Examples

## revisit the Motorcycle data (mcycle, from MASS)
library(tgp)
library(MASS)

## fit a btgpllm without predictive sampling (for speed)
out <- btgpllm(X=mcycle[,1], Z=mcycle[,2], bprior="b0", 
	       pred.n=FALSE)
## nothing to plot here because there is no predictive data

## save the "tgp" class output object for use later and
save(out, file="out.Rsave")

## then remove it (for illustrative purposes)
out <- NULL

## (now imagine emailing the out.Rsave file to a friend who
## then performs the following in order to use your fitted
## tgp model on his/her own predictive locations)

## load in the "tgp" class object we just saved
load("out.Rsave")

## new predictive locations
XX <- seq(2.4, 56.7, length.out=200)

## now obtain kriging estimates from the MAP model
out.kp <- predict(out, XX=XX, pred.n=FALSE)
plot(out.kp, center="km", as="ks2")

## actually obtain predictive samples from the MAP
out.p <- predict(out, XX=XX, pred.n=FALSE, BTE=c(0,1000,1))
plot(out.p)

## use the MAP as a jumping-off point for more sampling
out2 <- predict(out, XX, pred.n=FALSE, BTE=c(0,2000,2),
                MAP=FALSE, verb=1)
plot(out2)

## (generally you would not want to remove the file)
unlink("out.Rsave")