R: Sequential Treed D-Optimal Design for Treed Gaussian Process...
tgp.design
R Documentation
Sequential Treed D-Optimal Design for Treed Gaussian Process Models
Description
Based on the maximum a posteriori (MAP)
treed partition extracted from a "tgp"-class object,
calculate independent sequential treed D-Optimal designs in each of the regions.
Usage
tgp.design(howmany, Xcand, out, iter = 5000, verb = 0)
Arguments
howmany
Number of new points in the design. Must
be no greater than the number of candidates contained in
Xcand, i.e., howmany <= nrow(Xcand)
Xcand
data.frame, matrix or vector of candidates
from which new design points are subsampled. Must have
ncol(Xcand) == ncol(out$X), i.e., the same number of input columns
out
"tgp"-class object output from one of the
model functions which has tree support, e.g., btgpllm,
btgp, btlm
iter
number of iterations of the stochastic ascent algorithm,
default 5000
verb
positive integer indicating after how many rounds of
stochastic approximation in dopt.gp
to print each progress statement;
default verb=0 results in no printing
Details
This function partitions Xcand and out$X based on
the MAP tree (obtained on "tgp"-class out with
partition) and calls
dopt.gp in order to obtain a D-optimal design under
independent stationary Gaussian process models defined in each
region. The aim is to obtain a design where new points from Xcand
are spaced out relative to themselves, and relative to
the existing locations (out$X) in the region.
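The idea of a D-optimal design found by stochastic ascent can be illustrated in base R. The sketch below is not the dopt.gp implementation: the helper names corr_logdet and stochastic_ascent, the isotropic Gaussian correlation, and the fixed range d and nugget g are all assumptions made for illustration. It swaps one selected candidate at a time and keeps the swap only when the log determinant of the correlation matrix over existing plus new locations increases.

```r
# Illustrative sketch only: NOT the dopt.gp implementation.
# Assumes an isotropic Gaussian correlation with fixed range d and nugget g.
corr_logdet <- function(X, d = 0.5, g = 1e-6) {
  D2 <- as.matrix(dist(X))^2              # squared Euclidean distances
  K <- exp(-D2 / d) + diag(g, nrow(X))    # correlation matrix plus nugget
  as.numeric(determinant(K, logarithm = TRUE)$modulus)
}

stochastic_ascent <- function(nn, X, Xcand, iter = 500) {
  stopifnot(nn < nrow(Xcand))             # need at least one unused candidate
  idx <- sample.int(nrow(Xcand), nn)      # random initial design
  best <- corr_logdet(rbind(X, Xcand[idx, , drop = FALSE]))
  for (i in seq_len(iter)) {
    prop <- idx
    pool <- setdiff(seq_len(nrow(Xcand)), idx)             # unused candidates
    prop[sample.int(nn, 1)] <- pool[sample.int(length(pool), 1)]
    val <- corr_logdet(rbind(X, Xcand[prop, , drop = FALSE]))
    if (val > best) {                     # keep only improving swaps
      idx <- prop
      best <- val
    }
  }
  list(idx = idx, logdet = best)
}

set.seed(1)
X <- matrix(runif(10), ncol = 2)          # 5 existing locations (made up)
Xcand <- matrix(runif(60), ncol = 2)      # 30 candidates (made up)
fit <- stochastic_ascent(3, X, Xcand, iter = 200)
Xcand[fit$idx, ]                          # the selected new locations
```

Because the criterion includes the existing locations X, the selected candidates tend to be spread out both from each other and from the points already in the design.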
The number of new points from each region of the partition is
proportional to the number of candidates Xcand in the region.
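As a rough illustration of the proportional allocation, the base R sketch below splits howmany across regions according to the candidate counts. The helper name allocate_points and the largest-remainder rounding rule are assumptions; the rounding actually used by tgp.design may differ.

```r
# Hypothetical helper: split `howmany` across regions in proportion to the
# number of candidates per region. Largest-remainder rounding is an assumption.
allocate_points <- function(howmany, ncand) {
  stopifnot(howmany <= sum(ncand))
  share <- howmany * ncand / sum(ncand)   # exact proportional shares
  nn <- floor(share)
  left <- howmany - sum(nn)               # points lost to rounding down
  if (left > 0) {
    # give the leftover points to the regions with the largest remainders
    top <- order(share - nn, decreasing = TRUE)[seq_len(left)]
    nn[top] <- nn[top] + 1
  }
  nn
}

# three regions holding 120, 60 and 20 candidates
alloc <- allocate_points(10, c(120, 60, 20))
alloc   # 6 3 1
```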
Value
Output is a list of data.frames containing XX design
points for each region of the MAP tree in out
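If the result is a list of per-region data.frames as described here, it can be stacked into a single data.frame with base R. The sketch below uses a made-up two-region list in place of real tgp.design output; the column names x1 and x2 are assumptions.

```r
# Stand-in for a list of per-region designs (made-up values, two regions)
XX_list <- list(
  data.frame(x1 = c(0.1, 0.4), x2 = c(0.2, 0.9)),
  data.frame(x1 = 0.7, x2 = 0.3)
)

# stack the regions into one data.frame of design points
XX_all <- do.call(rbind, XX_list)

# keep track of which region each design point came from
region <- rep(seq_along(XX_list), vapply(XX_list, nrow, integer(1)))
```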
Note
Candidates in Xcand containing NaN, NA, or Inf are discarded with non-fatal
warnings.
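The screening of candidates can be approximated in base R as follows. The helper drop_nonfinite is hypothetical and is not tgp's own code; it assumes that a whole candidate row is dropped when any of its entries is not finite.

```r
# Hypothetical helper approximating the documented behaviour; not tgp's code.
# Assumes a candidate (row) is dropped if any of its entries is NA, NaN or Inf.
drop_nonfinite <- function(Xcand) {
  Xcand <- as.matrix(Xcand)
  ok <- apply(Xcand, 1, function(r) all(is.finite(r)))
  if (any(!ok)) {
    warning(sum(!ok), " candidate row(s) with NA/NaN/Inf removed")
  }
  Xcand[ok, , drop = FALSE]
}

Xc <- rbind(c(0.1, 0.2), c(NA, 0.5), c(0.3, Inf), c(0.8, 0.9))
clean <- suppressWarnings(drop_nonfinite(Xc))   # keeps rows 1 and 4
```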
D-Optimal computation in each region is preceded by a print statement
indicating the number of new locations to be chosen and the number of candidates
in the region. Beyond that, there are no indicators of progress.
You will have to be patient.
Creating treed sequential D-optimal designs is no speedy task. At least it
is faster than the non-treed version (see dopt.gp).
The example below is also part of vignette("tgp").
Please see vignette("tgp2") for a similar example based on
optimization using the optim.step.tgp function.
References
Gramacy, R. B. (2007). tgp: An R Package for Bayesian
Nonstationary, Semiparametric Nonlinear Regression and Design by
Treed Gaussian Process Models. Journal of Statistical Software, 19(9).
http://www.jstatsoft.org/v19/i09
Gramacy, R. B., Taddy, M. (2010). Categorical Inputs,
Sensitivity Analysis, Optimization and Importance Tempering with tgp
Version 2, an R Package for Treed Gaussian Process Models.
Journal of Statistical Software, 33(6), 1–48.
http://www.jstatsoft.org/v33/i06/.
Gramacy, R. B., Lee, H. K. H. (2006).
Adaptive design and analysis of supercomputer experiments.
Technometrics, to appear. Also available as arXiv article 0805.4359
http://arxiv.org/abs/0805.4359
Gramacy, R. B., Lee, H. K. H., & Macready, W. (2004).
Parameter space exploration with Gaussian process trees.
ICML (pp. 353–360). Omnipress & ACM Digital Library.
Examples
#
# 2-d Exponential data
# (This example is based on random data.
# It might be fun to run it a few times)
#
# load the package and get the data
library(tgp)
exp2d.data <- exp2d.rand()
X <- exp2d.data$X; Z <- exp2d.data$Z
Xcand <- exp2d.data$XX
# fit treed GP LLM model to data w/o prediction
# basically just to get MAP tree (and plot it)
out <- btgpllm(X=X, Z=Z, pred.n=FALSE, corr="exp")
tgp.trees(out)
# find a treed sequential D-Optimal design
# with 10 more points. It is interesting to
# contrast this design with one obtained via
# the dopt.gp function
XX <- tgp.design(10, Xcand, out)
# now fit the model again in order to assess
# the predictive surface at those new design points
dout <- btgpllm(X=X, Z=Z, XX=XX, corr="exp")
plot(dout)