Generate X and Y values from the 10-dim “first”
Friedman data set used to validate the Multivariate Adaptive
Regression Splines (MARS) model, and a variation involving
boolean indicators. This test function has
three non-linear and interacting variables,
two linear variables, and five irrelevant ones.
The version with indicators has parts of the response
turned on based on the setting of the indicators.
Usage
friedman.1.data(n = 100)
fried.bool(n = 100)
Arguments
n
Number of samples desired
Details
In the original formulation, as implemented by friedman.1.data,
the function has 10-dim inputs X drawn from Unif(0,1), and responses
are N(m(X),1), where
m(X) = E[f(X)] and
m(X) = 10 sin(pi X[1] X[2]) + 20 (X[3] - 0.5)^2 + 10 X[4] + 5 X[5]
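The sampling scheme for the original formulation can be sketched directly in R. This is only an illustration under stated assumptions, not the package's implementation: the name friedman1_sketch is hypothetical, and the mean is taken to be the standard form of Friedman's first test function, 10 sin(pi x1 x2) + 20 (x3 - 0.5)^2 + 10 x4 + 5 x5, in which columns 6 to 10 play no role.

```r
# Hypothetical helper (not part of tgp): draws n samples following the
# description of the original Friedman formulation.
friedman1_sketch <- function(n = 100) {
  # 10-dim inputs, each column drawn from Unif(0,1)
  X <- matrix(runif(n * 10), nrow = n, ncol = 10)
  # noise-free mean: only the first five columns contribute
  Ytrue <- 10 * sin(pi * X[, 1] * X[, 2]) +
    20 * (X[, 3] - 0.5)^2 +
    10 * X[, 4] + 5 * X[, 5]
  # observed responses add N(0,1) noise to the mean
  Y <- Ytrue + rnorm(n)
  # naming the matrix argument X yields columns X.1, ..., X.10
  data.frame(X = X, Y = Y, Ytrue = Ytrue)
}

set.seed(1)
d <- friedman1_sketch(5)
names(d)
```

Comparing d$Y against d$Ytrue gives a quick check that the noise has roughly unit variance once n is large.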
The variation fried.bool uses indicators
I in 1:4. The function also has 10-dim
inputs X with columns distributed as Unif(0,1) and responses
are N(m(X,I), 1) where
m(X,I) = E[f(X,I)] and
The indicator I is coded in binary in the output data frame as:
c(0,0,0) for I=1,
c(0,0,1) for I=2,
c(0,1,0) for I=3, and
c(1,0,0) for I=4.
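The binary coding listed above can be written as a small lookup table. The following is a minimal sketch; the helper name indicator_to_binary is hypothetical and is not part of tgp, and the column names I.1, ..., I.3 are assumed to match those in the output data frame.

```r
# Hypothetical helper (not part of tgp): maps indicators I in 1:4 to the
# three binary columns given in the coding above.
indicator_to_binary <- function(I) {
  stopifnot(all(I %in% 1:4))
  codes <- rbind(c(0, 0, 0),  # I = 1
                 c(0, 0, 1),  # I = 2
                 c(0, 1, 0),  # I = 3
                 c(1, 0, 0))  # I = 4
  # row I of the lookup table is the coding for indicator I
  out <- codes[I, , drop = FALSE]
  colnames(out) <- paste0("I.", 1:3)
  as.data.frame(out)
}

indicator_to_binary(c(1, 2, 3, 4))
```

Note that I=1 is the all-zero baseline, so each of the three columns switches on exactly one of the remaining settings.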
Value
Output is a data.frame with columns
X.1, ..., X.10
describing the 10-d randomly sampled inputs
I.1, ..., I.3
boolean version of the indicators provided only
for fried.bool, as described above
Y
sample responses (with N(0,1) noise)
Ytrue
true responses (without noise)
Note
An example using the original version of the data
(friedman.1.data) is contained in the first package vignette:
vignette("tgp"). The boolean version fried.bool
is used in the second vignette: vignette("tgp2").
References
Gramacy, R. B. (2007). tgp: An R Package for
Bayesian Nonstationary, Semiparametric Nonlinear Regression
and Design by Treed Gaussian Process Models.
Journal of Statistical Software, 19(9).
http://www.jstatsoft.org/v19/i09
Gramacy, R. B., Taddy, M. (2010). Categorical Inputs,
Sensitivity Analysis, Optimization and Importance Tempering with tgp
Version 2, an R Package for Treed Gaussian Process Models.
Journal of Statistical Software, 33(6), 1–48.
http://www.jstatsoft.org/v33/i06/
Friedman, J. H. (1991).
Multivariate adaptive regression splines.
Annals of Statistics, 19(1), 1–67.
Gramacy, R. B., Lee, H. K. H. (2007).
Bayesian treed Gaussian process models with an application to computer modeling.
Journal of the American Statistical Association, to appear.
Also available as ArXiv article 0710.4536
http://arxiv.org/abs/0710.4536
Chipman, H., George, E., & McCulloch, R. (2002).
Bayesian treed models.
Machine Learning, 48, 303–324.