Runs B bootstrap samples using a prespecified model, then computes the two I estimates based on cross validation.
P-values for the two I estimates are computed for a given null hypothesis H_0: mu_I_0 = mu_0, and
confidence intervals are provided.
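As a rough illustration of the resampling step, the sketch below draws B bootstrap samples of the rows and recomputes a statistic on each one. The function stat_fn is a hypothetical placeholder (a raw difference in arm means) and is not one of PTE's I estimates, which the package obtains through cross validation. The simulated data are likewise assumptions made for the example.

```r
set.seed(1)
n <- 50
# Simulated data, for illustration only
Xy <- data.frame(treatment = sample(rep(c(0, 1), n / 2)), x = rnorm(n))
Xy$y <- 1 - Xy$x + Xy$treatment * 2 * Xy$x + rnorm(n)

# Hypothetical placeholder statistic; PTE's I estimates are computed differently
stat_fn <- function(d) mean(d$y[d$treatment == 1]) - mean(d$y[d$treatment == 0])

B <- 200
boot_stats <- vapply(seq_len(B), function(b) {
  idx <- sample(nrow(Xy), replace = TRUE)  # resample rows with replacement
  stat_fn(Xy[idx, ])
}, numeric(1))

summary(boot_stats)
```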
model_string
A string of R code that will be evaluated to construct the leave-one-out model. Make sure the covariate data is
referred to as Xyleft.
predict_string
A string of R code that will be evaluated on the left-out data after the model is built with the training data. Make sure
the forecast data (the left-out data) is referred to as obs_left_out and the model is referred to as mod.
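The naming requirements above can be sketched for a single cross validation iteration as follows. This is a minimal stand-in written with base R, not PTE's internal code. The simulated data, the index left_out_idx and the particular predict string are assumptions made for the example.

```r
set.seed(1)
n <- 20
# Simulated data, for illustration only
Xy <- data.frame(treatment = rep(c(0, 1), n / 2), x = rnorm(n))
Xy$y <- 1 - Xy$x + Xy$treatment * 2 * Xy$x + rnorm(n)

left_out_idx <- 1:2                  # hypothetical held-out rows
Xyleft <- Xy[-left_out_idx, ]        # training rows: the name the model string must use
obs_left_out <- Xy[left_out_idx, ]   # held-out rows: the name the predict string must use

model_string <- "lm(y ~ . + treatment * ., data = Xyleft)"
predict_string <- "predict(mod, obs_left_out)"   # illustrative predict string

# Both strings are parsed and evaluated where Xyleft, obs_left_out and mod are visible
mod <- eval(parse(text = model_string))
y_hat <- eval(parse(text = predict_string))
y_hat
```

Because the strings are evaluated as code, a string that refers to any other variable name fails unless that name happens to exist in the evaluating environment.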
cleanup_mod_function
A function that is called at the end of each cross validation iteration to clean up the model
in some way.
y_higher_is_better
TRUE if a higher response value is clinically "better" than a lower one (e.g. cognitive ability in a drug trial for the
mentally ill). FALSE if a lower response value is clinically "better" than a higher one (e.g. body weight at the end of
a weight-loss trial). Default is TRUE.
verbose
Prints out a dot for each bootstrap sample. This only works on some platforms.
full_verbose
Prints out full information for each cross validation model for each bootstrap sample. This only works on some platforms.
H_0_mu_equals
The mu_I_0 value in H_0. Default is 0, which answers the question: does my allocation procedure do better than a naive
allocation procedure?
pct_leave_out
In the cross-validation, the proportion of the original dataset left out to estimate out-of-sample metrics. The default is 0.1
which corresponds to 10-fold cross validation.
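The relationship between the left-out proportion and the number of folds can be sketched as below. The fold assignment shown is a generic one and is an assumption made for the example. PTE may partition the data differently.

```r
pct_leave_out <- 0.1
n <- 50

# A left-out proportion of 0.1 implies 1 / 0.1 = 10 folds
num_folds <- round(1 / pct_leave_out)

set.seed(1)
# Assign each observation to one fold, with fold sizes as equal as possible
fold_id <- sample(rep(seq_len(num_folds), length.out = n))
table(fold_id)
```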
B
The number of bootstrap samples to take. We recommend making this as high as you can tolerate given speed considerations.
The default is 3000.
alpha
Defines the confidence interval size (1 - alpha). Defaults to 0.05.
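To show how B, alpha and H_0_mu_equals could interact, here is a sketch of a percentile confidence interval and a two-sided percentile p-value computed from a vector of bootstrap estimates. The vector boot_estimates is simulated purely as a stand-in, and the percentile method is an assumption made for the example. The package may construct its intervals and p-values differently.

```r
set.seed(1)
B <- 3000
alpha <- 0.05
H_0_mu_equals <- 0

# Simulated stand-in for B bootstrap estimates; not output from PTE
boot_estimates <- rnorm(B, mean = 0.3, sd = 0.15)

# Percentile interval with nominal coverage 1 - alpha
ci <- quantile(boot_estimates, c(alpha / 2, 1 - alpha / 2))

# Two-sided percentile p-value for H_0, capped at 1
p_val <- min(1, 2 * min(mean(boot_estimates <= H_0_mu_equals),
                        mean(boot_estimates >= H_0_mu_equals)))
ci
p_val
```

A larger B reduces the Monte Carlo error in both the interval endpoints and the p-value, which is why the speed trade-off matters.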
plot
Illustrates the estimate, the bootstrap samples and the confidence intervals on a histogram plot. Defaults to TRUE.
num_cores
The number of cores to use in parallel to run the bootstrap samples more rapidly. Defaults to serial by using 1 core.
...
Additional parameters to be sent to the model constructor. Note that if you wish to pass these parameters,
"..." must be specified in model_string.
Value
Returns a list object containing results of the procedure.
Author(s)
Adam Kapelner and Justin Bleich
References
Kapelner, A., Bleich, J., Cohen, Z. D., DeRubeis, R. J. and Berk, R. (2014). Inference for Treatment Regime Models in Personalized Medicine. arXiv.
Examples
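# True parameters of the simulated response model
beta0 = 1
beta1 = -1
gamma0 = 0
gamma1 = sqrt(2 * pi)
# Covariate distribution and noise variance
mu_x = 0
sigsq_x = 1
sigsq_e = 1
num_boot = 20 #for speed only
n = 50 #for speed only
# rnorm() takes a standard deviation, so pass the square root of each variance
x = sort(rnorm(n, mu_x, sqrt(sigsq_x)))
noise = rnorm(n, 0, sqrt(sigsq_e))
# Balanced random treatment assignment
treatment = sample(c(rep(1, n / 2), rep(0, n / 2)))
y = beta0 + beta1 * x + treatment * (gamma0 + gamma1 * x) + noise
X = data.frame(treatment, x)
# Model with main effects and treatment interactions, fit on each training split
res = bootstrap_inference(X, y,
    "lm(y ~ . + treatment * ., data = Xyleft)",
    num_cores = 1,
    B = num_boot)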
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(PTE)
Loading required package: doParallel
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
Welcome to PTE v1.0 by Adam Kapelner and Justin Bleich
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/PTE/bootstrap_inference.Rd_%03d_medium.png", width=480, height=480)
> ### Name: bootstrap_inference
> ### Title: Bootstrap inference for prespecified models
> ### Aliases: bootstrap_inference
>
> ### ** Examples
>
> beta0 = 1
> beta1 = -1
> gamma0 = 0
> gamma1 = sqrt(2 * pi)
> mu_x = 0
> sigsq_x = 1
> sigsq_e = 1
> num_boot = 20 #for speed only
> n = 50 #for speed only
>
> x = sort(rnorm(n, mu_x, sigsq_x))
> noise = rnorm(n, 0, sigsq_e)
>
> treatment = sample(c(rep(1, n / 2), rep(0, n / 2)))
> y = beta0 + beta1 * x + treatment * (gamma0 + gamma1 * x) + noise
>
> X = data.frame(treatment, x)
>
> res = bootstrap_inference(X, y,
+ "lm(y ~ . + treatment * ., data = Xyleft)",
+ num_cores = 1,
+ B = num_boot)
>
> dev.off()
null device
1
>