Last data update: 2014.03.03

R: Bootstrap inference for prespecified models
bootstrap_inference    R Documentation

Bootstrap inference for prespecified models

Description

Runs B bootstrap samples using a prespecified model, then computes the two I estimates based on cross-validation. P-values for the two I estimates are computed for a given H_0: mu_I_0 = mu_0, and confidence intervals are provided.

Usage

bootstrap_inference(X, y, 
		model_string,
		predict_string = "predict(mod, obs_left_out)",
		cleanup_mod_function = NA,
		y_higher_is_better = TRUE,
		verbose = TRUE,
		full_verbose = FALSE,
		H_0_mu_equals = 0,
		pct_leave_out = 0.10,
		B = 3000,
		alpha = 0.05,
		plot = TRUE,
		num_cores = 1,
		...)

Arguments

X

An n x p data frame of covariates.

y

An n-length numeric vector of responses.

model_string

A string of R code that will be evaluated to construct the model on the training data, i.e. the data not left out in the current cross-validation iteration. Make sure the training data is referred to as Xyleft.
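As a rough illustration of what "evaluated" means here, the sketch below builds a small simulated data frame named Xyleft and evaluates a model string against it with eval(parse(text = ...)). Only the name Xyleft and the model string come from this page; the simulated data and the assumption that the string is evaluated this way inside the package are for illustration only.

```r
# Simulated training data; the name Xyleft follows the convention above.
set.seed(1)
n <- 40
Xyleft <- data.frame(treatment = rep(c(0, 1), n / 2), x = rnorm(n))
Xyleft$y <- 1 - Xyleft$x + Xyleft$treatment * 2 * Xyleft$x + rnorm(n)

# The model is supplied as a string of R code that refers to Xyleft.
model_string <- "lm(y ~ . + treatment * ., data = Xyleft)"

# Assumption: the string is parsed and evaluated where Xyleft is visible.
mod <- eval(parse(text = model_string))
print(names(coef(mod)))
```

Because the string is ordinary R code, any model constructor can be used as long as it reads its data from Xyleft.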

predict_string

A string of R code that will be evaluated on the left-out data after the model is built with the training data. Make sure the left-out data to be forecast is referred to as obs_left_out and the model is referred to as mod.
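The following sketch shows the default predict string being evaluated against objects named mod and obs_left_out. The data, the 36/4 split, and the use of eval(parse(text = ...)) are assumptions made for illustration; only the two object names and the default string are taken from this page.

```r
set.seed(2)
n <- 40
Xy <- data.frame(treatment = rep(c(0, 1), n / 2), x = rnorm(n))
Xy$y <- 1 - Xy$x + Xy$treatment * 2 * Xy$x + rnorm(n)

# Hold out the last 4 rows; object names follow the documented convention.
Xyleft <- Xy[1:36, ]
obs_left_out <- Xy[37:40, ]
mod <- lm(y ~ treatment * x, data = Xyleft)

# The default predict string refers to mod and obs_left_out by name.
predict_string <- "predict(mod, obs_left_out)"
y_hat <- eval(parse(text = predict_string))
print(y_hat)
```

A custom predict string is mainly useful when the model's predict method needs extra arguments, for example a type argument.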

cleanup_mod_function

A function that is called at the end of each cross-validation iteration to clean up the model in some way. Default is NA.

y_higher_is_better

TRUE if a response value being higher is clinically "better" than one that is lower (e.g. cognitive ability in a drug trial for the mentally ill). FALSE if the response value being lower is clinically "better" than one that is higher (e.g. amount of weight lost in a weight-loss trial). Default is TRUE.
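To make the role of this flag concrete, here is a minimal sketch of an allocation rule that compares a subject's predicted response under treatment and under control. The helper choose_treatment is hypothetical and not part of the package; how the package allocates internally, including how it resolves ties, is an assumption.

```r
# Hypothetical helper: pick the arm with the "better" predicted response.
# Returns 1 for treatment and 0 for control; ties go to control here.
choose_treatment <- function(pred_treated, pred_control,
                             y_higher_is_better = TRUE) {
  if (y_higher_is_better) {
    as.numeric(pred_treated > pred_control)
  } else {
    as.numeric(pred_treated < pred_control)
  }
}

pred_treated <- c(3.0, 1.0, 2.5)
pred_control <- c(2.0, 1.5, 2.5)

alloc_high <- choose_treatment(pred_treated, pred_control, TRUE)
alloc_low <- choose_treatment(pred_treated, pred_control, FALSE)
print(alloc_high)  # 1 0 0
print(alloc_low)   # 0 1 0
```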

verbose

If TRUE, prints out a dot for each bootstrap sample. This only works on some platforms. Default is TRUE.

full_verbose

If TRUE, prints out full information for each cross-validation model for each bootstrap sample. This only works on some platforms. Default is FALSE.

H_0_mu_equals

The mu_I_0 value in H_0. Default is 0, which answers the question: does my allocation procedure do better than a naive allocation procedure?

pct_leave_out

In the cross-validation, the proportion of the original dataset left out to estimate out-of-sample metrics. The default is 0.1, which corresponds to 10-fold cross-validation.
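The arithmetic linking the left-out proportion to the number of folds can be sketched as follows. The contiguous fold assignment shown here is an assumption for illustration; the package may partition the data differently.

```r
n <- 50
pct_leave_out <- 0.10

# Number of observations held out per fold, and the implied fold count.
n_left_out <- round(n * pct_leave_out)
n_folds <- ceiling(n / n_left_out)

# Assign each observation to a fold (contiguous blocks, for illustration).
fold_id <- rep(seq_len(n_folds), each = n_left_out, length.out = n)

for (k in seq_len(n_folds)) {
  left_out_idx <- which(fold_id == k)
  train_idx <- which(fold_id != k)
  # A model would be fit on train_idx and evaluated on left_out_idx here.
}
print(n_folds)
print(table(fold_id))
```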

B

The number of bootstrap samples to take. We recommend making this as high as you can tolerate given speed considerations. The default is 3000.
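A single bootstrap sample amounts to drawing n rows with replacement from the data. The loop below is a minimal sketch of that resampling step with a placeholder statistic (the mean response); in the actual procedure the statistic would be the cross-validated estimates described above, and the package's exact resampling scheme is an assumption here.

```r
set.seed(3)
n <- 50
B <- 20  # kept small for speed only
X <- data.frame(treatment = rep(c(0, 1), n / 2), x = rnorm(n))
y <- rnorm(n)

boot_stats <- numeric(B)
for (b in seq_len(B)) {
  # Resample row indices with replacement, keeping X and y aligned.
  idx <- sample(n, n, replace = TRUE)
  X_b <- X[idx, , drop = FALSE]
  y_b <- y[idx]
  boot_stats[b] <- mean(y_b)  # placeholder statistic
}
print(summary(boot_stats))
```

Run time grows roughly linearly in B, since each bootstrap sample repeats the full cross-validation.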

alpha

Defines the confidence level of the intervals (1 - alpha). Defaults to 0.05, i.e. 95% intervals.
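One common way to turn bootstrap replicates into a (1 - alpha) interval is the percentile method, sketched below. The vector boot_estimates is simulated stand-in data, and the percentile method is named here only as a familiar construction; this page does not say which interval types the package reports.

```r
set.seed(4)
# Stand-in for B = 3000 bootstrap replicates of an estimate.
boot_estimates <- rnorm(3000, mean = 0.2, sd = 0.1)
alpha <- 0.05

# Percentile interval: cut alpha / 2 of the replicates off each tail.
ci <- quantile(boot_estimates, probs = c(alpha / 2, 1 - alpha / 2))
print(ci)
```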

plot

Illustrates the estimate, the bootstrap samples and the confidence intervals on a histogram plot. Defaults to TRUE.

num_cores

The number of cores to use in parallel to run the bootstrap samples more rapidly. Defaults to serial by using 1 core.

...

Additional parameters to be sent to the model constructor. Note that if you wish to pass these parameters, "..." must be specified in model_string.
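The sketch below illustrates why "..." has to appear inside the model string. The wrapper fit_from_string is hypothetical and only mimics a function that evaluates a model string in its own frame; it is not the package's code. The extra argument (weights) reaches lm only when the string itself contains "...".

```r
# Hypothetical wrapper: evaluates the model string inside a function that
# receives extra arguments through its own "...".
fit_from_string <- function(model_string, Xyleft, ...) {
  eval(parse(text = model_string))
}

set.seed(5)
Xyleft <- data.frame(x = rnorm(30))
Xyleft$y <- 2 * Xyleft$x + rnorm(30)
w <- runif(30)

# Without "..." in the string, the extra argument is silently unused.
mod_plain <- fit_from_string("lm(y ~ ., data = Xyleft)",
                             Xyleft, weights = w)

# With "..." in the string, weights is forwarded to lm.
mod_dots <- fit_from_string("lm(y ~ ., data = Xyleft, ...)",
                            Xyleft, weights = w)

print(is.null(mod_plain$weights))
print(is.null(mod_dots$weights))
```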

Value

Returns a list object containing results of the procedure.

Author(s)

Adam Kapelner and Justin Bleich

References

Kapelner, A., Bleich, J., Cohen, Z. D., DeRubeis, R. J. and Berk, R. (2014). Inference for Treatment Regime Models in Personalized Medicine. arXiv.

Examples

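	library(PTE)

	# Parameters of the simulated response model
	beta0 = 1
	beta1 = -1
	gamma0 = 0
	gamma1 = sqrt(2 * pi)
	mu_x = 0
	sigsq_x = 1
	sigsq_e = 1
	num_boot = 20 #for speed only
	n = 50 #for speed only
	
	# rnorm() takes a standard deviation, so convert the variances
	x = sort(rnorm(n, mu_x, sqrt(sigsq_x)))
	noise = rnorm(n, 0, sqrt(sigsq_e))
	
	# Balanced random assignment to treatment (1) and control (0)
	treatment = sample(c(rep(1, n / 2), rep(0, n / 2)))
	y = beta0 + beta1 * x + treatment * (gamma0 + gamma1 * x) + noise
	
	X = data.frame(treatment, x)
	
	# Linear model with treatment-by-covariate interactions
	res = bootstrap_inference(X, y,
			"lm(y ~ . + treatment * ., data = Xyleft)",
			num_cores = 1,
			B = num_boot)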

Results


> library(PTE)
Loading required package: doParallel
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
Welcome to PTE v1.0 by Adam Kapelner and Justin Bleich

> ### Name: bootstrap_inference
> ### Title: Bootstrap inference for prespecified models
> ### Aliases: bootstrap_inference
> 
> ### ** Examples
> 
> 	beta0 = 1
> 	beta1 = -1
> 	gamma0 = 0
> 	gamma1 = sqrt(2 * pi)
> 	mu_x = 0
> 	sigsq_x = 1
> 	sigsq_e = 1
> 	num_boot = 20 #for speed only
> 	n = 50 #for speed only
> 	
> 	x = sort(rnorm(n, mu_x, sigsq_x))
> 	noise = rnorm(n, 0, sigsq_e)
> 	
> 	treatment = sample(c(rep(1, n / 2), rep(0, n / 2)))
> 	y = beta0 + beta1 * x + treatment * (gamma0 + gamma1 * x) + noise
> 	
> 	X = data.frame(treatment, x)
> 	
> 	res = bootstrap_inference(X, y,
+ 			"lm(y ~ . + treatment * ., data = Xyleft)",
+ 			num_cores = 1,
+ 			B = num_boot)
