Last data update: 2014.03.03

R: Wild Cluster Bootstrapped p-Values For Linear Family GLM
cluster.wild.glm    R Documentation

Wild Cluster Bootstrapped p-Values For Linear Family GLM

Description

This software estimates p-values using wild cluster bootstrapped t-statistics for linear family GLMs (Cameron, Gelbach, and Miller 2008). Residuals are repeatedly resampled by cluster to form a pseudo-dependent variable, a model is estimated for each resampled data set, and inference is based on the sampling distribution of the pivotal (t) statistic. Users may choose whether to impose the null hypothesis for independent variables; the null is never imposed for the intercept or for any model that includes factor variables. Confidence intervals are reported only when the null hypothesis is not imposed.
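The procedure described above can be sketched in base R. This is a minimal illustration and not the package's own code: the helper names cluster_vcov and wild_cluster_p are hypothetical, lm stands in for an identity-link Gaussian glm, Rademacher (+1/-1) weights are assumed, and only the variant that does not impose the null is shown.

```r
# Hypothetical helper: cluster-robust variance for OLS with the
# M / (M - 1) correction, where M is the number of clusters.
cluster_vcov <- function(X, e, cl) {
  M <- length(unique(cl))
  bread <- solve(crossprod(X))
  scores <- rowsum(X * e, group = cl)  # sum of x_i * e_i within each cluster
  (M / (M - 1)) * bread %*% crossprod(scores) %*% bread
}

# Hypothetical helper: wild cluster bootstrap-t p-values, null not imposed.
# Assumes `data` has no missing values, so rows of the model matrix align with `cl`.
wild_cluster_p <- function(formula, data, cl, reps = 200) {
  fit <- lm(formula, data = data)
  X <- model.matrix(fit)
  b <- coef(fit)
  e <- resid(fit)
  y_hat <- fitted(fit)
  t_obs <- b / sqrt(diag(cluster_vcov(X, e, cl)))

  ids <- unique(cl)
  t_star <- matrix(NA_real_, nrow = reps, ncol = length(b))
  for (r in seq_len(reps)) {
    # One Rademacher draw per cluster, shared by every row in that cluster
    w <- sample(c(-1, 1), length(ids), replace = TRUE)[match(cl, ids)]
    y_star <- y_hat + w * e                       # pseudo-dependent variable
    b_star <- drop(solve(crossprod(X), crossprod(X, y_star)))
    e_star <- drop(y_star - X %*% b_star)
    se_star <- sqrt(diag(cluster_vcov(X, e_star, cl)))
    t_star[r, ] <- (b_star - b) / se_star         # centered at the original estimate
  }

  # Two-sided p-value: share of bootstrap |t*| at least as large as |t|
  t_ref <- matrix(abs(t_obs), nrow = reps, ncol = length(b), byrow = TRUE)
  p <- colMeans(abs(t_star) >= t_ref)
  names(p) <- names(b)
  p
}

set.seed(123)
cw <- as.data.frame(ChickWeight)
p_vals <- wild_cluster_p(weight ~ Time, data = cw, cl = cw$Chick, reps = 200)
```

Centering each bootstrap t-statistic at the original estimate is what makes the statistic pivotal when the null is not imposed; when the null is imposed, the residuals would instead come from a restricted model.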

Usage

cluster.wild.glm(mod, dat, cluster, ci.level = 0.95, impose.null = TRUE,
  boot.reps = 1000, report = TRUE, prog.bar = TRUE)

Arguments

mod

A linear (identity link) model estimated using glm.

dat

The data set used to estimate mod.

cluster

A one-sided formula naming the clustering variable (for example, ~Chick).

ci.level

The confidence level for the reported CIs. (Note: CIs are only reported when impose.null == FALSE.)

impose.null

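Should the null hypothesis (H0) be imposed when generating the bootstrap samples? The null is never imposed for the intercept or for any model that includes factor variables.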

boot.reps

The number of bootstrap samples to draw.

report

Should a table of results be printed to the console?

prog.bar

Show a progress bar of the bootstrap (= TRUE) or not (= FALSE).

Value

A list with the elements

p.values

A matrix of the estimated p-values.

ci

A matrix of confidence intervals (if null not imposed).
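As an illustration of how an interval can be built from bootstrapped t-statistics, the sketch below forms a symmetric bootstrap-t interval for one coefficient. The function boot_t_ci is hypothetical, the interval construction is an assumption about the general technique and not a statement of what this function does internally, and the t-statistics are placeholder draws used only to make the example run.

```r
# Hypothetical helper: symmetric bootstrap-t interval for one coefficient.
# b = point estimate, se = its cluster-robust standard error,
# t_star = bootstrap t-statistics centered at the original estimate.
boot_t_ci <- function(b, se, t_star, level = 0.95) {
  # Critical value: the `level` quantile of |t*| replaces the normal or t quantile
  crit <- unname(quantile(abs(t_star), probs = level))
  c(lower = b - crit * se, upper = b + crit * se)
}

set.seed(42)
t_star <- rt(1000, df = 5)  # placeholder draws, for illustration only
ci <- boot_t_ci(b = 2, se = 0.5, t_star = t_star, level = 0.95)
```

Because the critical value comes from the bootstrap distribution, the interval widens automatically when few clusters make the t-statistic heavier-tailed than its asymptotic approximation.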

Note

The code to estimate GLM clustered standard errors is by Mahmood Arai: http://thetarzan.wordpress.com/2011/06/11/clustered-standard-errors-in-r/. The cluster SE degrees-of-freedom correction is M/(M-1), where M is the number of clusters.

Author(s)

Justin Esarey

References

Cameron, A. Colin, Jonah B. Gelbach, and Douglas L. Miller. 2008. "Bootstrap-Based Improvements for Inference with Clustered Errors." The Review of Economics and Statistics 90(3): 414-427. <DOI:10.1162/rest.90.3.414>.

Examples

## Not run: 

# example: predict chick weight using diet
# the null hypothesis is not imposed here because the model includes
# the factor variable "Diet"
data(ChickWeight)
weight.mod <- glm(formula = weight ~ Diet, data = ChickWeight)
cluster.wd.w.1 <- cluster.wild.glm(weight.mod, dat = ChickWeight,
  cluster = ~Chick, boot.reps = 1000)

# impose the null: replace the factor with numeric dummy variables
dum <- model.matrix(~ ChickWeight$Diet)
ChickWeight$Diet2 <- as.numeric(dum[, 2])
ChickWeight$Diet3 <- as.numeric(dum[, 3])
ChickWeight$Diet4 <- as.numeric(dum[, 4])

weight.mod2 <- glm(formula = weight ~ Diet2 + Diet3 + Diet4, data = ChickWeight)
cluster.wd.w.2 <- cluster.wild.glm(weight.mod2, dat = ChickWeight,
  cluster = ~Chick, boot.reps = 1000)


## End(Not run)

Results