(optional) a given true numeric value of A for Gaussian data or of r for Binomial and Poisson data. If not designated, the estimated value in the gbp.object object will be considered as a true value.
reg.coef
(optional) a given true (m by 1) vector for regression coefficients, β, where m is the number of regression coefficients including an intercept. If not designated, the estimated value in the gbp.object object will be considered as a true value.
mean.PriorDist
(optional) a given true numeric value for the mean of (second-level) prior distribution. If not designated, the previously known value in the gbp.object object will be considered as a known prior mean.
nsim
number of datasets to be generated. Default is 100.
Details
As for the argument gbp.object, if the result of gbp is designated to
b, for example "b <- gbp(z, n, model = "binomial")", the argument gbp.object indicates this b.
Data generating process is based on a second-level hierarchical model. The first-level hierarchy is
a distribution of observed data and the second-level is a conjugate prior distribution
on the first-level parameter.
To be specific, for Normal data, gbp constructs a two-level Normal-Normal 2-level model. σ_j^2 below is assumed to be known or to be accurately estimated (s^2) and subscript j indicates j-th group in a dataset.
(y_j | θ_j) ~ indep N(θ_j, σ_j^2)
(θ_j | μ0_j, A) ~ indep N(μ0_j, A)
μ0_j = x_j'β
for j = 1, …, k, where k is the number of groups (units) in a dataset.
For Poisson data, gbp builds a two-level Poisson-Gamma multi-level model. A square bracket below indicates [mean, variance] of distribution and a constant multiplied to the notation representing Gamma distribution (Gam) is a scale. Also, for consistent notation, y_j = z_j / n_j and n_j can be interpreted as j-th group's exposure only in this Poisson-Gamma hierarchical model.
for j = 1, …, k, where k is the number of groups (units) in a dataset.
For Binomial data, gbp sets a two-level Binomial-Beta multi-level model. For reference, a square bracket below indicates [mean, variance] of distribution and y_j = z_j / n_j.
for j = 1, …, k, where k is the number of groups (units) in a dataset.
From now on, the subscript (i) means i-th simulation and the subscript j indicates j-th group. So, notations with a subscript (i) are (k by 1) vectors, for example θ_i' = (\thate_(i)1, θ_(i)2, ..., θ_(i)k).
Pseudo-data generating process starts from the second-level hierarchy to the first-level. coverage first generates true parameters (θ_(i)) for k groups at the second-level and then moves onto the first-level to simulate pseudo-data sets, y_(i) for Gaussian or z_(i) for Binomial and Poisson data, given previously generated true parameters (θ_(i)).
So, in order to generate pseudo-datasets, coverage needs parameters of prior distribution,
(A (or r) and β (reg.coef))
or (A (or r) and μ0). From here, we have four options to run coverage.
First, if any values related to the prior distribution are not designated like
coverage(b, nsim = 10), then coverage will regard estimated values (or known prior mean, μ0) in b (gbp.object) as given true values when it generates lots of pseudo-datasets. After sampling θ_(i) from the prior distribution determined by these estimated values (or known prior mean) in b (gbp.object), coverage creates an i-th pseudo-dataset based on θ_(i) just sampled.
Second, coverage allows us to try different true values in generating datasets. Suppose gbp.object is based on the model with a known prior mean, μ0. Then, we can try either different A.or.r or mean.PriorDist. For example, coverage(b, A.or.r = 20, nsim = 10), coverage(b, mean.PriorDist = 0.5, nsim = 10), or coverage(b, A.or.r = 20, mean.PriorDist = 0.5, nsim = 10). Note that we cannot set reg.coef because the second-level mean (prior mean) is known in gbp.object to begin with.
Suppose gbp.object is based on the model with an unknown prior mean. In this case, gbp.object has the estimation result of regression model, linear regression for Normal-Normal, log-linear regression for Poisson-Gamma, or logistic regression for Binomial-Beta, (only intercept term if there is no covariate) to estimate the unknown prior mean. Then, we can try some options: one or two of (A.or.r, mean.PriorDist, reg.coef). For example, coverage(b, A.or.r = 20, nsim = 10), coverage(b, mean.PriorDist = 0.5, nsim = 10), or coverage(b, reg.coef = 0.1, nsim = 10) with no covariate where reg.coef is a designated intercept term. Estimates in gbp.object will be used for undesignated values. Also, we can try appropriate combinations of two arguments. For example, coverage(b, A.or.r = 20, mean.PriorDist = 0.5, nsim = 10) and coverage(b, A.or.r = 20, reg.coef = 0.1, nsim = 10). If we have one covariate, a 2 by 1 vector should be designated for reg.coef, one for an intercept term and the other for a regression coefficient of the covariate. Note that the two arguments, mean.PriorDist and reg.coef, cannot be assigned together because we do not need reg.coef given mean.PriorDist.
The simple unbiased estimator of coverage probability in j-th group is a sample mean of indicators over all simulated datasets. The j-th indicator in i-th simulation is 1 if the estimated interval of the j-th group on i-th simulated dataset contains a true parameter
θ_(i)j that generated the observed value of the j-th group in the
i-th dataset.
Rao-Blackwellized unbiased estimator for group j is a conditional expectation of the simple unbiased estimator given a sufficient statistic, y_j for Gaussian or z_j for Binomial and Poisson data.
Value
coverageRB
Rao-Blackwellized unbiased coverage estimate for each group averaged over all simulations.
coverageS
Simple unbiased coverage estimate for each group averaged over all simulations.
average.coverageRB
Overall Rao-Blackwellized unbiased coverage estimate across all the groups and simulations.
overall.coverageRB
Overall Rao-Blackwellized unbiased coverage estimate across all the groups and simulations.
average.coverageS
Overall simple unbiased coverage estimate across all the groups and simulations.
se.coverageRB
Standard error of Rao-Blackwellized unbiased coverage estimate for each group.
se.overall.coverageRB
Standard error of the overall Rao-Blackwellized unbiased coverage estimate.
se.coverageS
Standard error of simple unbiased coverage estimate for each group.
raw.resultRB
All the Rao-Blackwellized unbiased coverage estimates for every group and for every simulation.
raw.resultS
All the simple unbiased coverage estimates for every group and for every simulation.
confidence.lvl
Nominal confidence level
effective.n
The number of simulated data sets used to calculate the coverage estimates. The data sets may cause some errors in fitting models. For example, the data set may be against the conditions for the posteiror propriety in Binomial data.
model
The model being used, "br", "pr", or "gr".
case
One of the cases used to re-draw the coverage plot by coverage.plot.
betas
The regression coefficient used to generate simulated data sets.
A.r
The hyper-parameter value (A for Gaussian model, and r for both Binomial and Poisson models) used to generate simulated data sets.
priormeanused
The value of the prior mean(s) used to generate simulated data sets.
Author(s)
Hyungsuk Tak, Joseph Kelly, and Carl Morris
References
Christiansen, C. and Morris, C. (1997). Hierarchical Poisson Regression Modeling. Journal of the American Statistical Association. 92. 618-632.
Examples
# Loading datasets
data(schools)
y <- schools$y
se <- schools$se
# Arbitrary covariate for schools data
x2 <- rep(c(-1, 0, 1, 2), 2)
# baseball data where z is Hits and n is AtBats
z <- c(18, 17, 16, 15, 14, 14, 13, 12, 11, 11, 10, 10, 10, 10, 10, 9, 8, 7)
n <- c(45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45)
# One covariate: 1 if a player is an outfielder and 0 otherwise
x1 <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0)
#################################################################
# Gaussian Regression Interactive Multi-level Modeling (GRIMM) #
#################################################################
####################################################################################
# If we do not have any covariate and do not know a mean of the prior distribution #
####################################################################################
g <- gbp(y, se, model = "gaussian")
### when we want to simulate pseudo datasets considering the estimated values
### as true ones.
gcv <- coverage(g, nsim = 10)
### gcv$coverageRB, gcv$coverageS, gcv$average.coverageRB, gcv$average.coverageS,
### gcv$minimum.coverageRB, gcv$raw.resultRB, gcv$raw.resultS
### gcv <- coverage(g, mean.PriorDist = 3, nsim = 100)
### gcv <- coverage(g, A.or.r = 150, nsim = 100)
### gcv <- coverage(g, reg.coef = 10, nsim = 100)
### gcv <- coverage(g, A.or.r = 150, mean.PriorDist = 3, nsim = 100)
### gcv <- coverage(g, A.or.r = 150, reg.coef = 10, nsim = 100)
##################################################################################
# If we have one covariate and do not know a mean of the prior distribution yet, #
##################################################################################
g <- gbp(y, se, x2, model = "gaussian")
### when we want to simulate pseudo datasets considering the estimated values
### as true ones.
gcv <- coverage(g, nsim = 10)
### gcv$coverageRB, gcv$coverageS, gcv$average.coverageRB, gcv$average.coverageS,
### gcv$minimum.coverageRB, gcv$raw.resultRB, gcv$raw.resultS
### gcv <- coverage(g, mean.PriorDist = 3, nsim = 100)
### gcv <- coverage(g, A.or.r = 200, nsim = 100)
### gcv <- coverage(g, reg.coef = c(10, 2), nsim = 100)
### gcv <- coverage(g, A.or.r = 200, mean.PriorDist = 3, nsim = 100)
### gcv <- coverage(g, A.or.r = 200, reg.coef = c(10, 2), nsim = 100)
################################################
# If we know a mean of the prior distribution, #
################################################
g <- gbp(y, se, mean.PriorDist = 8, model = "gaussian")
### when we want to simulate pseudo datasets considering the estimated values
### as true ones.
gcv <- coverage(g, nsim = 10)
### gcv$coverageRB, gcv$coverageS, gcv$average.coverageRB, gcv$average.coverageS,
### gcv$minimum.coverageRB, gcv$raw.resultRB, gcv$raw.resultS
### gcv <- coverage(g, mean.PriorDist = 3, nsim = 100)
### gcv <- coverage(g, A.or.r = 150, nsim = 100)
### gcv <- coverage(g, A.or.r = 150, mean.PriorDist = 3, nsim = 100)
################################################################
# Binomial Regression Interactive Multi-level Modeling (BRIMM) #
################################################################
####################################################################################
# If we do not have any covariate and do not know a mean of the prior distribution #
####################################################################################
b <- gbp(z, n, model = "binomial")
### when we want to simulate pseudo datasets considering the estimated values
### as true ones.
bcv <- coverage(b, nsim = 10)
### bcv$coverageRB, bcv$coverageS, bcv$average.coverageRB, bcv$average.coverageS,
### bcv$minimum.coverageRB, bcv$raw.resultRB, bcv$raw.resultS
### bcv <- coverage(b, mean.PriorDist = 0.2, nsim = 100)
### bcv <- coverage(b, A.or.r = 50, nsim = 100)
### bcv <- coverage(b, reg.coef = -1.5, nsim = 100)
### bcv <- coverage(b, A.or.r = 50, mean.PriorDist = 0.2, nsim = 100)
### bcv <- coverage(b, A.or.r = 50, reg.coef = -1.5, nsim = 100)
##################################################################################
# If we have one covariate and do not know a mean of the prior distribution yet, #
##################################################################################
b <- gbp(z, n, x1, model = "binomial")
### when we want to simulate pseudo datasets considering the estimated values
### as true ones.
bcv <- coverage(b, nsim = 10)
### bcv$coverageRB, bcv$coverageS, bcv$average.coverageRB, bcv$average.coverageS,
### bcv$minimum.coverageRB, bcv$raw.resultRB, bcv$raw.resultS
### bcv <- coverage(b, mean.PriorDist = 0.2, nsim = 100)
### bcv <- coverage(b, A.or.r = 50, nsim = 100)
### bcv <- coverage(b, reg.coef = c(-1.5, 0), nsim = 100)
### bcv <- coverage(b, A.or.r = 40, mean.PriorDist = 0.2, nsim = 100)
### bcv <- coverage(b, A.or.r = 40, reg.coef = c(-1.5, 0), nsim = 100)
################################################
# If we know a mean of the prior distribution, #
################################################
b <- gbp(z, n, mean.PriorDist = 0.265, model = "binomial")
### when we want to simulate pseudo datasets considering the estimated values
### as true ones.
bcv <- coverage(b, nsim = 10)
### bcv$coverageRB, bcv$coverageS, bcv$average.coverageRB, bcv$average.coverageS,
### bcv$minimum.coverageRB, bcv$raw.resultRB, bcv$raw.resultS
### bcv <- coverage(b, mean.PriorDist = 0.2, nsim = 100)
### bcv <- coverage(b, A.or.r = 50, nsim = 100)
### bcv <- coverage(b, A.or.r = 40, mean.PriorDist = 0.2, nsim = 100)
###############################################################
# Poisson Regression Interactive Multi-level Modeling (PRIMM) #
###############################################################
################################################
# If we know a mean of the prior distribution, #
################################################
p <- gbp(z, n, mean.PriorDist = 0.265, model = "poisson")
### when we want to simulate pseudo datasets considering the estimated values
### as true ones.
pcv <- coverage(p, nsim = 10)
### pcv$coverageRB, pcv$coverageS, pcv$average.coverageRB, pcv$average.coverageS,
### pcv$minimum.coverageRB, pcv$raw.resultRB, pcv$raw.resultS
### pcv <- coverage(p, mean.PriorDist = 0.265, nsim = 100)
### pcv <- coverage(p, A.or.r = 150, nsim = 100)
### pcv <- coverage(p, A.or.r = 150, mean.PriorDist = 0.265, nsim = 100)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(Rgbp)
Loading required package: sn
Loading required package: stats4
Attaching package: 'sn'
The following object is masked from 'package:stats':
sd
Loading required package: mnormt
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/Rgbp/coverage.Rd_%03d_medium.png", width=480, height=480)
> ### Name: coverage
> ### Title: Estimating Coverage Probability
> ### Aliases: coverage
> ### Keywords: methods
>
> ### ** Examples
>
>
> # Loading datasets
> data(schools)
> y <- schools$y
> se <- schools$se
>
> # Arbitrary covariate for schools data
> x2 <- rep(c(-1, 0, 1, 2), 2)
>
> # baseball data where z is Hits and n is AtBats
> z <- c(18, 17, 16, 15, 14, 14, 13, 12, 11, 11, 10, 10, 10, 10, 10, 9, 8, 7)
> n <- c(45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45)
>
> # One covariate: 1 if a player is an outfielder and 0 otherwise
> x1 <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0)
>
> #################################################################
> # Gaussian Regression Interactive Multi-level Modeling (GRIMM) #
> #################################################################
>
> ####################################################################################
> # If we do not have any covariate and do not know a mean of the prior distribution #
> ####################################################################################
>
> g <- gbp(y, se, model = "gaussian")
>
> ### when we want to simulate pseudo datasets considering the estimated values
> ### as true ones.
> gcv <- coverage(g, nsim = 10)
>
> ### gcv$coverageRB, gcv$coverageS, gcv$average.coverageRB, gcv$average.coverageS,
> ### gcv$minimum.coverageRB, gcv$raw.resultRB, gcv$raw.resultS
>
> ### gcv <- coverage(g, mean.PriorDist = 3, nsim = 100)
> ### gcv <- coverage(g, A.or.r = 150, nsim = 100)
> ### gcv <- coverage(g, reg.coef = 10, nsim = 100)
> ### gcv <- coverage(g, A.or.r = 150, mean.PriorDist = 3, nsim = 100)
> ### gcv <- coverage(g, A.or.r = 150, reg.coef = 10, nsim = 100)
>
> ##################################################################################
> # If we have one covariate and do not know a mean of the prior distribution yet, #
> ##################################################################################
>
> g <- gbp(y, se, x2, model = "gaussian")
>
> ### when we want to simulate pseudo datasets considering the estimated values
> ### as true ones.
> gcv <- coverage(g, nsim = 10)
>
> ### gcv$coverageRB, gcv$coverageS, gcv$average.coverageRB, gcv$average.coverageS,
> ### gcv$minimum.coverageRB, gcv$raw.resultRB, gcv$raw.resultS
>
> ### gcv <- coverage(g, mean.PriorDist = 3, nsim = 100)
> ### gcv <- coverage(g, A.or.r = 200, nsim = 100)
> ### gcv <- coverage(g, reg.coef = c(10, 2), nsim = 100)
> ### gcv <- coverage(g, A.or.r = 200, mean.PriorDist = 3, nsim = 100)
> ### gcv <- coverage(g, A.or.r = 200, reg.coef = c(10, 2), nsim = 100)
>
> ################################################
> # If we know a mean of the prior distribution, #
> ################################################
>
> g <- gbp(y, se, mean.PriorDist = 8, model = "gaussian")
>
> ### when we want to simulate pseudo datasets considering the estimated values
> ### as true ones.
> gcv <- coverage(g, nsim = 10)
>
> ### gcv$coverageRB, gcv$coverageS, gcv$average.coverageRB, gcv$average.coverageS,
> ### gcv$minimum.coverageRB, gcv$raw.resultRB, gcv$raw.resultS
>
> ### gcv <- coverage(g, mean.PriorDist = 3, nsim = 100)
> ### gcv <- coverage(g, A.or.r = 150, nsim = 100)
> ### gcv <- coverage(g, A.or.r = 150, mean.PriorDist = 3, nsim = 100)
>
> ################################################################
> # Binomial Regression Interactive Multi-level Modeling (BRIMM) #
> ################################################################
>
> ####################################################################################
> # If we do not have any covariate and do not know a mean of the prior distribution #
> ####################################################################################
>
> b <- gbp(z, n, model = "binomial")
>
> ### when we want to simulate pseudo datasets considering the estimated values
> ### as true ones.
> bcv <- coverage(b, nsim = 10)
>
> ### bcv$coverageRB, bcv$coverageS, bcv$average.coverageRB, bcv$average.coverageS,
> ### bcv$minimum.coverageRB, bcv$raw.resultRB, bcv$raw.resultS
>
> ### bcv <- coverage(b, mean.PriorDist = 0.2, nsim = 100)
> ### bcv <- coverage(b, A.or.r = 50, nsim = 100)
> ### bcv <- coverage(b, reg.coef = -1.5, nsim = 100)
> ### bcv <- coverage(b, A.or.r = 50, mean.PriorDist = 0.2, nsim = 100)
> ### bcv <- coverage(b, A.or.r = 50, reg.coef = -1.5, nsim = 100)
>
> ##################################################################################
> # If we have one covariate and do not know a mean of the prior distribution yet, #
> ##################################################################################
>
> b <- gbp(z, n, x1, model = "binomial")
>
> ### when we want to simulate pseudo datasets considering the estimated values
> ### as true ones.
> bcv <- coverage(b, nsim = 10)
>
> ### bcv$coverageRB, bcv$coverageS, bcv$average.coverageRB, bcv$average.coverageS,
> ### bcv$minimum.coverageRB, bcv$raw.resultRB, bcv$raw.resultS
>
> ### bcv <- coverage(b, mean.PriorDist = 0.2, nsim = 100)
> ### bcv <- coverage(b, A.or.r = 50, nsim = 100)
> ### bcv <- coverage(b, reg.coef = c(-1.5, 0), nsim = 100)
> ### bcv <- coverage(b, A.or.r = 40, mean.PriorDist = 0.2, nsim = 100)
> ### bcv <- coverage(b, A.or.r = 40, reg.coef = c(-1.5, 0), nsim = 100)
>
> ################################################
> # If we know a mean of the prior distribution, #
> ################################################
>
> b <- gbp(z, n, mean.PriorDist = 0.265, model = "binomial")
>
> ### when we want to simulate pseudo datasets considering the estimated values
> ### as true ones.
> bcv <- coverage(b, nsim = 10)
>
> ### bcv$coverageRB, bcv$coverageS, bcv$average.coverageRB, bcv$average.coverageS,
> ### bcv$minimum.coverageRB, bcv$raw.resultRB, bcv$raw.resultS
>
> ### bcv <- coverage(b, mean.PriorDist = 0.2, nsim = 100)
> ### bcv <- coverage(b, A.or.r = 50, nsim = 100)
> ### bcv <- coverage(b, A.or.r = 40, mean.PriorDist = 0.2, nsim = 100)
>
> ###############################################################
> # Poisson Regression Interactive Multi-level Modeling (PRIMM) #
> ###############################################################
>
> ################################################
> # If we know a mean of the prior distribution, #
> ################################################
>
> p <- gbp(z, n, mean.PriorDist = 0.265, model = "poisson")
>
> ### when we want to simulate pseudo datasets considering the estimated values
> ### as true ones.
> pcv <- coverage(p, nsim = 10)
>
> ### pcv$coverageRB, pcv$coverageS, pcv$average.coverageRB, pcv$average.coverageS,
> ### pcv$minimum.coverageRB, pcv$raw.resultRB, pcv$raw.resultS
>
> ### pcv <- coverage(p, mean.PriorDist = 0.265, nsim = 100)
> ### pcv <- coverage(p, A.or.r = 150, nsim = 100)
> ### pcv <- coverage(p, A.or.r = 150, mean.PriorDist = 0.265, nsim = 100)
>
>
>
>
>
>
> dev.off()
null device
1
>