Last data update: 2014.03.03

R: Bayesian Adaptive Sampling Without Replacement for Variable...
bas.lmR Documentation

Bayesian Adaptive Sampling Without Replacement for Variable Selection in Linear Models

Description

Sample without replacement from a posterior distribution on models

Usage

bas.lm(formula, data, weights=NULL, n.models=NULL,  prior="ZS-null", alpha=NULL,
 modelprior=beta.binomial(1,1),
 initprobs="Uniform", method="BAS", update=NULL,
 bestmodel = NULL, prob.local = 0.0, prob.rw=0.5,
 MCMC.iterations = NULL, lambda = NULL, delta = 0.025,thin=1)

Arguments

formula

linear model formula for the full model with all predictors, Y ~ X. All code assumes that an intercept will be included in each model and that the X's will be centered.

data

data frame

weights

an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, Bayes estimates are obtained assuming that Y ~ N(Xb, sigma^2 diag(1/weights)).

n.models

number of models to sample either without replacement (method="BAS" or "MCMC+BAS") or with replacement (method="MCMC"). If NULL, BAS with method="BAS" will try to enumerate all 2^p models. If enumeration is not possible (memory or time) then a value should be supplied which controls the number of sampled models using 'n.models'. With method="MCMC", sampling will stop once at the min(n.models, MCMC.iterations) so MCMC.iterations be larger than n.models in order to explore the model space. On exit for method= "MCMC" this is the number of unique models that have been sampled with counts stored in the output as "freq""

prior

prior distribution for regression coefficients. Choices include "AIC", "BIC", "g-prior", "ZS-null", "ZS-full", "hyper-g", "hyper-g-laplace", "hyper-g-n", "EB-local", and "EB-global"

alpha

optional hyperparameter in g-prior or hyper g-prior. For Zellner's g-prior, alpha = g, for the Liang et al hyper-g or hyper-g-n method, recommended choice is alpha are between (2 < alpha < 4), with alpha = 3 recommended.

modelprior

Family of prior distribution on the models. Choices include uniform Bernoulli or beta.binomial with the default being a beta.binomial(1,1).

initprobs

Vector of length p or a character string for approach used to create the vector. This is used to order variables for sampling all methods for potentially more efficient storage while sampling and provides the initial inclusion probabilities used for sampling without replacement with method="BAS". Options for the charactier string giving the method are: "Uniform" or "uniform" where each predictor variable is equally likely to be sampled (equivalent to random sampling without replacement); "eplogp" uses the eplogprob function to aproximate the Bayes factor from p-values from the full model to find initial marginal inclusion probabilitites; "marg-eplogp" useseplogprob.marg function to aproximate the Bayes factor from p-values from the full model each simple linear regression. To run a Markov Chain to provide initial estimates of marginal inclusion probabilities for "BAS", use method="MCMC+BAS" below. While the initprobs are not used in sampling for method="MCMC", this determines the order of the variables in the lookup table and affects memory allocation in large problems where enumeration is not feasible. For variables that should always be included set the corresponding initprobs to 1, i.e. the intercept should be included with probability one.

method

A character variable indicating which sampling method to use: method="BAS" uses Bayesian Adaptive Sampling (without replacement) using the sampling probabilities given in initprobs; method="MCMC" samples with replacement via a MCMC algorithm based that combines the birth/death random walk in Hoeting et al (1997) of MC3 with a random swap move to interchange a variable in the model with one currently excluded as in Clyde, Ghosh and Littman (2010); method="MCMC+BAS" runs an initial MCMC to calculate marginal inclusion probabilities and then samples without replacement as in BAS. For both BAS and AMCMC, the sampling probabilities can be updated as more models are sampled. (see update below). We recommend "MCMC+BAS" or "MCMC" for high dimensional problems where enumeration is not feasible.

update

number of iterations between potential updates of the sampling probabilities for method "BAS". If NULL do not update, otherwise the algorithm will update using the marginal inclusion probabilities as they change while sampling takes place. For large model spaces, updating is recommended. If the model space will be enumerated, leave at the default.

bestmodel

optional binary vector representing a model to initialize the sampling. If NULL sampling starts with the null model

prob.local

A future option to allow sampling of models "near" the median probability model. Not used at this time.

prob.rw

For any of the MCMC methods, probability of using the random-walk proposal; otherwise use a random "flip" move to propose a new model.

MCMC.iterations

Number of iterations for the MCMC sampler; the default is n.models*10 if not set by the user.

lambda

Parameter in the AMCMC algorithm.

delta

truncation parameter to prevent sampling probabilities to degenerate to 0 or 1 prior to enumeration for sampling withour replacent.

thin

For "MCMC", thin the MCMC every "thin" iterations; default is no thinning.

Details

BAS provides several search algorithms to find high probability models for use in Bayesian Model Averaging or Bayesian model selection. For p less than 20-25, BAS can enumerate all models depending on memory availability, for larger p, BAS samples without replacement using random or deterministic sampling. The Bayesian Adaptive Sampling algorithm of Clyde, Ghosh, Littman (2010) samples models without replacement using the initial sampling probabilities, and will optionally update the sampling probabilities every "update" models using the estimated marginal inclusion probabilties. BAS uses different methods to obtain the initprobs, which may impact the results in high-dimensional problems. The deterinistic sampler provides a list of the top models in order of an approximation of independence using the provided initprobs. This may be effective after running the other algorithms to identify high probability models and works well if the correlations of variables are small to modest. The priors on coefficients include Zellner's g-prior, the Hyper-g prior (Liang et al 2008, the Zellner-Siow Cauchy prior, Empirical Bayes (local and gobal) g-priors. AIC and BIC are also included.

Value

bas returns an object of class BMA

An object of class BAS is a list containing at least the following components:

postprob

the posterior probabilities of the models selected

priorprobs

the prior probabilities of the models selected

namesx

the names of the variables

R2

R2 values for the models

logmarg

values of the log of the marginal likelihood for the models

n.vars

total number of independent variables in the full model, including the intercept

size

the number of independent variables in each of the models, includes the intercept

which

a list of lists with one list per model with variables that are included in the model

probne0

the posterior probability that each variable is non-zero computed using the renormalized marginal likelihoods of sampled models. This may be biased if the number of sampled models is much smaller than the total number of models. Unbiased estimates may be obtained using method "MCMC".

mle

list of lists with one list per model giving the MLE (OLS) estimate of each (nonzero) coefficient for each model. The intercept is the mean of Y as each column of X has been centered by subtracting its mean.

mle.se

list of lists with one list per model giving the MLE (OLS) standard error of each coefficient for each model

prior

the name of the prior that created the BMA object

alpha

value of hyperparameter in prior used to create the BMA object.

modelprior

the prior distribution on models that created the BMA object

Y

response

X

matrix of predictors

mean.x

vector of means for each column of X (used in predict.bas)

The function summary.bas, is used to print a summary of the results. The function plot.bas is used to plot posterior distributions for the coefficients and image.bas provides an image of the distribution over models. Posterior summaries of coefficients can be extracted using coefficients.bma. Fitted values and predictions can be obtained using the S3 functions fitted.bas and predict.bas. BA objects may be updated to use a different prior (without rerunning the sampler) using the function update.bma.

Author(s)

Merlise Clyde (clyde@stat.duke.edu) and Michael Littman

References

Clyde, M. Ghosh, J. and Littman, M. (2010) Bayesian Adaptive Sampling for Variable Selection and Model Averaging. Journal of Computational Graphics and Statistics. 20:80-101
http://dx.doi.org/10.1198/jcgs.2010.09049

Clyde, M. and George, E. I. (2004) Model Uncertainty. Statist. Sci., 19, 81-94.
http://dx.doi.org/10.1214/088342304000000035

Clyde, M. (1999) Bayesian Model Averaging and Model Search Strategies (with discussion). In Bayesian Statistics 6. J.M. Bernardo, A.P. Dawid, J.O. Berger, and A.F.M. Smith eds. Oxford University Press, pages 157-185.

Hoeting, J. A., Madigan, D., Raftery, A. E. and Volinsky, C. T. (1999) Bayesian model averaging: a tutorial (with discussion). Statist. Sci., 14, 382-401.
http://www.stat.washington.edu/www/research/online/hoeting1999.pdf

Liang, F., Paulo, R., Molina, G., Clyde, M. and Berger, J.O. (2008) Mixtures of g-priors for Bayesian Variable Selection. Journal of the American Statistical Association. 103:410-423.
http://dx.doi.org/10.1198/016214507000001337

Zellner, A. (1986) On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, pp. 233-243. North-Holland/Elsevier.

Zellner, A. and Siow, A. (1980) Posterior odds ratios for selected regression hypotheses. In Bayesian Statistics: Proceedings of the First International Meeting held in Valencia (Spain), pp. 585-603.

See Also

summary.bas, coefficients.bma, print.bas, predict.bas, fitted.bas plot.bas, image.bas, eplogprob, update.bma

Examples

demo(BAS.hald)
## Not run: demo(BAS.USCrime) 

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(BAS)
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/BAS/bas.Rd_%03d_medium.png", width=480, height=480)
> ### Name: bas.lm
> ### Title: Bayesian Adaptive Sampling Without Replacement for Variable
> ###   Selection in Linear Models
> ### Aliases: bas bas.lm
> ### Keywords: regression
> 
> ### ** Examples
> 
> demo(BAS.hald)


	demo(BAS.hald)
	---- ~~~~~~~~

> data(Hald)

> hald.gprior =  bas.lm(Y~ ., data=Hald, prior="g-prior", alpha=13,
+                       modelprior=beta.binomial(1,1),
+                       initprobs="eplogp")

> hald.gprior

Call:
bas.lm(formula = Y ~ ., data = Hald, prior = "g-prior", alpha = 13,     modelprior = beta.binomial(1, 1), initprobs = "eplogp")


 Marginal Posterior Inclusion Probabilities: 
Intercept         X1         X2         X3         X4  
   1.0000     0.9019     0.6896     0.4653     0.6329  

> plot(hald.gprior)

> summary(hald.gprior)
     Intercept X1 X2 X3 X4         BF PostProbs     R2 dim   logmarg
[1,]         1  1  1  0  0 1.00000000    0.2432 0.9787   3 11.727354
[2,]         1  1  0  0  1 0.69239442    0.1684 0.9725   3 11.359755
[3,]         1  1  1  1  1 0.08991408    0.1312 0.9824   5  9.318453
[4,]         1  1  1  0  1 0.33557136    0.1224 0.9823   4 10.635434
[5,]         1  1  1  1  0 0.33449263    0.1220 0.9823   4 10.632214

> image(hald.gprior, subset=-1, vlas=0)

> hald.coef = coefficients(hald.gprior)

> hald.coef

 Marginal Posterior Summaries of Coefficients: 
           post mean  post SD  post p(B != 0)
Intercept  95.4231     0.7107   1.0000       
X1          1.2150     0.5190   0.9019       
X2          0.2756     0.4832   0.6896       
X3         -0.1271     0.4976   0.4653       
X4         -0.3269     0.4717   0.6329       

> plot(hald.coef)

> predict(hald.gprior, hald.gprior$X[,-1], top=5)  # do not include the intercept in the design matrix
$Ybma
           [,1]
 [1,]  79.74246
 [2,]  74.50010
 [3,] 105.29268
 [4,]  89.88693
 [5,]  95.57177
 [6,] 104.56409
 [7,] 103.40145
 [8,]  77.13668
 [9,]  91.99731
[10,] 114.21325
[11,]  82.78446
[12,] 111.00723
[13,] 110.40160

$Ypred
         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]     [,7]     [,8]
[1,] 81.17036 74.83464 105.0725 89.69881 97.15898 104.4575 103.3893 76.06454
[2,] 77.70296 74.24113 105.8554 90.46267 93.09565 104.7152 103.1399 78.80193
[3,] 79.70437 74.40553 105.2175 89.76253 95.63309 104.5709 103.5254 77.08557
[4,] 79.65151 74.47846 105.4218 89.83174 95.62799 104.5962 103.5068 77.00839
[5,] 79.84321 74.31409 104.9063 89.65651 95.70301 104.5285 103.5476 77.15919
         [,9]    [,10]    [,11]    [,12]    [,13]
[1,] 91.57174 113.1722 81.59906 111.2219 111.0884
[2,] 92.68123 115.8058 84.50293 110.4162 109.0791
[3,] 91.98604 114.1759 82.78145 111.1196 110.5321
[4,] 92.07571 114.1088 82.68233 111.0429 110.4674
[5,] 91.83513 114.2353 82.88128 111.2384 110.6515

$postprobs
[1] 0.3089304 0.2139017 0.1666632 0.1555023 0.1550024

$best
[1]  7 12 10  5  3


> fitted(hald.gprior, type="HPM")
 [1]  81.17036  74.83464 105.07248  89.69881  97.15898 104.45753 103.38927
 [8]  76.06454  91.57174 113.17222  81.59906 111.22195 111.08841

> hald.gprior =  bas.lm(Y~ ., data=Hald, n.models=2^4,
+                       prior="g-prior", alpha=13, modelprior=uniform(),
+                       initprobs="eplogp")

> hald.EB = update(hald.gprior, newprior="EB-global")

> hald.bic = update(hald.gprior,newprior="BIC")

> hald.zs = update(hald.bic, newprior="ZS-null")
> ## Not run: demo(BAS.USCrime) 
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>