R: Bayesian variable selection for linear models via non-local...
modelSelection
R Documentation
Bayesian variable selection for linear models via non-local priors.
Description
Bayesian model selection for linear models using
non-local priors. The
algorithm uses a Gibbs scheme and can handle p>>n cases.
See rnlp to obtain posterior samples for the coefficients.
If center==TRUE, y and x are centered to have zero mean, therefore eliminating the need to include an intercept term in x
scale
If scale==TRUE, y and columns in x are scaled to have standard deviation 1
niter
Number of Gibbs sampling iterations
thinning
MCMC thinning factor, i.e. only one out of each thinning iterations are reported. Defaults to thinning=1, i.e. no thinning
burnin
Number of burn-in MCMC iterations. Defaults to
.1*niter. Set to 0 for no burn-in
family
Residual distribution. Possible values are
'normal','twopiecenormal','laplace', 'twopiecelaplace', or 'auto'. For
the latter the residual distribution is inferred from the data.
priorCoef
Prior distribution for the coefficients. Object created
with momprior, imomprior, emomprior or zellnerprior.
priorDelta
Prior on model space. Use modelbbprior() for Beta-Binomial prior, modelbinomprior(p) for Binomial prior with prior inclusion probability p, or modelunifprior() for Uniform prior
priorVar
Inverse gamma prior on residual variance, created with
igprior()
priorSkew
prior on residual skewness parameter, assumed to be of
the same family as priorCoef. Ignored if family is 'normal' or 'laplace'.
phi
Residual variance. Typically this is unknown and therefore left missing. If specified argument priorVar is ignored
deltaini
Logical vector of length ncol(x) indicating which coefficients should be initialized to be non-zero. Defaults to all variables being excluded from the model
initSearch
Algorithm to refine deltaini. initSearch=='greedy' uses a greedy Gibbs sampling search. initSearch=='SCAD' sets deltaini to the non-zero elements in a SCAD fit with cross-validated regularization parameter. initSearch=='none' leaves deltaini unmodified
method
Method to compute marginal likelihood. The default is to
used closed-form expressions whenever possible (only available for
pMOM priors with up to 15 covariates) and Laplace approximations otherwise.
method=='Laplace' for Laplace approx, method=='plugin'
for BIC-type approximation,
method=='MC' for Importance Sampling, method=='Hybrid'
for Hybrid Laplace-IS (only available for piMOM prior). See Details.
optimMethod
Algorithm to maximize objective function when method=='Laplace' or method=='MC'. Only used for family=='twopiecenormal'. optimMethod=='LMA' uses modified Newton-Raphson algorithm, 'CDA' coordinate descent algorithm
B
Number of samples to use in Importance Sampling scheme. Ignored if method=='Laplace'
verbose
Set verbose==TRUE to print iteration progress
Details
Let delta be the vector indicating inclusion/exclusion of each
column of x in the model. The Gibbs algorithm sequentially samples from the
posterior of each element in delta conditional on all the remaining
elements in delta and the data.
To do this it is necessary to evaluate the marginal likelihood for any
given model. These have closed-form expression for the MOM prior, but
for models with >15 variables these are expensive to compute and
Laplace approximations are used instead (for the residual variance a
log change of variables is used, which improves the approximation).
For other priors closed forms
are not available, so by default Laplace approximations are used.
For the iMOM prior we also implement a Hybrid Laplace-IS
which uses a Laplace approximation to evaluate the integral wrt beta
and integrates wrt phi (residual variance) numerically.
It should be noted that Laplace approximations tend to under-estimate
the marginal densities when the MLE for some parameter is very close
to 0. That is, it tends to be conservative in the sense of excluding
more variables from the model than an exact calculation would.
Finally, method=='plugin' provides a BIC-type approximation that is
faster than exact or Laplace methods, at the expense of some
accuracy. In non-sparse situations where models with many variables
have large posterior probability method=='plugin' can be substantially
faster.
For more details on the methods used to compute marginal densities see
Johnson & Rossell (2012).
Value
Object of class msfit, which extends a list with elements
postSample
matrix with posterior samples for the model
indicator. postSample[i,j]==1
indicates that variable j was included in the model in the MCMC
iteration i
postOther
postOther
returns posterior samples for parameters other than the model
indicator, i.e. basically hyper-parameters.
If hyper-parameters were fixed in the model specification, postOther will be empty.
margpp
Marginal posterior probability for inclusion of each
covariate. This is computed by averaging marginal post prob for
inclusion in each Gibbs iteration, which is much more accurate than
simply taking colMeans(postSample)
.
postMode
Model with highest posterior probability amongst all those visited
postModeProb
Unnormalized posterior prob of posterior mode (log scale)
postProb
Unnormalized posterior prob of each visited model (log
scale)
coef
Estimated coefficients (via posterior mode) for highest
posterior probability model
Author(s)
David Rossell
References
Johnson V.E., Rossell D. Non-Local Prior Densities for Default
Bayesian Hypothesis Tests. Journal of the Royal Statistical Society B,
2010, 72, 143-170.
Johnson V.E., Rossell D. Bayesian model selection in high-dimensional
settings. Journal of the American Statistical Association, 2012, 107, 649-660.
See Also
See msfit-class for details on the output, postProb to
extract posterior model probabilities.
See rnlp to obtain posterior samples for the coefficients.
To compute marginal densities for a given model see
pmomMarginalK, pmomMarginalU,
pimomMarginalK, pimomMarginalU.
Examples
#Simulate data
x <- matrix(rnorm(100*3),nrow=100,ncol=3)
theta <- matrix(c(1,1,0),ncol=1)
y <- x %*% theta + rnorm(100)
#Specify prior parameters
priorCoef <- imomprior(tau=1)
priorDelta <- modelunifprior()
priorVar <- igprior(alpha=.01,lambda=.01)
#Alternative prior for model space: 0.5 prior prob for including any covariate
priorDelta <- modelbinomprior(p=0.5)
#Alternative prior for model space: Beta hyper-prior for prob of inclusion
priorDelta <- modelbbprior(alpha.p=1,beta.p=1)
#Model selection
fit1 <- modelSelection(y=y, x=x, center=FALSE, scale=FALSE, niter=10^2,
priorCoef=priorCoef, priorDelta=priorDelta, priorVar=priorVar, phi=1,
method='Laplace')
fit1$postMode
fit1$margpp
postProb(fit1)