Last data update: 2014.03.03

R: Calibrate an ensemble Bayesian Model Averaging model
calibrateEnsemble    R Documentation

Calibrate an ensemble Bayesian Model Averaging model

Description

This function calibrates an EBMA model based on out-of-sample performance in the calibration time period. Given a dependent variable and calibration-sample predictions from multiple component forecast models in the ForecastData object, the calibrateEnsemble function fits an ensemble BMA mixture model. The weight assigned to each model is derived from that model's performance in the calibration period. Missing observations are allowed in the calibration period, but models with missing observations are penalized. When missing observations are prevalent in the calibration set, the EM algorithm is adjusted and model parameters are estimated by maximizing a renormalized partial expected complete-data log-likelihood (Fraley et al. 2010).
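The weight estimation described above can be illustrated with a toy EM loop in base R. This is a minimal sketch, not the package's implementation: the function name em_weights and the simulated forecasts are hypothetical, the component predictions are held fixed (no per-model parameters are estimated), and missing observations are not handled.

```r
# Toy EM for ensemble weights with binary outcomes (base R only).
# pred: n x K matrix of component probabilities; y: vector of 0/1 outcomes.
# em_weights is a hypothetical helper, not part of EBMAforecast.
em_weights <- function(pred, y, tol = sqrt(.Machine$double.eps), maxIter = 1e4) {
  K <- ncol(pred)
  w <- rep(1 / K, K)                       # equal starting weights
  # Likelihood of each observed outcome under each component model
  lik <- pred * y + (1 - pred) * (1 - y)   # y recycles down each column
  ll_old <- -Inf
  for (iter in seq_len(maxIter)) {
    # E-step: responsibility of model k for observation i
    num <- sweep(lik, 2, w, `*`)
    z <- num / rowSums(num)
    # M-step: each weight is the average responsibility of that model
    w <- colMeans(z)
    ll <- sum(log(lik %*% w))              # mixture log-likelihood
    if (abs(ll - ll_old) < tol) break      # stop when improvement is below tol
    ll_old <- ll
  }
  list(weights = w, logLik = ll, iter = iter)
}

# Simulated example: one informative model and one noise model
set.seed(1)
y <- rbinom(200, 1, 0.3)
good <- ifelse(y == 1, 0.8, 0.2)
noise <- runif(200, 0.05, 0.95)
fit <- em_weights(cbind(good = good, noise = noise), y)
fit$weights
```

The informative model should receive most of the weight, which mirrors the statement that weights reflect calibration-period performance.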

Usage

calibrateEnsemble(.forecastData = new("ForecastData"), exp = 1,
  tol = sqrt(.Machine$double.eps), maxIter = 1e+06, model = "logit",
  method = "EM", ...)

fitEnsemble(.forecastData, tol = sqrt(.Machine$double.eps), maxIter = 1e+06,
  method = "EM", exp = 1, useModelParams = TRUE,
  predType = "posteriorMedian", const = 0, W = c(), ...)

## S4 method for signature 'ForecastDataNormal'
fitEnsemble(.forecastData,
  tol = sqrt(.Machine$double.eps), maxIter = 1e+06, method = "EM",
  exp = numeric(), useModelParams = TRUE, predType = "posteriorMedian",
  const = 0, W = c())

Arguments

.forecastData

An object of class 'ForecastData' that will be used to calibrate the model.

exp

The exponential shrinkage term. Forecasts are raised to the (1/exp) power on the logit scale for the purposes of bias reduction. The default value, as shown under Usage, is exp=1 (no shrinkage).
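The effect of exp can be sketched as follows. The sign-preserving form used here, and the helper name shrink_logit, are assumptions made for illustration; the package's exact formula may differ.

```r
# Hypothetical sketch of shrinkage on the logit scale.
# Assumption: the magnitude of the logit is shrunk while its sign is kept.
shrink_logit <- function(p, exp = 1) {
  x <- qlogis(p)                              # move to the logit scale
  sign(x) * ((1 + abs(x))^(1 / exp) - 1)      # raise to the (1/exp) power
}

shrink_logit(0.9, exp = 1)   # unchanged: equals qlogis(0.9)
shrink_logit(0.9, exp = 3)   # pulled toward 0
shrink_logit(0.5, exp = 3)   # a forecast of 0.5 stays at 0 on the logit scale
```

Under this assumed form, exp=1 leaves forecasts untouched and larger values pull extreme forecasts toward the middle of the scale.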

tol

Tolerance for improvements in the log-likelihood before the EM algorithm will stop optimization. The default, as shown under Usage, is tol=sqrt(.Machine$double.eps), about 1.5e-8 on typical platforms.
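For reference, the default shown under Usage can be evaluated directly in base R. The value below assumes IEEE 754 double precision, which is what R uses on common platforms.

```r
# Default tolerance from the Usage section
tol <- sqrt(.Machine$double.eps)
tol                      # about 1.49e-08
# A looser tolerance, such as the 0.001 used in the Examples, stops EM earlier
tol < 0.001
```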

maxIter

The maximum number of iterations the EM algorithm will run before stopping automatically. The default, as shown under Usage, is maxIter=1e+06.

model

The model type that should be used given the type of data that is being predicted (e.g., normal or binary). The default is model="logit".

method

The estimation method used. Currently only "EM" is implemented.

...

Not implemented

useModelParams

If TRUE, individual model predictions are transformed based on logit models. If FALSE, all models' parameters will be set to 0 and 1.
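The two settings can be illustrated with an ordinary logistic regression in base R. This is a sketch under assumptions: the data are simulated, "0 and 1" is read as intercept 0 and slope 1 on the logit scale, and the plain glm fit stands in for whatever weighted fit the package performs inside the EM algorithm.

```r
# Simulated, deliberately overconfident component forecasts (hypothetical data)
set.seed(42)
n <- 500
p_raw <- runif(n, 0.05, 0.95)
y <- rbinom(n, 1, plogis(0.5 * qlogis(p_raw)))

x <- qlogis(p_raw)                       # forecasts on the logit scale

# Analogue of useModelParams = TRUE: estimate intercept and slope
fit <- glm(y ~ x, family = binomial())
coef(fit)
p_adj <- fitted(fit)                     # transformed predictions

# Analogue of useModelParams = FALSE: intercept 0, slope 1 (assumed reading)
p_id <- plogis(0 + 1 * x)                # identical to the raw forecasts
```

With parameters fixed at 0 and 1 the transformation is the identity, so the component forecasts enter the ensemble unchanged.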

predType

The prediction type used for the EBMA model under the normal model. The user can choose either posteriorMedian or posteriorMean. Posterior median is the default.
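The difference between the two prediction types can be seen on a small normal mixture. The weights, component means, and common standard deviation below are made-up values, and treating the ensemble prediction as the mean or median of the weighted mixture is an assumption about how these options behave.

```r
# Hypothetical ensemble: three component forecasts with fixed weights
w <- c(0.6, 0.3, 0.1)      # model weights (sum to 1)
mu <- c(1, 2, 5)           # component point forecasts
sigma <- 1                 # assumed common standard deviation

# Mean of the mixture: weighted average of component means
mix_mean <- sum(w * mu)

# Median of the mixture: the point where the mixture CDF equals 0.5
mix_cdf <- function(q) sum(w * pnorm(q, mean = mu, sd = sigma))
mix_median <- uniroot(function(q) mix_cdf(q) - 0.5,
                      interval = range(mu) + c(-5, 5) * sigma,
                      tol = 1e-8)$root

c(mean = mix_mean, median = mix_median)
```

Because one low-weight component sits far to the right, the mean is pulled upward while the median stays closer to the bulk of the forecasts.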

const

User-provided "wisdom of crowds" parameter, which serves as the minimum model weight for all models. The default is const=0.

W

Vector of initial model weights. If unspecified, each model will receive a weight of 1/(number of models).

Value

Returns an object of class 'FDatFitLogit' or 'FDatFitNormal', a subclass of 'ForecastData', with the following slots:

predCalibration

A matrix containing the predictions of all component models and the EBMA model for all observations in the calibration period.

predTest

A matrix containing the predictions of all component models and the EBMA model for all observations in the test period.

outcomeCalibration

A vector containing the true values of the dependent variable for all observations in the calibration period.

outcomeTest

An optional vector containing the true values of the dependent variable for all observations in the test period.

modelNames

A character vector containing the names of all component models. If no model names are specified, names will be assigned automatically.

modelWeights

A vector containing the model weights assigned to each model.

modelParams

The parameters for the individual logit models that transform the component models.

useModelParams

Indicator whether model parameters for transformation were estimated or not.

logLik

The final log-likelihood for the calibrated EBMA model.

exp

The exponential shrinkage term.

tol

Tolerance for improvements in the log-likelihood before the EM algorithm will stop optimization.

maxIter

The maximum number of iterations the EM algorithm will run before stopping automatically.

method

The estimation method used.

iter

Number of iterations run in the EM algorithm.

call

The actual call used to create the object.

Author(s)

Michael D. Ward <michael.d.ward@duke.edu> and Jacob M. Montgomery <jacob.montgomery@wustl.edu> and Florian M. Hollenbach <florian.hollenbach@tamu.edu>

References

Montgomery, Jacob M., Florian M. Hollenbach and Michael D. Ward. (2015). Calibrating ensemble forecasting models with sparse data in the social sciences. International Journal of Forecasting. In Press.

Montgomery, Jacob M., Florian M. Hollenbach and Michael D. Ward. (2012). Improving Predictions Using Ensemble Bayesian Model Averaging. Political Analysis. 20: 271-291.

Raftery, A. E., T. Gneiting, F. Balabdaoui and M. Polakowski. (2005). Using Bayesian Model Averaging to calibrate forecast ensembles. Monthly Weather Review. 133:1155–1174.

Sloughter, J. M., A. E. Raftery, T. Gneiting and C. Fraley. (2007). Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Monthly Weather Review. 135:3209–3220.

Fraley, C., A. E. Raftery, T. Gneiting. (2010). Calibrating Multi-Model Forecast Ensembles with Exchangeable and Missing Members using Bayesian Model Averaging. Monthly Weather Review. 138:190–202.

Sloughter, J. M., T. Gneiting and A. E. Raftery. (2010). Probabilistic wind speed forecasting using ensembles and Bayesian model averaging. Journal of the American Statistical Association. 105:25–35.

Examples

## Not run:
data(calibrationSample)
data(testSample)

this.ForecastData <- makeForecastData(
  .predCalibration = calibrationSample[, c("LMER", "SAE", "GLM")],
  .outcomeCalibration = calibrationSample[, "Insurgency"],
  .predTest = testSample[, c("LMER", "SAE", "GLM")],
  .outcomeTest = testSample[, "Insurgency"],
  .modelNames = c("LMER", "SAE", "GLM")
)

this.ensemble <- calibrateEnsemble(this.ForecastData, model = "logit", tol = 0.001, exp = 3)

## End(Not run)
