R Graphical Manual

Browse All

Last data update: 2014.03.03

R: Bayesian Geostatistical Model Fitting with RAMPS

georamps

R Documentation

Bayesian Geostatistical Model Fitting with RAMPS

Description

General function for fitting Bayesian geostatistical models using the reparameterized and marginalized posterior sampling (RAMPS) algorithm of Yan et al. (2007).

Usage

   georamps(fixed, random, correlation, data, subset, weights,
            variance = list(fixed = ~ 1, random = ~ 1, spatial = ~ 1),
            aggregate = list(grid = NULL, blockid = ""), kmat = NULL,
            control = ramps.control(...), contrasts = NULL, ...)

Arguments

`fixed`	two-sided linear `"formula"` object describing the main effects in the mean structure of the model, with the response on the left of a `~` operator and the terms, separated by `+` operators, on the right.
`random`	optional one-sided formula of the form `~ 1 \| g`, specifying random intercepts for groups defined by the factor `g`. Several grouping variables may be simultaneously specified, separated by the `` operator, as in `~ 1 \| g1 g2 * g3`. In such cases, the levels of each variable are pasted together and the resulting factor used to group the observations. Missing `NA` values may be given in the grouping variable to omit random effects for the associated measurements.
`correlation`	`'corRSpatial'` object describing the spatial correlation structure. See the `corRClasses` documentation for a listing of the available structures.
`data`	optional data frame containing the variables named in `fixed`, `random`, `correlation`, `weights`, `variance`, and `subset`.
`subset`	optional expression indicating the subset of rows in `data` that should be used in the fit. This can be a logical vector, or a numerical vector indicating which observation numbers are to be included, or a character vector of the row names to be included. All observations are included by default.
`weights`	optional numerical vector of measurement error variance (inverse) weights to be used in the fitting process. Defaults to a value of 1 for point-source measurements and the number of grid points for areal measurements (see the `aggregate` argument below).
`variance`	optional list of one-sided formulas, each of the form `~ g` where `g` defines a grouping factor for the following elements: `fixed` for measurement error variances; `random` for random effects error variances; and `spatial` for spatial variances. A single variance is assumed in each case by default.
`aggregate`	optional list of elements: `grid` a data frame of coordinates to use for Monte Carlo integration over geographic blocks at which areal measurements are available; and `blockid` a character string specifying the column by which to merge the areal measurements in `data` with the grid coordinates in `grid`. Merging is only performed for `blockid` values that are common to both datasets. All observations in `data` are treated as point-source measurements by default.
`kmat`	optional n s* design matrix for mapping spatial sites to outcome responses, where n is the number of responses and s the number of unique sites. Unique sites are ordered first according to those supplied to the `data` argument and second to those supplied to the `aggregate` argument. Defaults to `kmat[i,j] = 1 / N[i]` if site `j` is one of `N[i]` measurement sites contributing to response `i`; otherwise `kmat[i,j] = 0`. Rows or columns of zeros are not supported.
`control`	list of parameters for controlling the fitting process. See the `ramps.control` documentation for details.
`contrasts`	optional list. See the `contrasts.arg` of `model.matrix`.
`...`	further arguments passed to or from other methods.

Value

An object of class 'ramps' containing the following elements:

`params`	`'mcmc'` object of monitored model parameters with variable labels in the column names and MCMC iteration numbers in the row names.
`z`	`'mcmc'` object of monitored latent spatial parameters with variable labels in the column names and MCMC iteration numbers in the row names.
`loglik`	vector of data log-likelihood values at each MCMC iteration.
`evals`	vector of slice sampler evaluations at each MCMC iteration.
`call`	the matched function call to `georamps`.
`y`	response vector.
`xmat`	design matrix for the main effects.
`terms`	the `'terms'` object for `xmat`.
`xlevels`	list of the factor levels for `xmat`.
`etype`	grouping factor for the measurement error variances.
`weights`	weights used in the fitting process.
`kmat`	matrix for mapping the spatial parameters to the observed data.
`correlation`	specified `'corRSpatial'` object for the spatial correlation structure.
`coords`	matrix of unique coordinates for the measurement and grid sites.
`ztype`	grouping factor for the spatial variances.
`wmat`	matrix for mapping the random effects to the observed data.
`retype`	grouping factor for the random effects variances.
`control`	a list of control parameters used in the fitting process.

Author(s)

Brian Smith brian-j-smith@uiowa.edu, Jun Yan jun.yan@uconn.edu, and Kate Cowles kate-cowles@uiowa.edu

References

Yan, J., Cowles, M.K., Wang, S., and Armstrong, M. (2007) “Parallelizing MCMC for Bayesian Spatiotemporal Geostatistical Models”, Statistics and Computing, 17(4), 323-335.

Smith, B. J., Yan, J., and Cowles, M. K. (2008) “Unified Geostatistical Modeling for Data Fusion and Spatial Heteroskedasticity with R Package ramps”, Journal of Statistical Software, 25(10), 1-21.

Examples

## Not run: 
## Load the included uranium datasets for use in this example
data(NURE)

## Geostatistical analysis of areal measurements
NURE.ctrl1 <- ramps.control(
   iter = 25,
   beta = param(0, "flat"),
   sigma2.e = param(1, "invgamma", shape = 2.0, scale = 0.1, tuning = 0.75),
   phi = param(10, "uniform", min = 0, max = 35, tuning = 0.50),
   sigma2.z = param(1, "invgamma", shape = 2.0, scale = 0.1)
)

NURE.fit1 <- georamps(log(ppm) ~ 1,
   correlation = corRExp(form = ~ lon + lat, metric = "haversine"),
   weights = area,
   data = NURE,
   subset = (measurement == 1),
   aggregate = list(grid = NURE.grid, blockid = "id"),
   control = NURE.ctrl1
)
print(NURE.fit1)
summary(NURE.fit1)


## Analysis of point-source measurements
NURE.ctrl2 <- ramps.control(
   iter = 25,
   beta = param(0, "flat"),
   sigma2.e = param(1, "invgamma", shape = 2.0, scale = 0.1, tuning = 0.75),
   phi = param(10, "uniform", min = 0, max = 35, tuning = 0.5),
   sigma2.z = param(1, "invgamma", shape = 2.0, scale = 0.1)
)

NURE.fit2 <- georamps(log(ppm) ~ 1,
   correlation = corRExp(form = ~ lon + lat, metric = "haversine"),
   data = NURE,
   subset = (measurement == 2),
   control = NURE.ctrl2
)
print(NURE.fit2)
summary(NURE.fit2)


## Joint analysis of areal and point-source measurements with
## prediction only at grid sites
NURE.ctrl <- ramps.control(
   iter = 25,
   beta = param(rep(0, 2), "flat"),
   sigma2.e = param(rep(1, 2), "invgamma", shape = 2.0, scale = 0.1, tuning = 0.75),
   phi = param(10, "uniform", min = 0, max = 35, tuning = 0.5),
   sigma2.z = param(1, "invgamma", shape = 2.0, scale = 0.1),
   z.monitor = NURE.grid
)

NURE.fit <- georamps(log(ppm) ~ factor(measurement) - 1,
   correlation = corRExp(form = ~ lon + lat, metric = "haversine"),
   variance = list(fixed = ~ measurement),
   weights = area * (measurement == 1) + (measurement == 2),
   data = NURE,
   aggregate = list(grid = NURE.grid, blockid = "id"),
   control = NURE.ctrl
)
print(NURE.fit)
summary(NURE.fit)


## Discard initial 5 MCMC samples as a burn-in sequence
fit <- window(NURE.fit, iter = 6:25)
print(fit)
summary(fit)

## Deviance Information Criterion
DIC(fit)

## Prediction at unmeasured sites
ct <- map("state", "connecticut", plot = FALSE)
lon <- seq(min(ct$x, na.rm = TRUE), max(ct$x, na.rm = TRUE), length = 20)
lat <- seq(min(ct$y, na.rm = TRUE), max(ct$y, na.rm = TRUE), length = 15)
grid <- expand.grid(lon, lat)

newsites <- data.frame(lon = grid[,1], lat = grid[,2],
                       measurement = 1)
pred <- predict(fit, newsites)

plot(pred, func = function(x) exp(mean(x)),
     database = "state", regions = "connecticut",
     resolution = c(200, 150), bw = 5,
     main = "Posterior Mean",
     legend.args = list(text = "ppm", side = 3, line = 1))

plot(pred, func = function(x) exp(sd(x)),
     database = "state", regions = "connecticut",
     resolution = c(200, 150), bw = 5,
     main = "Posterior Standard Deviation",
     legend.args = list(text = "ppm", side = 3, line = 1))

## End(Not run)