Last data update: 2014.03.03

R: Fit a Generalized Dissimilarity Model to Tabular Site-Pair...
gdmR Documentation

Fit a Generalized Dissimilarity Model to Tabular Site-Pair Data

Description

For an overview of the functions in the gdm package have a look here: gdm-package.

gdm is used to fit a generalized dissimilarity model to tabular site-pair data formatted as follows using the formatsitepair function: Response, Weights, X0, Y0, X1, Y1, Pred1SiteA, Pred2SiteA, ..., PredNSiteA, Pred1SiteB, Pred2SiteB, ..., PredNSiteB. The first column (Response) must be any ratio based dissimilarity (distance) measure between Site A and Site B. The second column defines any weighting to be applied during fitting of the model. If equal weighting is required, then all entries in this column should be set to 1.0 (default). The third and fourth columns, X0 and Y0, represent the spatial coordinates of the first site in the site pair (Site A). The fifth and sixth columns, X1 and Y1, represent the coordinates of the second site (Site B). Note that the first six columns are REQUIRED, even if you do not intend to use geographic distance as a predictor (in which case these columns can be loaded with dummy data if the actual coordinates are unknown). The next N*2 columns contain values for N predictors for Site A, followed by values for the same N predictors for Site B.

The following is an example of a GDM input table header:

Response, Weights, X0, Y0, X1, Y1, S1_Temp, S1_Rain, S1_Bedrock, S2_Temp, S2_Rain, S2_Bedrock

Usage

gdm(data, geo=FALSE, splines=NULL, knots=NULL)

Arguments

data

A data frame containing the site pairs to be used to fit the GDM (typically obtained using the formatsitepair function). The observed response data must be located in the first column. The weights to be applied to each site pair must be located in the second column. If geo is TRUE, then the X0, Y0 and Y0, Y1 columns will be used to calculate the geographic distance between site pairs for inclusion as the geographic predictor term in the model. If geo is FALSE (default), then the X0, Y0, X1 and Y1 data columns must still be included, but are ignored in fitting the model. Columns containing the predictor data for Site A, and the predictor data for Site B, follow.

geo

Set to TRUE if geographic distance between sites is to be included as a model term. Set to FALSE if geographic distance is to be omitted from the model. Default is FALSE.

splines

An optional vector of the number of I-spline basis functions to be used for each predictor in fitting the model. If supplied, it must have the same length as the number of predictors (including geographic distance if geo is TRUE). If this vector is not provided (splines=NULL), then a default of 3 basis functions is used for all predictors.

knots

An optional vector of knots in units of the predictor variables to be used in the fitting process. If knots are supplied and splines=NULL, then the knots argument must have the same length as the number of predictors * 3. If both knots and the number of splines are supplied, then the length of the knots argument must be the same as the sum of the values in the splines vector. Note that the default values for knots when the default three I-spline basis functions are 0 (minimum), 50 (median), and 100 (maximum) quantiles.

Value

gdm returns a gdm model object. The function summary.gdm can be used to obtain or print a synopsis of the results. A gdm model object is a list containing at least the following components:

dataname

The name of the table used as the data argument to the model.

geo

Whether geographic distance was used as a predictor in the model.

gdmdeviance

The deviance of the fitted GDM model.

nulldeviance

The deviance of the null model.

explained

The percentage of null deviance explained by the fitted GDM model.

intercept

The fitted value for the intercept term in the model.

predictors

A list of the names of the predictors that were used to fit the model.

coefficients

A list of the coefficients for each spline for each of the predictors considered in model fitting.

knots

A vector of the knots derived from the x data (or user defined), for each predictor.

splines

A vector of the number of I-spline basis functions used for each predictor.

creationdate

The date and time of model creation.

observed

The observed response for each site pair (from data column 1).

predicted

The predicted response for each site pair, from the fitted model (after applying the link function).

ecological

The linear predictor (ecological distance) for each site pair, from the fitted model (before applying the link function).

References

Ferrier S, Manion G, Elith J, Richardson, K (2007) Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Diversity & Distributions 13, 252-264.

See Also

formatsitepair, summary.gdm, plot.gdm, predict.gdm, gdm.transform

Examples

##fit table environmental data
##sets up site-pair table, environmental tabular data
load(system.file("./data/gdm.RData", package="gdm"))
sppData <- gdmExpData[c(1,2,13,14)]
envTab <- gdmExpData[c(2:ncol(gdmExpData))]
sitePairTab <- formatsitepair(sppData, 2, XColumn="Long", YColumn="Lat", sppColumn="species", 
	siteColumn="site", predData=envTab)

##fit table GDM
gdmTabMod <- gdm(sitePairTab, geo=TRUE)
summary(gdmTabMod)

##fit raster environmental data
##sets up site-pair table
rastFile <- system.file("./extdata/stackedVars.grd", package="gdm")
envRast <- stack(rastFile)

##environmental raster data
sitePairRast <- formatsitepair(sppData, 2, XColumn="Long", YColumn="Lat", sppColumn="species",
	siteColumn="site", predData=envRast)
##sometimes raster data returns NA in the site-pair table, these rows will have to be removed 
##before fitting gdm
sitePairRast <- na.omit(sitePairRast)

##fit raster GDM
gdmRastMod <- gdm(sitePairRast, geo=TRUE)
summary(gdmRastMod)

Results