ds
R Documentation
Fit detection functions and calculate abundance from line or point transect data
Description
This function fits detection functions to line or point transect data and then (provided that survey information is supplied) calculates abundance and density estimates. The examples below illustrate some basic types of analysis using ds().
Arguments
data
a data.frame containing at least a column called
distance. NOTE! If there is a column called size in
the data then it will be interpreted as group/cluster size; see the
section "Clusters/groups", below. One can supply data as a "flat file"
and not supply region.table, sample.table and
obs.table; see "Data format", below, and flatfile.
truncation
either truncation distance (numeric, e.g. 5) or percentage (as a string, e.g. "15%"). Can be supplied as a list with elements left and right if left truncation is required (e.g. list(left=1,right=20) or list(left="1%",right="15%") or even list(left="1",right="15%")).
By default for exact distances the maximum observed distance is used as the right truncation. When the data is binned, the right truncation is the largest bin end point. Default left truncation is set to zero.
transect
indicates transect type "line" (default) or "point".
formula
formula for the scale parameter. For a CDS analysis leave this as its default ~1.
key
key function to use; "hn" gives half-normal (default), "hr" gives hazard-rate and "unif" gives uniform. Note that if the uniform key is used, covariates cannot be included in the model.
adjustment
adjustment terms to use; "cos" gives cosine (default),
"herm" gives Hermite polynomial and "poly" gives simple polynomial.
"cos" is recommended. A value of NULL indicates that no
adjustments are to be fitted.
order
orders of the adjustment terms to fit (as a vector/scalar). The
default value (NULL) will select via AIC up to order 5. If a single number is given, that number is expanded to be seq(term.min, order, by=1), where term.min is the appropriate minimum order for this type of adjustment. For cosine
adjustments, valid orders are integers greater than or equal to 2 (except when a
uniform key is used, when the minimum order is 1). For Hermite
polynomials, even integers equal to or greater than 2 are allowed, and for
simple polynomials even integers equal to or greater than 2 are allowed (though note these will be multiplied by 2; see Buckland et al., 2001, for details on their specification). By default, AIC selection will try up to 5 adjustments; beyond that you must specify these manually, e.g. order=2:6, and perform your own AIC selection.
scale
the scale by which the distances in the adjustment terms are
divided. Defaults to "width", scaling by the truncation
distance. If the key is uniform only "width" will be used. The other
option is "scale": the scale parameter of the detection function.
cutpoints
if the data are binned, this vector gives the cutpoints of
the bins. Ensure that the first element is 0 (or the left truncation
distance) and the last is the distance to the end of the furthest bin.
(Default NULL, no binning.)
Note that if data has columns distbegin and
distend then these will be used as bins if cutpoints
is not specified. If both are specified, cutpoints has
precedence.
dht.group
should density/abundance estimates treat every group as
size 1 (dht.group=TRUE, giving abundance of groups) or take
group size into account (dht.group=FALSE, giving abundance of
individuals)? Default is FALSE (abundance of
individuals is calculated).
monotonicity
should the detection function be constrained for monotonicity weakly ("weak"), strictly ("strict") or not at all ("none" or FALSE)? See "Monotonicity", below. The default is "strict" for models without covariates in the detection function; constraints are off when covariates are present.
region.table
data.frame with two columns:
Region.Label
label for the region
Area
area of the region
region.table has one row for each stratum. If there is no
stratification then region.table has one entry with Area
corresponding to the total survey area.
sample.table
data.frame mapping the regions to the samples
(i.e. transects). There are three columns:
Sample.Label
label for the sample
Region.Label
label for the region that the
sample belongs to.
Effort
the effort expended in that sample
(e.g. transect length).
obs.table
data.frame mapping the individual observations
(objects) to regions and samples. There should be three columns:
object
Region.Label
label for the region that the
sample belongs to.
Sample.Label
label for the sample
convert.units
conversion between units for abundance estimation,
see "Units", below. (Defaults to 1, implying all of the units are
"correct" already.)
method
optimization method to use (any method usable by
optim or optimx). Defaults to
"nlminb".
quiet
suppress non-essential messages (useful for bootstraps etc).
Default value FALSE.
debug.level
print debugging output. 0=none, 1-3 increasing level of
debugging output.
initial.values
a list of named starting values, see
mrds-opt. Only allowed when AIC term selection is not used.
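As a rough illustration of the order argument described above, expanding a single cosine order into a sequence of orders might look like the following base R sketch. The helper expand_cos_order is hypothetical (it is not part of Distance), and the minimum order of 2 for non-uniform keys is an assumption based on the order=2 cosine fits in the examples.

```r
# Hypothetical helper, not part of the Distance package: expand a single
# cosine adjustment order into the sequence of orders to fit.
expand_cos_order <- function(order, key = "hn") {
  # Assumed minimum order: 1 with a uniform key, 2 otherwise.
  term.min <- if (key == "unif") 1 else 2
  seq(term.min, order, by = 1)
}

orders_hn   <- expand_cos_order(3)           # orders 2 and 3
orders_unif <- expand_cos_order(3, "unif")   # orders 1, 2 and 3
print(orders_hn)
print(orders_unif)
```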
Value
a list with elements:
ddf
a detection function model object.
dht
abundance/density information (if survey
region data was supplied, else NULL).
Details
If abundance estimates are required then the data.frames region.table and sample.table must be supplied. If data does not contain the columns Region.Label and Sample.Label then the data.frame obs.table must also be supplied. Note that stratification only applies to abundance estimates and not at the detection function level.
Clusters/groups
Note that if the data contains a column named size and region.table, sample.table and obs.table are supplied, cluster size will be estimated and density/abundance will be based on a clustered analysis of the data. Setting this column to be NULL will perform a non-clustered analysis (for example if "size" means something else in your dataset).
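For instance, if a column called size holds something other than group size, it can be moved aside before fitting. This is a minimal base R sketch; the data values and the plot.size column name are made up for illustration.

```r
# Made-up data in which "size" is NOT a cluster size.
dat <- data.frame(distance = c(0.5, 1.2, 2.8),
                  size     = c(10, 12, 9))

# Keep the information under another (hypothetical) name ...
dat$plot.size <- dat$size
# ... and drop the column that would be interpreted as cluster size.
dat$size <- NULL

print(names(dat))
```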
Truncation
The right truncation point is by default set to be the largest observed distance or bin end point. This default will not be appropriate for all data and can often be the cause of model convergence failures. It is recommended that one plots a histogram of the observed distances prior to model fitting so as to get a feel for an appropriate truncation distance. (Similar arguments go for left truncation, if appropriate.) Buckland et al. (2001) provide guidelines on truncation.
When specified as a percentage, the largest right and smallest left percent distances are discarded. Percentages cannot be supplied when using binned data.
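To get a feel for what a percentage truncation does, the base R sketch below turns a right truncation of 15% into a distance. The distances are made up, and reading the percentage as a quantile of the observed distances is an assumption; the package may compute its cut-off differently.

```r
# Made-up exact distances, for illustration only.
distance <- seq(0.1, 10, by = 0.1)

# Assumption: a right truncation of "15%" is read as the distance below
# which 85% of the observations fall.
right <- unname(quantile(distance, probs = 0.85))

# Observations beyond the cut-off are discarded before fitting.
kept <- distance[distance <= right]

print(right)
print(length(kept))
```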
Binning
Note that binning is performed such that bin 1 is all distances greater than or equal to cutpoint 1 (>=0 or left truncation distance) and less than cutpoint 2. Bin 2 is then distances greater than or equal to cutpoint 2 and less than cutpoint 3, and so on.
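The binning rule can be reproduced in base R as a sketch. The cutpoints and distances are made up; distbegin and distend are the column names mentioned under cutpoints.

```r
# Made-up cutpoints and exact distances, for illustration only.
cutpoints <- c(0, 1, 2, 3, 4)
distance  <- c(0, 0.4, 1, 1.7, 2, 3.99)

# findInterval() uses left-closed intervals [cutpoint i, cutpoint i+1),
# so a distance equal to a cutpoint falls in the bin that starts there.
bin <- findInterval(distance, cutpoints)

# Record the bin limits for each observation.
binned <- data.frame(distance,
                     distbegin = cutpoints[bin],
                     distend   = cutpoints[bin + 1])
print(binned)
```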
Monotonicity
When adjustment terms are used, it is possible for the detection function to not always decrease with increasing distance. This is unrealistic and can lead to bias. To avoid this, the detection function can be constrained for monotonicity (and is by default for detection functions without covariates).
Monotonicity constraints are supported in a similar way to that described in Buckland et al. (2001). 20 equally spaced points over the range of the detection function (left to right truncation) are evaluated at each round of the optimisation and the function is constrained to be either always less than its value at zero ("weak") or such that each value is less than or equal to the previous point (monotonically decreasing; "strict"). See also check.mono in mrds.
Even with no monotonicity constraints, checks are still made that the detection function is monotonic, see check.mono.
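The weak and strict conditions can be illustrated with a small base R sketch. It is not the constraint code used by the optimiser, nor mrds's check.mono: make_hn_cos and is_monotone are hypothetical helpers, the half-normal key with cosine adjustments is written in the standard form from Buckland et al. (2001), and the parameter values are made up.

```r
# Half-normal key with cosine adjustments, scaled so that g(0) = 1.
# sigma: scale parameter; w: right truncation distance;
# a: coefficients for the cosine terms of the given orders.
make_hn_cos <- function(sigma, w, orders = NULL, a = NULL) {
  function(x) {
    key <- exp(-x^2 / (2 * sigma^2))
    series <- rep(1, length(x))
    series0 <- 1
    for (j in seq_along(orders)) {
      series <- series + a[j] * cos(orders[j] * pi * x / w)
      series0 <- series0 + a[j]   # value of the series at x = 0
    }
    key * series / series0
  }
}

# Hypothetical checker: evaluate g at n.pts equally spaced points
# between the left and right truncation distances.
is_monotone <- function(g, left, right, n.pts = 20, strict = TRUE) {
  x <- seq(left, right, length.out = n.pts)
  gx <- g(x)
  tol <- 1e-8
  if (strict) {
    all(diff(gx) <= tol)      # each value <= the previous one
  } else {
    all(gx <= g(0) + tol)     # never above the value at zero
  }
}

g_key <- make_hn_cos(sigma = 1.5, w = 4)                        # key only
g_adj <- make_hn_cos(sigma = 10, w = 4, orders = 2, a = 0.5)    # key + cos(2)

print(is_monotone(g_key, 0, 4, strict = TRUE))
print(is_monotone(g_adj, 0, 4, strict = FALSE))
print(is_monotone(g_adj, 0, 4, strict = TRUE))
```

In this made-up case the adjusted function never rises above its value at zero, so it passes the weak check, but it increases again towards the truncation distance and so fails the strict one.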
Units
In extrapolating to the entire survey region it is important that
the unit measurements be consistent or converted for consistency.
A conversion factor can be specified with the convert.units
variable. The values of Area in region.table, must be made
consistent with the units for Effort in sample.table and the
units of distance in the data.frame that was analyzed. It is
easiest if the units of Area are the square of the units of
Effort and then it is only necessary to convert the units of
distance to the units of Effort. For example, if Effort
was entered in kilometers, Area in square kilometers and
distance in meters, then using convert.units=0.001 would
convert meters to kilometers; density would then be expressed per square
kilometer, consistent with the units for Area.
However, they can all be in different units as long as the appropriate
composite value for convert.units is chosen. Abundance for a survey
region can be expressed as: A*N/a where A is Area for
the survey region, N is the abundance in the covered (sampled)
region, and a is the area of the sampled region and is in units of
Effort * distance. The sampled region a is multiplied by
convert.units, so it should be chosen such that the result is in
the same units as Area. For example, if Effort was entered
in kilometers, Area in hectares (100m x 100m) and distance
in meters, then using convert.units=0.1 will convert a to
units of hectares (1 km x 1 m = 1000 square meters = 0.1 hectares: a factor of 0.01 to
express meters in 100m units for distance and 10 to express km in 100m units).
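The A*N/a calculation can be checked by hand with a short base R sketch. The helper convert_factor is hypothetical, the survey values are made up, and a line transect with covered area 2 x truncation distance x line length is assumed.

```r
# Hypothetical helper: express each unit in metres (square metres for
# areas), then form the factor that turns (distance unit x Effort unit)
# into the Area unit.
convert_factor <- function(distance_m, effort_m, area_m2) {
  distance_m * effort_m / area_m2
}

# distance in metres, Effort in kilometres, Area in square kilometres
cu <- convert_factor(distance_m = 1, effort_m = 1000, area_m2 = 1e6)

# Made-up survey values, for illustration only.
A     <- 10   # Area of the survey region (square kilometres)
L     <- 5    # total line length, i.e. Effort (kilometres)
w     <- 4    # truncation distance (metres)
N_cov <- 20   # abundance in the covered region

# Assumption: line transects, so the covered area is 2 * w * L.
a <- 2 * w * L * cu          # covered area in square kilometres
N_region <- A * N_cov / a    # abundance for the survey region

print(cu)
print(N_region)
```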
Data format
One can supply data alone to simply fit a detection function. However, if abundance/density estimates are necessary, further information is required. Either the region.table, sample.table and obs.table data.frames can be supplied or all data can be supplied as a "flat file" in the data argument. In this format each row in data has additional information that would ordinarily be in the other tables. This usually means that there are additional columns named: Sample.Label, Region.Label, Effort and Area for each observation. See flatfile for an example.
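As a sketch of how the two formats relate, separate tables can be joined into a flat data.frame with base R's merge(). All values below are made up; only the column names come from the descriptions above.

```r
# Made-up tables using the column names described above.
dat     <- data.frame(object = 1:3, distance = c(0.5, 1.2, 2.8))
obs     <- data.frame(object = 1:3,
                      Region.Label = "A",
                      Sample.Label = c("t1", "t1", "t2"))
samples <- data.frame(Sample.Label = c("t1", "t2"),
                      Region.Label = "A",
                      Effort = c(10, 12))
region  <- data.frame(Region.Label = "A", Area = 100)

# Join the tables so each observation carries its survey information.
flat <- merge(dat, obs, by = "object")
flat <- merge(flat, samples, by = c("Sample.Label", "Region.Label"))
flat <- merge(flat, region, by = "Region.Label")

print(flat)
```

These are inner joins, so only transects with at least one observation appear in the result.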
Author(s)
David L. Miller
References
Buckland, S.T., Anderson, D.R., Burnham, K.P., Laake, J.L., Borchers, D.L., and Thomas, L. (2001). Distance Sampling. Oxford University Press. Oxford, UK.
Buckland, S.T., Anderson, D.R., Burnham, K.P., Laake, J.L., Borchers, D.L., and Thomas, L. (2004). Advanced Distance Sampling. Oxford University Press. Oxford, UK.
See Also
flatfile
Examples
# An example from mrds, the golf tee data.
library(Distance)
data(book.tee.data)
tee.data<-book.tee.data$book.tee.dataframe[book.tee.data$book.tee.dataframe$observer==1,]
ds.model <- ds(tee.data,4)
summary(ds.model)
plot(ds.model)
## Not run:
# same model, but calculating abundance
# need to supply the region, sample and observation tables
region <- book.tee.data$book.tee.region
samples <- book.tee.data$book.tee.samples
obs <- book.tee.data$book.tee.obs
ds.dht.model <- ds(tee.data,4,region.table=region,
                   sample.table=samples,obs.table=obs)
summary(ds.dht.model)
# specify order 2 cosine adjustments
ds.model.cos2 <- ds(tee.data,4,adjustment="cos",order=2)
summary(ds.model.cos2)
# specify order 2 and 3 cosine adjustments, turning monotonicity
# constraints off
ds.model.cos23 <- ds(tee.data,4,adjustment="cos",order=c(2,3),
                     monotonicity=FALSE)
# check for non-monotonicity -- actually no problems
check.mono(ds.model.cos23$ddf,plot=TRUE,n.pts=100)
# include both a covariate and adjustment terms in the model
ds.model.cos2.sex <- ds(tee.data,4,adjustment="cos",order=2,
                        monotonicity=FALSE, formula=~as.factor(sex))
# check for non-monotonicity -- actually no problems
check.mono(ds.model.cos2.sex$ddf,plot=TRUE,n.pts=100)
# truncate the largest 10% of the data and fit only a hazard-rate
# detection function
ds.model.hr.trunc <- ds(tee.data,truncation="10%",key="hr",adjustment=NULL)
summary(ds.model.hr.trunc)
## End(Not run)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)
> library(Distance)
Loading required package: mrds
This is mrds 2.1.14
Built: R 3.3.1; ; 2016-07-02 00:29:24 UTC; unix
> ### Name: ds
> ### Title: Fit detection functions and calculate abundance from line or
> ### point transect data
> ### Aliases: ds
>
> ### ** Examples
>
> # An example from mrds, the golf tee data.
> library(Distance)
> data(book.tee.data)
> tee.data<-book.tee.data$book.tee.dataframe[book.tee.data$book.tee.dataframe$observer==1,]
> ds.model <- ds(tee.data,4)
Starting AIC adjustment term selection.
Fitting half-normal key function
Key only models do not require monotonicity contraints. Not constraining model for monotonicity.
AIC= 311.138
Fitting half-normal key function with cosine(2) adjustments
AIC= 313.124
half-normal key function selected!
No survey area information supplied, only estimating detection function.
> summary(ds.model)
Summary for distance analysis
Number of observations : 124
Distance range : 0 - 4
Model : Half-normal key function
AIC : 311.1385
Detection function parameters
Scale Coefficients:
             estimate         se
(Intercept) 0.6632435 0.09981249

                       Estimate          SE         CV
Average p             0.5842744  0.04637627 0.07937413
N in covered region 212.2290462 20.85130509 0.09824906
> plot(ds.model)