Last data update: 2014.03.03

R: Estimation of limited dependent variable models
mhurdle    R Documentation

Estimation of limited dependent variable models

Description

mhurdle fits a large set of models relevant when the dependent variable is 0 for a part of the sample.

Usage

mhurdle(formula, data, subset, weights, na.action,
     start = NULL,
     dist = c("ln","tn","n", "bc", "ihs"),
     corr = NULL, ...)
## S3 method for class 'mhurdle'
coef(object,
   which = c("all", "h1", "h2", "h3", "sd", "corr", "tr"), ...)
## S3 method for class 'mhurdle'
vcov(object,
   which = c("all", "h1", "h2", "h3", "sd", "corr", "tr"), ...)
## S3 method for class 'mhurdle'
logLik(object, naive = FALSE, ...)
## S3 method for class 'mhurdle'
print(x, digits = max(3, getOption("digits") - 2),
                     width = getOption("width"), ...)
## S3 method for class 'mhurdle'
summary(object, ...)
## S3 method for class 'summary.mhurdle'
print(x, digits = max(3, getOption("digits") - 2),
   width = getOption("width"), ...)

## S3 method for class 'mhurdle'
fitted(object,
   which = c("all", "zero", "positive"), ...)
## S3 method for class 'mhurdle'
predict(object, newdata = NULL, ...)
## S3 method for class 'mhurdle'
update(object, new, ...)

Arguments

formula

a symbolic description of the model to be fitted,

data

a data.frame,

newdata

a data.frame for which the predictions should be computed,

subset

see lm,

weights

see lm,

na.action

see lm,

start

starting values,

dist

the distribution of the error of the consumption equation: one of "ln" (log-normal), "tn" (truncated normal), "n" (normal), "bc" (Box-Cox) or "ihs" (inverse hyperbolic sine),

corr

indicates whether the errors of the different equations are correlated. For models with two equations, this can be either "d" for dependent or "i" for independent. For models with three equations, this should be a character of length three containing values of "i" and "d",

naive

a boolean; if TRUE, the likelihood of the naive model is returned,

object,x

an object of class "mhurdle",

new

an updated formula for the update method,

digits

see print,

width

see print,

which

which coefficients or covariances should be extracted: those of the selection ("h1"), consumption ("h2") or purchase ("h3") equation, the standard deviation of the error ("sd"), the coefficient of correlation ("corr"), the transformation parameter ("tr"), or all of them ("all"). For fitted, which values should be returned: the probability of zero ("zero"), the mean of the positive observations ("positive") or both ("all"),

...

further arguments.
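The formula argument can carry several right-hand-side parts separated by |, as the Details section explains. The following base-R sketch shows how such a formula decomposes into its parts. split_rhs is a hypothetical helper written for illustration only; it is not a function of mhurdle, which stores its formula as an object of class Formula.

```r
# Hypothetical helper (not part of mhurdle): recursively split the
# right-hand side of a formula on the `|` operator.
split_rhs <- function(expr) {
  if (is.call(expr) && identical(expr[[1]], as.name("|"))) {
    # `|` is left-associative: (part1 | part2) | part3
    c(split_rhs(expr[[2]]), list(expr[[3]]))
  } else {
    list(expr)
  }
}

# A three-part formula with no selection process (first part is 0)
f <- y ~ 0 | x1 + x2 | z1 + z2

# f[[3]] is the right-hand side of the formula
parts <- split_rhs(f[[3]])
part_labels <- vapply(parts,
                      function(p) paste(deparse(p), collapse = " "),
                      character(1))
names(part_labels) <- c("h1", "h2", "h3")
print(part_labels)
```

The names h1, h2 and h3 follow the labels used by the which argument for the selection, consumption and purchase equations.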

Details

mhurdle fits models for which the dependent variable is zero for a part of the sample. Zero values of the dependent variable may occur because of one or several mechanisms: good rejection, lack of resources and purchase infrequency. The model is described using a three-part formula: the first part describes the selection process, if any, the second part the regression equation and the third part the purchase infrequency process. y ~ 0 | x1 + x2 | z1 + z2 means that there is no selection process. y ~ w1 + w2 | x1 + x2 | 0 and y ~ w1 + w2 | x1 + x2 describe the same model, with no purchase infrequency process. The second part is mandatory; it explains the positive values of the dependent variable. The dist argument indicates the distribution of the error term. If dist = "n", the error term is normal and (at least part of) the zero observations are also explained by the second part, as the result of a corner solution. Several models described in the literature are obtained as special cases:

A model with a formula like y~0|x1+x2 and dist="n" is the Tobit model proposed by Tobin (1958).

y~w1+w2|x1+x2 with dist="ln" or dist="tn" is the single hurdle model proposed by Cragg (1971). With dist="n", the double hurdle model also proposed by Cragg (1971) is obtained. With corr="h1" we get the correlated version of this model described by Blundell and Meghir (1987).

y~0|x1+x2|z1+z2 is the P-Tobit model of Deaton and Irish (1984), which can be a single hurdle model if dist="tn" or dist="ln" or a double hurdle model if dist="n".
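To make the simplest of these special cases concrete, here is a minimal base-R sketch of the Tobit log-likelihood, maximised with optim on simulated data. It is an illustration of the corner-solution mechanism only, not the estimation code used by mhurdle; the data-generating values and the function name tobit_loglik are assumptions made for the example.

```r
set.seed(1)

# Simulated data (assumed values): latent y* = 1 + 2 x + e, e ~ N(0, 1),
# observed y = max(y*, 0), so part of the sample is exactly zero.
n <- 500
x <- rnorm(n)
ystar <- 1 + 2 * x + rnorm(n)
y <- pmax(ystar, 0)

# Tobit log-likelihood: zeros contribute P(y* <= 0), positive values
# contribute the normal density. sigma is parametrised on the log scale
# to keep it positive during optimisation.
tobit_loglik <- function(par, y, x) {
  mu <- par[1] + par[2] * x
  sigma <- exp(par[3])
  ll <- ifelse(y > 0,
               dnorm(y, mean = mu, sd = sigma, log = TRUE),
               pnorm(0, mean = mu, sd = sigma, log.p = TRUE))
  sum(ll)
}

# optim minimises, so pass the negative log-likelihood
fit <- optim(c(0, 0, 0), function(p) -tobit_loglik(p, y, x),
             method = "BFGS")
est <- c(intercept = fit$par[1], slope = fit$par[2],
         sigma = exp(fit$par[3]))
print(est)
```

The hurdle models add further probability terms for the selection and purchase infrequency processes to this likelihood.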

Value

an object of class c("mhurdle", "maxLik").

A "mhurdle" object has the following elements:

coefficients

the vector of coefficients,

vcov

the covariance matrix of the coefficients,

fitted.values

a matrix of fitted values, the first column being the probability of 0 and the second one the mean values for the positive observations,

logLik

the log-likelihood,

gradient

the gradient at convergence,

model

a data.frame containing the variables used for the estimation,

coef.names

a list containing the names of the coefficients in the selection equation, the regression equation, the infrequency of purchase equation and the other coefficients (the standard deviation of the error term and, if the errors are correlated, the coefficient of correlation),

formula

the model formula, an object of class Formula,

call

the call,

rho

the Lagrange multiplier test of no correlation.
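The methods listed under Usage are the usual way to reach these elements. The sketch below assumes that the mhurdle and survival packages are installed and that they behave as documented on this page; other package versions may differ. It reuses the independent double hurdle model from the Examples section.

```r
library("mhurdle")
data("tobin", package = "survival")

# independent double hurdle model, as in the Examples section
fit <- mhurdle(durable ~ age | quant | 0, tobin, dist = "n")

# coefficients and covariance matrix of the selection equation only
b_h1 <- coef(fit, which = "h1")
v_h1 <- vcov(fit, which = "h1")

# standard errors from the diagonal of the covariance matrix
se_h1 <- sqrt(diag(v_h1))

# log-likelihood of the fitted model and of the naive model
ll <- logLik(fit)
ll_naive <- logLik(fit, naive = TRUE)

print(cbind(estimate = b_h1, std.error = se_h1))
```

Restricting which to one equation keeps the output small when only one part of the model is of interest.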

References

Blundell R, Meghir C (1987). Bivariate Alternatives to the Tobit Model. Journal of Econometrics, 34, 179-200.

Cragg JG (1971). Some Statistical Models for Limited Dependent Variables with Applications for the Demand for Durable Goods. Econometrica, 39(5), 829-844.

Deaton A, Irish M (1984). A Statistical Model for Zero Expenditures in Household Budgets. Journal of Public Economics, 23, 59-80.

Tobin J (1958). Estimation of Relationships for Limited Dependent Variables. Econometrica, 26(1), 24-36.

Examples

library("mhurdle")
data("tobin", package = "survival")
# Tobit model
model010 <- mhurdle(durable ~ 0 | age + quant | 0, tobin, dist = "n")
# independent double hurdle model
model110i <- mhurdle(durable ~ age | quant | 0, tobin, dist = "n")
# Cragg log-normal single hurdle model
model100il <- mhurdle(durable ~ age | quant | 0, tobin, dist = "ln")
# Cragg truncated-normal single hurdle model
model100it <- update(model100il, dist = "tn")
# a double hurdle P-Tobit model
model011i <- mhurdle(durable ~ 0 | quant | age, tobin, dist = "n")
