Fits linear models with hyperbolic errors. Can be used to carry out
linear regression for data exhibiting heavy tails and skewness.
Displays the histogram, log-histogram (both with fitted error
distribution), Q-Q plot and residuals vs. fitted values plot for the
fitted linear model.
Usage
hyperblm(formula, data, subset, weights, na.action,
xx = FALSE, y = FALSE, contrasts = NULL,
offset, method = "Nelder-Mead",
startMethod = "Nelder-Mead", startStarts = "BN",
paramStart = NULL,
maxiter = 100, tolerance = 0.0001,
controlBFGS = list(maxit = 1000),
controlNM = list(maxit = 10000),
maxitNLM = 10000,
controlCO = list(), silent = TRUE, ...)
## S3 method for class 'hyperblm'
print(x, digits = max(3, getOption("digits")-3), ...)
## S3 method for class 'hyperblm'
coef(object, ...)
## S3 method for class 'hyperblm'
plot(x, breaks = "FD",
plotTitles = c("Residuals vs Fitted Values",
"Histogram of residuals",
"Log-Histogram of residuals",
"Q-Q Plot"),
...)
Arguments
formula
an object of class "formula" (or one that
can be coerced to that class): a symbolic description of the
model to be fitted. The details of model specification are given
under ‘Details’.
data
an optional data frame, list or environment (or object
coercible by as.data.frame to a data frame) containing
the variables in the model. If not found in data, the
variables are taken from environment(formula),
typically the environment from which lm is called.
subset
an optional vector specifying a subset of observations
to be used in the fitting process.
weights
an optional vector of weights to be used in the fitting
process. Should be NULL or a numeric vector.
If non-NULL, weighted least squares is used with weights
weights (that is, minimizing sum(w*e^2)); otherwise
ordinary least squares is used. See also ‘Details’,
na.action
A function which indicates what should happen
when the data contain NAs. The default is set by
the na.action setting of options, and is
na.fail if that is unset. The ‘factory-fresh’
default is na.omit. Another possible value is
NULL, no action. Value na.exclude can be useful.
xx, y
Logicals. If TRUE, the corresponding components of the
fit (the explanatory matrix and the response vector) are returned.
contrasts
An optional list. See the contrasts.arg
of model.matrix.default.
offset
An optional vector. See Details.
method
Character. Possible values are "BFGS",
"Nelder-Mead" and "nlm". See Details.
startMethod
Character. Possible values are "BFGS" and
"Nelder-Mead". See Details.
startStarts
Character. Possible values are "BN",
"FN", "SL", "US" and "MoM". See
Details.
paramStart
An optional vector. A vector of parameter
start values for the optimization routine. See Details.
maxiter
Numeric. The maximum number of two-stage optimization
alternating iterations. See Details.
tolerance
Numeric. The two-stage optimization convergence
ratio. See Details.
controlBFGS, controlNM
Lists. Lists of control parameters for
optim when using corresponding (BFGS, Nelder-Mead) optimisation
method in first stage. See optim.
maxitNLM
Numeric. The maximum number of iterations for the NLM
optimizer.
controlCO
List. A list of control parameters for
constrOptim in second stage.
silent
Logical. If TRUE, the error messgae of optimizer will
not be displayed.
x
An object of class "hyperblm".
object
An object of class "hyperblm".
breaks
May be a vector, a single number or a character
string. See hist.
plotTitles
Titles to appear above the plots.
digits
Numeric. Desired number of digits when the object is
printed.
...
Passes additional arguments to function
hyperbFitStand, optim and
constrOptim.
Details
Models for hyperblm are specified symbolically. A typical
model has the form response ~ terms where response is
the (numeric) response vector and terms is a series of terms
which specifies a linear predictor for response. A terms
specification of the form first + second indicates all the
terms in first together with all the terms in second
with duplicates removed. A specification of the form
first:second indicates the set of terms obtained by taking the
interactions of all terms in first with all terms in
second. The specification first*second indicates the
cross of first and second. This is the same as
first + second + first:second.
If the formula includes an offset, this is evaluated and
subtracted from the response.
If response is a matrix a linear model is fitted separately by
least-squares to each column of the matrix.
See model.matrix for some further details. The terms in
the formula will be re-ordered so that main effects come first,
followed by the interactions, all second-order, all third-order and so
on.
A formula has an implied intercept term. To remove this use either
y ~ x - 1 or y ~ 0 + x. See formula for
more details of allowed formulae.
Non-NULLweights can be used to indicate that different
observations have different variances (with the values in
weights being inversely proportional to the variances); or
equivalently, when the elements of weights are positive
integers w_i, that each response y_i is the mean of
w_i unit-weight observations (including the case that there are
w_i observations equal to y_i and the data have been
summarized).
hyperblm calls the lower level function
hyperblmFit for the actual numerical computations.
All of weights, subset and offset are evaluated
in the same way as variables in formula, that is first in
data and then in the environment of formula.
hyperblmFit uses a two-stage alternating optimization
routine. The quality of parameter start values (especially the error
distribution parameters) is crucial to the routine's convergence. The
user can specify the start values via the paramStart argument,
otherwise the function finds reliable start values by calling the
hyperbFitStand function.
startMethod in the argument list is the optimization method for
function hyperbFitStandStart which finds the start
values for function hyperbFitStand. It is set to
"Nelder-Mead" by default due to the robustness of this
optimizer. The "BFGS" method is also implemented as it is
relatively fast to converge. Since "BFGS" method is a
quasi-Newton method it will not as robust and for some data will not
achieve convergence.
startStarts is the method used to find the start values for function
hyperbFitStandStart which includes:
"BN"A method from Barndorff-Nielsen (1977) based on
estimates of psi and gamma the absolute
slopes of the left and right asymptotes to the log density function
"FN"Based on a fitted normal distribution as it is a
limit of the hyperbolic distribution
"SL"Based on a fitted skew-Laplace distribution for
which the log density has the form of two straight line with
absolute slopes 1/alpha, 1/beta
"MoM"A method of moment approach
"US"User specified
method is the method used in stage one of the two-stage
alternating optimization routine. As the startMethod, it is set
to "Nelder-Mead" by default. Besides "BFGS","nlm"
is also implemented as a alternative. Since BFGS method is a
quasi-Newton method it will not as robust and for some data will not
achieve convergence.
If the maximum of the ratio the change of the individual coefficients
is smaller than tolerance then the routine assumes convergence,
otherwise if the alternating iteration number exceeds maxiter
with the maximum of the ratio the change of the individual
coefficients larger than tolerance, the routine is considered
not to have converged.
Value
hyperblm returns an object of class "hyperblm" which is a list
containing:
coefficients
A named vector of regression coefficients.
distributionParams
A named vector of fitted hyperbolic error
distribution parameters.
fitted.values
The fitted values from the model.
residuals
The remainder after subtracting fitted values from
response.
mle
The maximum likelihood value of the model.
method
The optimization method for stage one.
paramStart
The start values of parameters that the user
specified (only where relevant).
residsParamStart
The start values of parameters obtained by
hyperbFitStand (only where relevant).
call
The matched call.
terms
The terms object used.
contrasts
The contrasts used (only where relevant).
xlevels
The levels of the factors used in the fitting (only where
relevant).
offset
The offset used (only where relevant)
xNames
The names of each explanatory variables. If explanatory
variables don't have names then they will be named x.
yVec
The response vector.
xMatrix
The explanatory variables matrix.
iterations
Number of two-stage alternating iterations to
convergence.
convergence
The convergence code for two stage optimization: 0
is the system converged, 1 is first stage does not converge, 2 is
second stage does not converge, 3 is the both stages do not
converge.
Barndorff-Nielsen, O. (1977)
Exponentially decreasing distributions for the logarithm of particle size,
Proc. Roy. Soc. Lond.,
A353, 401–419.
Prause, K. (1999). The generalized hyperbolic models:
Estimation, financial derivatives and risk measurement.
PhD Thesis, Mathematics Faculty, University of Freiburg.
Trendall, Richard (2005). hypReg: A Function for Fitting a Linear
Regression Model in R with Hyperbolic Error.
Masters Thesis, Statistics Faculty, University of Auckland.
Paolella, Marc S. (2007). Intermediate Probability: A Computational
Approach. pp. 415 -Chichester: Wiley.
Scott, David J. and W<c3><bc>rtz, Diethelm and Chalabi, Yohan, (2011).
Fitting the Hyperbolic Distribution with R: A Case Study of
Optimization Techniques. In preparation.
Stryhn, H. and Christensen, J. (2003). Confidence intervals by
the profile likelihood method, with applications in veterinary
epidemiology.
ISVEE X.
See Also
print.hyperblm prints the regression result in a table.
coef.hyperblm obtains the regression coefficients and
error distribution parameters of the fitted model.
summary.hyperblm obtains a summary output of class
hyperblm object.
print.summary.hyperblm prints the summary output in a
table.
plot.hyperblm obtains a residual vs fitted value plot, a
histgram of residuals with error distribution density curve on top, a
histgram of log residuals with error distribution error density curve
on top and a QQ plot.
hyperblmFit, optim, nlm,
constrOptim, hist,
hyperbFitStand, hyperbFitStandStart.
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(GeneralizedHyperbolic)
Loading required package: DistributionUtils
Loading required package: RUnit
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/GeneralizedHyperbolic/hyperblm.Rd_%03d_medium.png", width=480, height=480)
> ### Name: hyperblm
> ### Title: Fitting Linear Models with Hyperbolic Errors
> ### Aliases: hyperblm print.hyperblm coef.hyperblm plot.hyperblm
>
> ### ** Examples
>
> ## stackloss data example
>
> airflow <- stackloss[, 1]
> temperature <- stackloss[, 2]
> acid <- stackloss[, 3]
> stack <- stackloss[, 4]
>
> hyperblm.fit <- hyperblm(stack ~ airflow + temperature + acid,
+ tolerance = 1e-11)
>
> coef.hyperblm(hyperblm.fit)
$coefficients
[1] -36.9799824 0.8129461 0.7648236 -0.1246500
$distributionParams
[1] -1.0715709 0.1904949 0.5529651 0.1737802
> plot.hyperblm(hyperblm.fit, breaks = 20)
> summary.hyperblm(hyperblm.fit, hessian = FALSE)
Call:
hyperblm(formula = stack ~ airflow + temperature + acid, tolerance = 1e-11)
Data: (Intercept) airflow temperature acid
Parameter estimates:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -36.97998 13.91268 -2.6580 0.018734 *
airflow 0.81295 0.08635 9.4145 1.962e-07 ***
temperature 0.76482 0.22756 3.3610 0.004661 **
acid -0.12465 0.14274 -0.8733 0.397237
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Parameters for hyperbolic error distribution:
mu delta alpha beta
-1.0716 0.1905 0.5530 0.1738
Likelihood:
Method: Nelder-Mead
Convergence code: 0
Iterations: 6
>
>
>
>
>
> dev.off()
null device
1
>