A collection and description of simple-to-use
functions to model univariate autoregressive
moving average (ARMA) time series processes, including
time series simulation, parameter estimation,
diagnostic analysis of the fit, and prediction
of future values.
The functions are:
armaSim
Simulates an artificial ARMA time series process,
armaFit
Fits the parameters of an ARMA time series process,
print
Print Method,
plot
Plot Method,
summary
Summary Method,
predict
Forecasts and optionally plots an ARMA process,
fitted
Method, returns fitted values,
coef|coefficients
Method, returns coefficients,
residuals
Method, returns residuals.
Usage
armaSim(model = list(ar = c(0.5, -0.5), d = 0, ma = 0.1), n = 100,
innov = NULL, n.start = 100, start.innov = NULL,
rand.gen = rnorm, rseed = NULL, addControl = FALSE, ...)
armaFit(formula, data, method = c("mle", "ols"), include.mean = TRUE,
fixed = NULL, title = NULL, description = NULL, ...)
## S4 method for signature 'fARMA'
show(object)
## S3 method for class 'fARMA'
plot(x, which = "ask", gof.lag = 10, ...)
## S3 method for class 'fARMA'
summary(object, doplot = TRUE, which = "all", ...)
## S3 method for class 'fARMA'
predict(object, n.ahead = 10, n.back = 50, conf = c(80, 95),
doplot = TRUE, ...)
## S3 method for class 'fARMA'
fitted(object, ...)
## S3 method for class 'fARMA'
coef(object, ...)
## S3 method for class 'fARMA'
residuals(object, ...)
Arguments
addControl
[armaSim] -
a logical value. Should control parameters be added to the returned
series as a control attribute?
data
an optional timeSeries or data frame object containing the variables
in the model. If not found in data, the variables are taken
from environment(formula), typically the environment from which
armaFit is called. If data is a univariate series, then
the series is converted into a numeric vector and the name of the
response in the formula will be ignored.
description
a character string which allows for a brief description.
doplot
[armaRoots] -
a logical. Should a plot be displayed?
[predict][summary] -
is used by the predict and summary methods. By default,
this value is set to TRUE, so the function calls produce
graphical output in addition to the printed output.
Additional arguments required by underlying functions have
to be passed through the dots argument.
fixed
[armaFit] -
is an optional numeric vector of the same length as the total
number of parameters. If supplied, only NA entries in
fixed will be varied. In this way subset ARMA processes
can be modeled. ARIMA modelling supports this option. Thus,
for estimating parameters of subset ARMA and AR models the
easiest way is to specify them by the formulas
x~arima(p, 0, q) and x~arima(p, 0, 0), respectively.
formula
[armaFit] -
a formula specifying the general structure of the ARMA form.
Can have one of the forms
x ~ ar(p),
x ~ ma(q),
x ~ arma(p, q),
x ~ arima(p, d, q), or
x ~ arfima(p, q).
x is the response variable, which may optionally appear in the
formula expression.
In the first case R's function ar from the stats
package is used to estimate the parameters; in the second, third,
and fourth cases R's function arima from the stats package is
used; and in the last case R's function fracdiff from the
fracdiff package is used. The state-space based arima function
can also fit ARMA models using arima(p, d=0, q),
AR models using arima(p, d=0, q=0), and pure MA models
using arima(p=0, d=0, q). (Exogenous variables are also
allowed and can be passed through the ... argument.)
gof.lag
[print][plot][summary][predict] -
the maximum number of lags for a goodness-of-fit test.
include.mean
[armaFit] -
Should the ARIMA model include a mean term? The default is
TRUE. Note that for differenced series a mean would
affect neither the fit nor the predictions.
innov
[armaSim] -
is a univariate time series or vector of innovations to produce
the series. If not provided, innov will be generated using
the random number generator specified by rand.gen.
Missing values are not allowed. By default the normal
random number generator will be used.
method
[armaFit] -
a character string denoting the method used to fit the model.
The default method for all models is maximum likelihood parameter
estimation, method="mle". In the case of an AR
model the parameter estimation can also be done by ordinary least
squares estimation, method="ols".
model
[armaSim] -
a list with one (AR), two (ARMA) or three (ARIMA, FRACDIFF)
elements. ar is a numeric vector giving the AR coefficients,
d is an integer value giving the degree of differencing,
and ma is a numeric vector giving the MA coefficients.
Thus the order of the time series process is (F)ARIMA(p, d, q)
with p=length(ar) and q=length(ma). d is
a non-negative integer for ARIMA models and a numeric value for
FRACDIFF models. By default an ARIMA(2, 0, 1) model with
coefficients ar=c(0.5, -0.5) and ma=0.1 will be
generated.
n
[armaSim] -
an integer value setting the length of the series to be simulated
(optional if innov is provided). The default value is 100.
n.ahead, n.back, conf
[print][plot][summary][predict] -
are preset arguments for the predict method. n.ahead
determines how many steps ahead forecasts are evaluated, together
with confidence intervals at the levels given by the argument
conf. If a forecast plot is desired, which is the
default and expressed by doplot=TRUE, then n.back
sets the number of past time steps displayed in the graph.
n.start
[armaSim] -
gives the number of start-up values discarded when simulating
non-stationary models. The start-up innovations will be generated
by rand.gen if start.innov is not provided.
object
[summary][predict] -
is an object of class fARMA returned by the fitting function
armaFit and serves as input for the summary and
predict methods. Some methods allow for additional
arguments.
rand.gen
[armaSim] -
is the function which is called to generate the innovations.
Usually, rand.gen will be a random number generator.
Additional arguments required by the random number generator
rand.gen, usually the location, scale and/or shape
parameter of the underlying distribution function, have to be
passed through the dots argument.
rseed
[armaSim] -
the random number seed, by default NULL. If this argument is
set to an integer value, then the function set.seed(rseed)
will be called.
start.innov
[armaSim] -
is a univariate time series or vector of innovations to be used
as start-up values. Missing values are not allowed.
title
a character string which allows for a project title.
which
[plot][summary] -
if which is set to "ask" the function will
interactively ask which plot should be displayed. This is
the default value for the plot method. If
which="all" is specified all plots will be displayed.
This is the default setting for the summary method.
On the other hand, if a vector of logicals is specified,
then those plots will be displayed for which the elements
of the vector are set to TRUE.
x
[print][plot] -
is an object of class fARMA returned by the fitting
function armaFit and serves as input for the predict,
print, print.summary, and plot methods.
Some methods allow for additional arguments.
...
additional arguments to be passed to the output timeSeries
(charvec, units, ...).
Details
AR - Auto-Regressive Modelling:
The argument x~ar(p) calls the underlying functions
ar.mle or ar.ols depending on the
chosen method.
For definiteness, the AR models are defined through
(x[t] - m) = a[1]*(x[t-1] - m) + … + a[p]*(x[t-p] - m) + e[t]
Order selection can be achieved through the comparison of AIC
values for different model specifications. However, this may be
problematic, as of the methods here only ar.mle performs
true maximum likelihood estimation. The AIC is computed as if
the variance estimate were the MLE, omitting the determinant
term from the likelihood. Note that this is not the same as the
Gaussian likelihood evaluated at the estimated parameter values.
With method="yw" the variance matrix of the innovations is
computed from the fitted coefficients and the autocovariance of
x. Burg's method allows for two alternatives
method="burg1" or method="burg2" to estimate the
innovations variance and hence AIC. Method 1 is to use the update
given by the Levinson-Durbin recursion (Brockwell and Davis, 1991),
and follows S-PLUS. Method 2 is the mean of the sum of squares of
the forward and backward prediction errors (as in Brockwell and Davis,
1996). Percival and Walden (1998) discuss both.
[stats:ar]
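As a rough illustration of the AR fitting described above, the following sketch calls ar from the stats package directly rather than going through armaFit. The simulated series, the seed, and the object names (x, fit_mle, fit_ols, ar_coefs, fit_aic) are assumptions made for this example only.

```r
set.seed(42)
# Simulate 500 observations from a stationary, zero-mean AR(2) process.
x <- arima.sim(model = list(ar = c(0.5, -0.5)), n = 500)

# Fit an AR(2) model with a fixed order, once by maximum likelihood
# (ar.mle) and once by ordinary least squares (ar.ols).
fit_mle <- ar(x, order.max = 2, aic = FALSE, method = "mle")
fit_ols <- ar(x, order.max = 2, aic = FALSE, method = "ols")

# Collect both sets of AR coefficient estimates in one matrix.
ar_coefs <- rbind(mle = as.numeric(fit_mle$ar),
                  ols = as.numeric(fit_ols$ar))
print(round(ar_coefs, 3))

# Let ar() choose the order by AIC among the orders 0 to 6.
fit_aic <- ar(x, order.max = 6, aic = TRUE, method = "mle")
print(fit_aic$order)          # selected order
print(round(fit_aic$aic, 2))  # AIC differences relative to the best model
```

The aic component holds differences to the best-fitting order, so its minimum is zero by construction; the caveats about AIC stated above still apply.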
MA - Moving-Average Modelling:
The argument x~ma(q) maps the call to the
argument x ~ arima(0, 0, q).
ARMA - Auto-Regressive Moving-Average Modelling:
The argument x~arma(p,q) maps the call to the
argument x~arima(p, 0, q).
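The mapping of MA and ARMA specifications onto ARIMA orders can be sketched with arima from the stats package, called directly here instead of through armaFit. The simulated data and the object names (x, fit_ma, fit_arma, fit_subset) are assumptions for illustration; the last call shows one way a subset model might be expressed with the fixed argument, where NA entries are estimated and other entries are held at the given value.

```r
set.seed(11)
# Simulated ARMA(2, 1) series used for all three fits.
x <- arima.sim(model = list(ar = c(0.5, -0.5), ma = 0.1), n = 500)

# A pure MA(1) model is an ARIMA model of order (0, 0, 1).
fit_ma <- arima(x, order = c(0, 0, 1))

# An ARMA(2, 1) model is an ARIMA model of order (2, 0, 1).
fit_arma <- arima(x, order = c(2, 0, 1))

# Subset AR(3): the second AR coefficient is fixed at zero.
# Parameter order in 'fixed': ar1, ar2, ar3, intercept.
fit_subset <- arima(x, order = c(3, 0, 0),
                    fixed = c(NA, 0, NA, NA),
                    transform.pars = FALSE)

print(coef(fit_ma))
print(coef(fit_arma))
print(coef(fit_subset))
```

Setting transform.pars = FALSE is required by arima when AR parameters are fixed; otherwise it switches the setting itself and issues a warning.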
ARIMA - Integrated ARMA Modelling:
The argument x~arima() calls the underlying function
arima from R's stats package. For definiteness, the ARMA
models are defined through
x[t] = a[1]*x[t-1] + … + a[p]*x[t-p] + e[t] + b[1]*e[t-1] + … + b[q]*e[t-q]
and so the MA coefficients differ in sign from those of
S-PLUS. Further, if include.mean is TRUE, this formula
applies to x-m rather than x. For ARIMA models with
differencing, the differenced series follows a zero-mean ARMA model.
The variance matrix of the estimates is found from the Hessian of
the log-likelihood, and so may only be a rough guide.
Optimization is done by optim. It will work
best if the columns in xreg are roughly scaled to zero mean
and unit variance, but does attempt to estimate suitable scalings.
The exact likelihood is computed via a state-space representation
of the ARIMA process, and the innovations and their variance found
by a Kalman filter. The initialization of the differenced ARMA
process uses stationarity. For a differenced process the
non-stationary components are given a diffuse prior (controlled
by kappa). Observations which are still controlled by the
diffuse prior (determined by having a Kalman gain of at least
1e4) are excluded from the likelihood calculations. (This
gives comparable results to arima0 in the absence
of missing values, when the observations excluded are precisely those
dropped by the differencing.)
Missing values are allowed, and are handled exactly in method "ML".
If transform.pars is true, the optimization is done using an
alternative parametrization which is a variation on that suggested by
Jones (1980) and ensures that the model is stationary. For an AR(p)
model the parametrization is via the inverse tanh of the partial
autocorrelations: the same procedure is applied (separately) to the
AR and seasonal AR terms. The MA terms are not constrained to be
invertible during optimization, but they will be converted to
invertible form after optimization if transform.pars is true.
Conditional sum-of-squares is provided mainly for expositional
purposes. This computes the sum of squares of the fitted innovations
from observation n.cond on, (where n.cond is at least
the maximum lag of an AR term), treating all earlier innovations to
be zero. Argument n.cond can be used to allow comparability
between different fits. The “part log-likelihood” is the first
term, half the log of the estimated mean square. Missing values
are allowed, but will cause many of the innovations to be missing.
When regressors are specified, they are orthogonalized prior to
fitting unless any of the coefficients is fixed. It can be helpful to
roughly scale the regressors to zero mean and unit variance.
Note from arima: The functions pass their arguments to the
original time series functions available in R's stats
package.
The results are likely to be different from S-PLUS's
arima.mle, which computes a conditional likelihood and does
not include a mean in the model. Further, the convention used by
arima.mle reverses the signs of the MA coefficients.
[stats:arima]
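A minimal sketch of the points above, again using arima from the stats package directly: the mean appears as a coefficient named intercept when include.mean is TRUE, and approximate standard errors can be read from the Hessian-based variance matrix. The shift of 3, the seed, and the names x, fit, est, and se are assumptions of this example.

```r
set.seed(1)
# ARMA(2, 1) series shifted to have mean 3.
x <- arima.sim(model = list(ar = c(0.5, -0.5), ma = 0.1), n = 1000) + 3

# Exact maximum likelihood fit via the state-space representation.
fit <- arima(x, order = c(2, 0, 1), include.mean = TRUE, method = "ML")

est <- coef(fit)                 # ar1, ar2, ma1, intercept
se  <- sqrt(diag(fit$var.coef))  # rough standard errors from the Hessian

print(round(cbind(estimate = est, std.error = se), 3))
```

Because arima.sim and arima share the same sign convention for MA terms, the estimate of ma1 should be comparable in sign to the simulated value; a fit from S-PLUS's arima.mle would report the opposite sign.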
ARFIMA/FRACDIFF Modelling:
The argument x~arfima() calls the underlying functions from
R's fracdiff package. The estimator calculates the maximum
likelihood estimators of the parameters of a fractionally-differenced
ARIMA (p,d,q) model, together (if possible) with their estimated
covariance and correlation matrices and standard errors, as well
as the value of the maximized likelihood. The likelihood is
approximated using the fast and accurate method of Haslett and
Raftery (1989). Note, the number of AR and MA coefficients should
not be too large (say < 10) to avoid degeneracy in the model.
The optimization is carried out in two levels: an outer univariate
unimodal optimization in d over the interval [0,.5], and an inner
nonlinear least-squares optimization in the AR and MA parameters to
minimize white noise variance.
[fracdiff:fracdiff]
Value
armaFit
returns an S4 object of class "fARMA", with the following
slots:
call
the matched function call.
data
the input data in form of a data.frame.
description
allows for a brief project description.
fit
the results as a list returned from the underlying
time series model function.
method
the selected time series model naming the applied method.
formula
the formula expression describing the model.
parameters
named parameters or coefficients of the fitted model.
title
a title string.
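The following sketch shows how the returned object might be inspected. It assumes the fArma package is installed and is guarded so that it does nothing otherwise. The formula is written without a response, relying on the statement above that the response is optional when data is a univariate series; the seed value and the names have_fArma, x, and fit are assumptions of this example.

```r
# Guarded so the sketch still runs when fArma is not installed.
have_fArma <- requireNamespace("fArma", quietly = TRUE)

if (have_fArma) {
  library(fArma)

  # Simulate an ARMA(2, 1) series; rseed makes the draw reproducible.
  x <- armaSim(model = list(ar = c(0.5, -0.5), d = 0, ma = 0.1),
               n = 1000, rseed = 4711)

  # Fit the same model structure by maximum likelihood.
  fit <- armaFit(~ arma(2, 1), data = x, method = "mle")

  print(slotNames(fit))  # slot names of the returned S4 object
  print(coef(fit))       # estimated coefficients
  print(fit@formula)     # model formula stored in the object
}
```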
Note
There is nothing really new in this package. The benefit you will
get with this collection is that all functions have a common
argument list with a formula to specify the model and preset
arguments for the specification of the algorithmic method. For
users who have already modeled GARCH processes with R/Rmetrics and
SPlus/Finmetrics, this approach will be quite natural.
The function armaFit allows for the following formula arguments:
x ~ ar()
autoregressive time series processes,
x ~ ma()
moving average time series processes,
x ~ arma()
autoregressive moving average processes,
x ~ arima()
autoregressive integrated moving average processes, and
x ~ arfima()
fractionally integrated ARMA processes.
For the first selection x~ar() the function armaFit()
uses the AR modelling algorithm as implemented in R's stats
package.
For the second x~ma(), third x~arma(), and fourth
selection x~arima() the function armaFit() uses the
ARMA modelling algorithm also as implemented in R's stats
package.
For the last selection x~arfima() the function armaFit()
uses the fractional ARIMA modelling algorithm from R's contributed
fracdiff package.
Note that the AR, MA, and ARMA processes can all be modelled by the
same algorithm by specifying the formula x~arima(p,d,q) in the
proper way, i.e. setting d=0 and setting the order p
or q to zero as required by the desired model specification.
Alternatively, one can still use the functions from R's stats
package: arima.sim, which simulates from an ARIMA time series
model; ar, arima, and arima0, which fit an AR or ARIMA model to a
univariate time series; predict, which forecasts from a fitted
model; and tsdiag, which plots time-series diagnostics.
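For comparison, a short sketch of that stats-only workflow: simulate, fit, and forecast ten steps ahead with an approximate 95 percent band built from the reported standard errors. The normal-quantile band, the seed, and the names x, fit, fc, z, and bands are assumptions of this example rather than anything prescribed by the package.

```r
set.seed(123)
# Simulate an ARMA(2, 1) series of length 200.
x <- arima.sim(model = list(ar = c(0.5, -0.5), ma = 0.1), n = 200)

# Fit the matching ARIMA(2, 0, 1) model.
fit <- arima(x, order = c(2, 0, 1))

# Forecast 10 steps ahead; predict returns point forecasts and
# their standard errors.
fc <- predict(fit, n.ahead = 10)

# Approximate 95 percent band under a normality assumption.
z <- qnorm(0.975)
bands <- cbind(lower95  = as.numeric(fc$pred - z * fc$se),
               forecast = as.numeric(fc$pred),
               upper95  = as.numeric(fc$pred + z * fc$se))
print(round(bands, 3))
```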
No function from these packages is masked, modified or overwritten.
The output of the print, summary, and predict
methods has the same format for each time series
model, with some additional algorithm-specific printing. This makes
it easier to interpret the results obtained from different algorithms
implemented in different functions.
For arfima models the following methods are not yet
implemented: plot, fitted, residuals,
predict, and predictPlot.
Author(s)
M. Plummer and B.D. Ripley for ar functions and code,
B.D. Ripley for arima and ARMAacf functions and code,
C. Fraley and F. Leisch for fracdiff functions and code, and
Diethelm Wuertz for the Rmetrics R-port.
References
Brockwell, P.J. and Davis, R.A. (1996);
Introduction to Time Series and Forecasting,
Second Edition, Springer, New York.
Durbin, J. and Koopman, S.J. (2001);
Time Series Analysis by State Space Methods,
Oxford University Press.
Gardner, G., Harvey, A.C., and Phillips, G.D.A. (1980);
Algorithm AS154. An algorithm for exact maximum likelihood
estimation of autoregressive-moving average models by means of
Kalman filtering,
Applied Statistics, 29, 311–322.
Hannan, E.J. and Rissanen, J. (1982);
Recursive Estimation of Mixed Autoregressive-Moving
Average Order.
Biometrika 69, 81–94.
Harvey, A.C. (1993);
Time Series Models,
2nd Edition, Harvester Wheatsheaf, Sections 3.3 and 4.4.
Jones, R.H. (1980);
Maximum likelihood fitting of ARMA models to time
series with missing observations,
Technometrics, 22, 389–395.
Percival, D.P. and Walden, A.T. (1998);
Spectral Analysis for Physical Applications.
Cambridge University Press.
Whittle, P. (1963);
On the fitting of multivariate autoregressions
and the approximate canonical factorization of a spectral
matrix.
Biometrika 40, 129–134.
Haslett, J. and Raftery, A.E. (1989);
Space-time Modelling with Long-memory Dependence: Assessing
Ireland's Wind Power Resource (with Discussion),
Applied Statistics 38, 1–50.