a list containing the matrices of the state space model.
xreg
optional matrix or list of external regressors. See details.
convergence
a numeric vector of length two to control and determine
the convergence of the filter. See details below.
t0
a numeric indicating the index of the first observation
at which the contributions to the likelihood are addep up.
return.all
logical. If TRUE, extended output containing elements
to be used by KS.deriv is returned.
Details
The implementation is a direct transcription of the iterative equations
of the filter that are summarized below. Details can be found in the references given
below and in many other textbooks.
The source code follows the notation used in Durbin and Koopman (2001).
The elements in the argument ss must be named in accordance with
the notation given below for the state space representation.
For those models defined in the package stsm,
a convenient way to create the argument ss is by means of the
function char2numeric.
The contributions to the likelihood function of the first observations
may be omitted by choosing a value of t0 greater than one.
The functions with ‘.deriv’ in the name compute the derivatives of some
of the elements involved in the filter with respect to the parameters of the model.
The functions KF and KF.deriv are fully implemented
in R while KF.deriv.C calls to compiled C code.
Using KF.deriv.C with return.all = FALSE returns minimal output with
the elements necessary to compute the derivative of the log-likelihood function.
Using return.all = TRUE further elements to be used in
KS.deriv are returned.
Missing observations are allowed. If a missing value is observed after the
filter has converged then all operations of the filter are run instead of using
steady state values until convergence is detected again.
Parameters to control the convergence of the filter.
In some models, the Kalman filter may converge to a steady state. Finding the
explicit expression of the steady state values can be cumbersome in some models.
Alternatively, at each iteration of the filter it can be checked whether a steady state
has been reached. For it, some control parameters can be defined in the argument convergence.
It is considered that convergence was reached when the following is observed:
the change in the variance of the prediction error over the last convergence[2]
consecutive iterations of the filter is below the tolerance value convergence[1].
When the steady state is reached, the values from the last iteration are used
in the remaining iteration of the filter.
Thus, the number of matrix operations can be reduced substantially as pointed
in Harvey (1989) Sec. 3.3.4.
External regressors.
A matrix of external regressors can be passed in argument xreg.
If xreg is a matrix then it is assumed that the time series
passed in argument y has been already adjusted for the effect
of these regressors, that is, y_t^{adj} = y_t - X γ.
If y is the observed series, then xreg should be a list containing
the following elements: xreg, the matrix of regressors; and
coefs, the vector of coefficients, γ, related to the regressors.
The coefficients must be placed in coefs in the same order
as the corresponding vectors are arranged by columns in xreg.
The number of rows of the matrix of regressors must be equal to the length
of the series y.
Column names are necessary for KF.deriv and
are optional for KF.deriv.C.
Value
A list containing the following elements:
v
prediction error.
f
variance of the prediction error.
K
Kalman gain.
L
auxiliar matrices to be used by the smoother.
a.pred
one step ahead prediction of the state vector.
P.pred
covariance matrix of a.pred.
a.upd
update of a.pred after the observation at time t
that is predicted in a.pred enters in the recursive filter.
P.upd
update of P.pred.
convit
the iteration at which the filter converged. If convergence
was not observed it is set to NULL.
mll
value of the negative of the log-likelihood function.
The function KF.C is a simplified and faster version that
records and returns only the value of negative of the log-likelihood function.
It is suited to be passed as argument to optim in the
stats package.
The functions that evaluate the derivatives include in the output the derivatem
terms: da.pred, dP.pred, da.upd, dP.upd, dv, df,
dvof (derivative of quotient v over f), dK and dL.
KF.deriv.C does not return a.upd and P.upd and their
derivative terms da.upd and dP.upd.
If return.all = TRUE, this function returns: dvof,
dL, da.pred, dP.pred, which are the derivative terms necessary
to evaluate the gradient of the log-likelihood function. By default they are not returned,
return.all = FALSE. They are in any case computed, the operations that are omitted
in the latter case is the arrangement of the output from the call to compiled C code into
matrices of the proper dimension containing the data in the right order.
Derivatives of the likelihood function are implemented in package stsm.
Although the Kalman filter provides information to evaluate the likelihood function,
it is not its primary objective. That's why the derivatives of the likelihood are
currently part of the package stsm, which is specific to likelihood methods
in structural time series models.
State space representation
The general univariate linear Gaussian state space model is defined
as follows:
y[t] = Za[t] + e[t], e[t] sim N(0, H)
a[t+1] = Ta[t] + Rw[t], w[t] sim N(0, V)
for t=1,…,n and a[1] sim N(a0, P0).
Z is a matrix of dimension 1xm;
H is 1x1;
T is mxm;
R is mxr;
V is rxr;
a0 is mx1 and
P0 is mxm,
where m is the dimension of the state vector a and
r is the number of variance parameters in the state vector.
The Kalman filtering recursions for the model above are:
Prediction
a[t] = T a[t-1]
P[t] = T P[t-1] T' + R V R'
v[t] = y[t] - Z a[t]
F[t] = Z P[t] Z' + H
Updating
K[t] = P[t] Z' F[t]^{-1}
a[t] = a[t] + K[t] v[t]
P[t] = P[t] - K[t] Z P[t]'
for t=2,…,n, starting with a[1] and P[1] equal
to a0 and P0. v[t] is the prediction error at observation
in time t and F[t] is the variance of v[t].
References
Durbin, J. and Koopman, S. J. (2001).
Time Series Analysis by State Space Methods.
Oxford University Press.
Harvey, A. C. (1989).
Forecasting, Structural Time Series Models and the Kalman Filter.
Cambridge University Press.
See Also
char2numeric in package stsm.
Examples
# local level plus seasonal model with arbitrary parameter values
# for the 'JohnsonJohnson' time series
m <- stsm::stsm.model(model = "llm+seas", y = JohnsonJohnson,
pars = c("var1" = 2, "var2" = 15, "var3" = 30))
ss <- stsm::char2numeric(m)
# run the Kalman filter
kf <- KF(m@y, ss)
plot(kf$a.upd[,1:2], main = "filtered state vector")
# 'KF.C' is a faster version that returns only the
# value of the negative of the likelihood function
kfc <- KF.C(m@y, ss)
all.equal(kf$mll, kfc)
# compute also derivative terms used below
kfd <- KF.deriv(m@y, ss)
all.equal(kfc, kfd$mll)
kfdc <- KF.deriv.C(m@y, ss, return.all = TRUE)
all.equal(kf$mll, kfdc$mll)
# as expected the versions that use compiled C code
# are faster that the versions written fully in R
# e.g. not including derivatives
## Not run:
system.time(for(i in seq_len(10)) kf <- KF(m@y, ss))
system.time(for(i in seq_len(10)) kfc <- KF.C(m@y, ss))
# e.g. including derivatives
system.time(for(i in seq_len(10)) kfd <- KF.deriv(m@y, ss))
system.time(for(i in seq_len(10)) kfdc <- KF.deriv.C(m@y, ss, return.all = TRUE))
## End(Not run)
# compare analytical and numerical derivatives
# they give same results up to a tolerance error
fcn <- function(x, model, type = c("v", "f"))
{
m <- stsm::set.pars(model, x)
ss <- stsm::char2numeric(m)
kf <- KF(m@y, ss)
switch(type, "v" = sum(kf$v), "f" = sum(kf$f))
}
dv <- numDeriv::grad(func = fcn, x = m@pars, model = m, type = "v")
all.equal(dv, colSums(kfd$dv), check.attributes = FALSE)
all.equal(dv, colSums(kfdc$dv), check.attributes = FALSE)
df <- numDeriv::grad(func = fcn, x = m@pars, model = m, type = "f")
all.equal(df, colSums(kfd$df), check.attributes = FALSE)
all.equal(df, colSums(kfdc$df), check.attributes = FALSE)
# compare timings in version written in R with numDeriv::grad
# no calls to compiled C code in either case
# looking at these timings, using analytical derivatives is
# expected to be useful in optimization algorithms
## Not run:
system.time(for (i in seq_len(10))
numdv <- numDeriv::grad(func = fcn, x = m@pars, model = m, type = "v"))
system.time(for(i in seq_len(10)) kfdv <- colSums(KF.deriv(m@y, ss)$dv))
## End(Not run)
# compare timings when convergence is not checked with the case
# when steady state values are used after convergence is observed
# computation time is reduced substantially
## Not run:
n <- length(m@y)
system.time(for(i in seq_len(20)) a <- KF.deriv(m@y, ss, convergence = c(0.001, n)))
system.time(for(i in seq_len(20)) b <- KF.deriv(m@y, ss, convergence = c(0.001, 10)))
# the results are the same up to a tolerance error
all.equal(colSums(a$dv), colSums(b$dv))
## End(Not run)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(KFKSDS)
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/KFKSDS/KF.Rd_%03d_medium.png", width=480, height=480)
> ### Name: KF
> ### Title: Kalman Filter for State Space Models
> ### Aliases: KF KF.C KF.deriv KF.deriv.C
> ### Keywords: ts, model
>
> ### ** Examples
>
> # local level plus seasonal model with arbitrary parameter values
> # for the 'JohnsonJohnson' time series
> m <- stsm::stsm.model(model = "llm+seas", y = JohnsonJohnson,
+ pars = c("var1" = 2, "var2" = 15, "var3" = 30))
> ss <- stsm::char2numeric(m)
>
> # run the Kalman filter
> kf <- KF(m@y, ss)
> plot(kf$a.upd[,1:2], main = "filtered state vector")
>
> # 'KF.C' is a faster version that returns only the
> # value of the negative of the likelihood function
> kfc <- KF.C(m@y, ss)
> all.equal(kf$mll, kfc)
[1] TRUE
>
> # compute also derivative terms used below
> kfd <- KF.deriv(m@y, ss)
> all.equal(kfc, kfd$mll)
[1] TRUE
> kfdc <- KF.deriv.C(m@y, ss, return.all = TRUE)
> all.equal(kf$mll, kfdc$mll)
[1] TRUE
>
> # as expected the versions that use compiled C code
> # are faster that the versions written fully in R
> # e.g. not including derivatives
> ## Not run:
> ##D system.time(for(i in seq_len(10)) kf <- KF(m@y, ss))
> ##D system.time(for(i in seq_len(10)) kfc <- KF.C(m@y, ss))
> ##D # e.g. including derivatives
> ##D system.time(for(i in seq_len(10)) kfd <- KF.deriv(m@y, ss))
> ##D system.time(for(i in seq_len(10)) kfdc <- KF.deriv.C(m@y, ss, return.all = TRUE))
> ## End(Not run)
>
> # compare analytical and numerical derivatives
> # they give same results up to a tolerance error
> fcn <- function(x, model, type = c("v", "f"))
+ {
+ m <- stsm::set.pars(model, x)
+ ss <- stsm::char2numeric(m)
+ kf <- KF(m@y, ss)
+ switch(type, "v" = sum(kf$v), "f" = sum(kf$f))
+ }
>
> dv <- numDeriv::grad(func = fcn, x = m@pars, model = m, type = "v")
> all.equal(dv, colSums(kfd$dv), check.attributes = FALSE)
[1] TRUE
> all.equal(dv, colSums(kfdc$dv), check.attributes = FALSE)
[1] TRUE
> df <- numDeriv::grad(func = fcn, x = m@pars, model = m, type = "f")
> all.equal(df, colSums(kfd$df), check.attributes = FALSE)
[1] TRUE
> all.equal(df, colSums(kfdc$df), check.attributes = FALSE)
[1] TRUE
>
> # compare timings in version written in R with numDeriv::grad
> # no calls to compiled C code in either case
> # looking at these timings, using analytical derivatives is
> # expected to be useful in optimization algorithms
> ## Not run:
> ##D system.time(for (i in seq_len(10))
> ##D numdv <- numDeriv::grad(func = fcn, x = m@pars, model = m, type = "v"))
> ##D system.time(for(i in seq_len(10)) kfdv <- colSums(KF.deriv(m@y, ss)$dv))
> ## End(Not run)
>
> # compare timings when convergence is not checked with the case
> # when steady state values are used after convergence is observed
> # computation time is reduced substantially
> ## Not run:
> ##D n <- length(m@y)
> ##D system.time(for(i in seq_len(20)) a <- KF.deriv(m@y, ss, convergence = c(0.001, n)))
> ##D system.time(for(i in seq_len(20)) b <- KF.deriv(m@y, ss, convergence = c(0.001, 10)))
> ##D # the results are the same up to a tolerance error
> ##D all.equal(colSums(a$dv), colSums(b$dv))
> ## End(Not run)
>
>
>
>
>
> dev.off()
null device
1
>