kdengpd
R Documentation
Kernel Density Estimate and GPD Tail Extreme Value Mixture Model
Description
Density, cumulative distribution function, quantile function and
random number generation for the extreme value mixture model with kernel density estimate for the bulk
distribution up to the threshold and conditional GPD above the threshold. The parameters
are the bandwidth lambda, threshold u,
GPD scale sigmau, shape xi and tail fraction phiu.
Arguments
kerncentres
kernel centres (typically sample data vector or scalar)
lambda
bandwidth for kernel (as half-width of kernel) or NULL
u
threshold
sigmau
scale parameter (positive)
xi
shape parameter
phiu
probability of being above threshold [0, 1] or TRUE
bw
bandwidth for kernel (as standard deviations of kernel) or NULL
kernel
kernel name (default = "gaussian")
log
logical, if TRUE then log density
q
quantiles
lower.tail
logical, if FALSE then upper tail probabilities
p
cumulative probabilities
n
sample size (positive integer)
Details
Extreme value mixture model combining kernel density estimate (KDE) for the bulk
below the threshold and GPD for upper tail.
The user can pre-specify phiu,
permitting a parameterised value for the tail fraction φ_u. Alternatively, when
phiu=TRUE the tail fraction is estimated as the upper tail fraction of the
KDE bulk model.
The alternative bandwidth definitions are discussed in
kernels, with lambda as the default.
The bw specification is the same as used in the
density function.
The possible kernels are also defined in kernels,
with "gaussian" as the default choice.
The cumulative distribution function with tail fraction φ_u defined by the
upper tail fraction of the kernel density estimate (phiu=TRUE), up to the
threshold x ≤ u, is given by:
F(x) = H(x)
and above the threshold x > u:
F(x) = H(u) + [1 - H(u)] G(x)
where H(x) and G(x) are the KDE and conditional GPD
cumulative distribution functions respectively.
The cumulative distribution function for pre-specified φ_u, up to the
threshold x ≤ u, is given by:
F(x) = (1 - φ_u) H(x)/H(u)
and above the threshold x > u:
F(x) = (1 - φ_u) + φ_u G(x)
Notice that these definitions are equivalent when φ_u = 1 - H(u).
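As a minimal sketch of the pre-specified φ_u case (base R only, Gaussian kernel assumed; the function names kde_cdf, gpd_cdf and mix_cdf are illustrative, not the package API):

```r
kde_cdf <- function(x, kerncentres, lambda)
  mean(pnorm((x - kerncentres) / lambda))       # H(x): average of kernel CDFs

gpd_cdf <- function(x, u, sigmau, xi) {         # G(x): conditional GPD CDF
  z <- pmax(x - u, 0) / sigmau
  if (abs(xi) < 1e-8) 1 - exp(-z) else 1 - pmax(1 + xi * z, 0)^(-1 / xi)
}

mix_cdf <- function(x, kerncentres, lambda, u, sigmau, xi, phiu) {
  Hu <- kde_cdf(u, kerncentres, lambda)
  if (x <= u) (1 - phiu) * kde_cdf(x, kerncentres, lambda) / Hu
  else (1 - phiu) + phiu * gpd_cdf(x, u, sigmau, xi)
}
# The two branches meet at x = u, where both evaluate to 1 - phiu.
```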
If no bandwidth is provided (lambda=NULL and bw=NULL) then the normal
reference rule is used, via the bw.nrd0 function, which is
consistent with the density function. At least two kernel
centres must be provided, as the variance needs to be estimated.
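For illustration, the default bandwidth can be reproduced directly with bw.nrd0 (the data below are an assumption for the sketch):

```r
# Normal reference rule used when lambda = NULL and bw = NULL,
# consistent with density(); needs at least two kernel centres.
set.seed(1)
kerncentres <- rnorm(100)
bw <- bw.nrd0(kerncentres)
# equivalently: 0.9 * min(sd(x), IQR(x) / 1.34) * length(x)^(-1/5)
```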
See gpd for details of GPD upper tail component and
dkden for details of KDE bulk component.
Value
dkdengpd gives the density,
pkdengpd gives the cumulative distribution function,
qkdengpd gives the quantile function and
rkdengpd gives a random sample.
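A hedged usage sketch of the four functions (assumes the evmix package is installed and that these argument names match the installed version; the data and parameter values are illustrative, not recommendations):

```r
library(evmix)

set.seed(1)
x <- rnorm(1000)
u <- quantile(x, 0.9)

# density and CDF at the data, bandwidth chosen by the default rule
d <- dkdengpd(x, kerncentres = x, u = u, sigmau = 0.5, xi = 0.1)
p <- pkdengpd(x, kerncentres = x, u = u, sigmau = 0.5, xi = 0.1)

# quantile function inverts the CDF; rkdengpd draws a random sample
qs <- qkdengpd(c(0.5, 0.9, 0.99), kerncentres = x, u = u, sigmau = 0.5, xi = 0.1)
r  <- rkdengpd(100, kerncentres = x, u = u, sigmau = 0.5, xi = 0.1)
```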
Acknowledgments
Based on code
by Anna MacDonald produced for MATLAB.
Note
Unlike most of the other extreme value mixture model functions, the
kdengpd functions have not been vectorised as
this is not appropriate. The main inputs (x, p or q)
must be either a scalar or a vector, which also define the output length.
The kernel centres kerncentres can either be a single datapoint or a vector
of data. The kernel centres (kerncentres) and locations to evaluate the density (x)
and cumulative distribution function (q) would usually be different.
Default values are provided for all inputs, except for the fundamentals
kerncentres, x, q and p. The default sample size for
rkdengpd is 1.
Missing (NA) and Not-a-Number (NaN) values in x,
p and q are passed through as is, and infinite values are set to
NA. None of these are permitted for the parameters or kernel centres.
Due to symmetry, the lower tail can be described by the GPD by negating the quantiles.
Error checking of the inputs (e.g. invalid probabilities) is carried out and
will either stop or give a warning message as appropriate.
References
Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value
threshold estimation and uncertainty quantification. REVSTAT - Statistical
Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf
Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of
density estimates. Biometrika 71(2), 353-360.
Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of
probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.
MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011).
A flexible extreme value mixture model. Computational Statistics and Data Analysis
55(6), 2137-2157.
Wand, M. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall.
See Also
kernels, kfun,
density, bw.nrd0
and dkde in the ks package.