Computes the kernel estimator of the distribution function, either at a
single value or over a grid. Four kernel functions are implemented, and the
bandwidth parameter can be computed directly by the plug-in method of
Polansky and Baker (2000).
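For orientation, a kernel distribution estimator usually takes the form F_h(y) = (1/n) * sum_i W((y - X_i) / h), where W is the integral of the kernel and h is the bandwidth. The following is a minimal base-R sketch for the Normal kernel, where W is pnorm. The name kde_sketch and the bandwidth 0.3 are hypothetical choices for illustration, and this is not the package's own code.

```r
# Hypothetical helper, not part of kerdiest: kernel distribution estimate
# with a Normal kernel, evaluated at each point of y.
kde_sketch <- function(vec_data, y, bw) {
  # For each evaluation point, average the integrated kernel over the sample
  sapply(y, function(y0) mean(pnorm((y0 - vec_data) / bw)))
}

set.seed(1)
x <- rnorm(100)
grid <- seq(min(x), max(x), length.out = 100)
Fhat <- kde_sketch(x, grid, bw = 0.3)  # bw = 0.3 is an arbitrary choice
```

The result is a nondecreasing vector of values in [0, 1], one per grid point, which can be compared with pnorm(grid) for this simulated sample.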
The kernel function. Four types are available: "e" (Epanechnikov),
"n" (Normal), "b" (Biweight) and "t" (Triweight). The Normal kernel is
used by default.
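As an illustration of what the four codes select, the sketch below writes out the integrated kernels W(u) for the textbook versions of these kernels. It assumes the compact kernels are scaled to the support [-1, 1]; the scaling used inside the package may differ, and the function name integrated_kernel and its argument name type are hypothetical.

```r
# Illustrative integrated kernels W(u); the [-1, 1] support for the compact
# kernels is an assumption, not taken from the package source.
integrated_kernel <- function(u, type = "n") {
  if (type == "n") return(pnorm(u))
  u <- pmin(pmax(u, -1), 1)  # W is 0 below -1 and 1 above 1
  switch(type,
    e = 0.5 + 0.75 * (u - u^3 / 3),                          # Epanechnikov
    b = 0.5 + (15 / 16) * (u - 2 * u^3 / 3 + u^5 / 5),       # Biweight
    t = 0.5 + (35 / 32) * (u - u^3 + 3 * u^5 / 5 - u^7 / 7), # Triweight
    stop("unknown kernel type")
  )
}
```

Each W rises from 0 to 1 and equals 0.5 at u = 0, as any distribution function of a symmetric kernel must.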
vec_data
The data sample.
y
The single value or the grid vector at which the distribution function
is estimated. By default, a grid of 100 equidistant points from the minimum
to the maximum of the data sample is used.
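A grid matching that description can be built in base R as shown below. This is a sketch of the stated default, not the package's internal code, and the sample values are made up.

```r
x <- c(2.1, -0.4, 1.3, 0.7)  # made-up data sample
# 100 equidistant points from the sample minimum to the sample maximum
y_default <- seq(min(x), max(x), length.out = 100)
```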
bw
The bandwidth used. If it is not provided, the plug-in bandwidth
of Polansky and Baker (2000) is computed.
Value
Returns a list containing:
Estimated_values
Vector containing the estimated distribution function at the grid values.
References
Reiss, R.D. (1981) Nonparametric estimation of smooth distribution functions,
Scandinavian Journal of Statistics 8, pp. 116-119.
Simonoff, J. (1996) Smoothing Methods in Statistics, Springer, New York.
Polansky, A.M. and Baker, E.R. (2000) Multistage plug-in bandwidth selection
for kernel distribution function estimates, Journal of Statistical
Computation and Simulation 65, pp. 63-80.
Quintela-del-Rio, A. and Estevez-Perez, G. (2012)
Nonparametric Kernel Distribution Function Estimation with kerdiest:
An R Package for Bandwidth Choice and Applications,
Journal of Statistical Software 50(8), pp. 1-21.
URL http://www.jstatsoft.org/v50/i08/.
Examples
# Comparison of three bandwidth selection methods
library(kerdiest)
x <- rnorm(100)
# Each bandwidth is computed with a normal kernel and the default settings.
# Cross-validation
h_CV <- CVbw(vec_data = x)$bw
# Plug-in of Altman and Leger
h_AL <- ALbw(vec_data = x)
# Plug-in of Polansky and Baker
h_PB <- PBbw(vec_data = x)
## Not run:
print(h_CV); print(h_AL); print(h_PB)
# Plot the three estimates together with the true distribution
F_CV <- kde(vec_data = x, bw = h_CV)
F_AL <- kde(vec_data = x, bw = h_AL)
F_PB <- kde(vec_data = x, bw = h_PB)
y <- F_CV$grid
Ft <- pnorm(y)
require(graphics)
plot(y, Ft, ylab = "Distribution", xlab = "data", type = "l", lty = 1)
lines(y, F_CV$Estimated_values, type = "l", lty = 2)
lines(y, F_AL$Estimated_values, type = "l", lty = 3)
lines(y, F_PB$Estimated_values, type = "l", lty = 4)
legend(1, 0.4, c("real", "F_CV", "F_AL", "F_PB"), lty = 1:4)
## End(Not run)