Use direct plug-in methodology to select the bandwidth
of a local linear Gaussian kernel regression estimate, as described
by Ruppert, Sheather and Wand (1995).
Arguments
x
numeric vector of x data.
Missing values are not accepted.
y
numeric vector of y data.
This must be the same length as x, and
missing values are not accepted.
blockmax
the maximum number of blocks of the data for construction
of an initial parametric estimate.
divisor
the value that the sample size is divided by to determine
a lower limit on the number of blocks of the data for
construction of an initial parametric estimate.
trim
the proportion of the sample trimmed from each end in the
x direction before application of the plug-in methodology.
proptrun
the proportion of the range of x at each end truncated in the
functional estimates.
gridsize
number of equally-spaced grid points over which the
function is to be estimated.
range.x
vector containing the minimum and maximum values of x at which to
compute the estimate.
For density estimation the default is the minimum and maximum data values
with 5% of the range added to each end.
For regression estimation the default is the minimum and maximum data values.
truncate
logical flag: if TRUE, data with x values outside the
range specified by range.x are ignored.
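The arguments above can be spelled out in a single call. The following is a minimal sketch on simulated data; the data-generating model, the seed and the particular argument values are illustrative assumptions, not recommendations.

```r
library(KernSmooth)

# Simulated data (an assumption for illustration only)
set.seed(42)
n <- 300
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.2)

h <- dpill(
  x, y,
  blockmax = 5,         # at most 5 blocks for the initial parametric estimate
  divisor  = 20,        # sample size is divided by 20 when bounding the block count
  trim     = 0.01,      # trim 1% of the sample from each end in the x direction
  proptrun = 0.05,      # truncate 5% of the range of x at each end
  gridsize = 401,       # number of equally-spaced grid points
  range.x  = range(x),  # estimate over the full observed range of x
  truncate = TRUE       # ignore data with x outside range.x
)
h
```

Passing range.x explicitly, as here, makes the estimation range visible in the call rather than leaving it to the default.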
Details
The direct plug-in approach is used: unknown functionals
that appear in expressions for the asymptotically
optimal bandwidths
are replaced by kernel estimates.
The kernel is the standard normal density.
Least squares quartic fits over blocks of data are used to
obtain an initial estimate. Mallows' Cp is used to select
the number of blocks.
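The block-selection step can be sketched in base R. This is a simplified illustration and not KernSmooth's internal code: the helper names block_rss and select_blocks are hypothetical, the blocks are assumed to be contiguous and of nearly equal size, and the Cp expression is assumed to take the form given in Ruppert, Sheather and Wand (1995).

```r
# Hypothetical helper: total residual sum of squares from separate
# least squares polynomial fits on N contiguous blocks of x-sorted data.
block_rss <- function(x, y, N, degree = 4) {
  ord <- order(x)
  x <- x[ord]
  y <- y[ord]
  n <- length(x)
  # Block labels 1..N, assigning nearly equal numbers of points to each block
  block <- pmin(N, ceiling(seq_len(n) * N / n))
  rss_by_block <- vapply(split(seq_len(n), block), function(idx) {
    xb <- x[idx]
    yb <- y[idx]
    sum(resid(lm(yb ~ poly(xb, degree)))^2)
  }, numeric(1))
  sum(rss_by_block)
}

# Hypothetical helper: choose the number of blocks by minimising Mallows' Cp,
# using the fit with Nmax blocks to estimate the error variance.
select_blocks <- function(x, y, Nmax = 5, degree = 4) {
  n <- length(x)
  p <- degree + 1                      # coefficients per block
  Ns <- seq_len(Nmax)
  rss <- vapply(Ns, function(N) block_rss(x, y, N, degree), numeric(1))
  sigma2 <- rss[Nmax] / (n - p * Nmax)
  cp <- rss / sigma2 - (n - 2 * p * Ns)
  list(N = which.min(cp), cp = cp)
}

# Simulated data (an assumption for illustration only)
set.seed(1)
x <- runif(200)
y <- sin(4 * pi * x) + rnorm(200, sd = 0.3)
res <- select_blocks(x, y)
res$N
```

Each block needs more distinct x values than the polynomial degree, which is one reason the number of blocks is capped relative to the sample size.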
Value
the selected bandwidth.
Warning
If there are severe irregularities (e.g. outliers or sparse regions)
in the x values then the local polynomial smooths required for the
bandwidth selection algorithm may become degenerate and the function
will crash. Outliers in the y direction may lead to deterioration
of the quality of the selected bandwidth.
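One defensive pattern is to wrap the call so that a failure is reported rather than propagated. The sketch below is an assumption-laden illustration: safe_dpill is a hypothetical helper, not part of KernSmooth; it silently drops incomplete (x, y) pairs, which may not be appropriate for every analysis; and tryCatch only intercepts R-level errors, so it cannot guard against a failure in compiled code that terminates the session.

```r
library(KernSmooth)

# Hypothetical wrapper (not part of KernSmooth): returns NA instead of
# stopping when dpill() fails or yields an unusable bandwidth.
safe_dpill <- function(x, y, ...) {
  stopifnot(length(x) == length(y))
  ok <- is.finite(x) & is.finite(y)    # dpill does not accept missing values
  h <- tryCatch(dpill(x[ok], y[ok], ...), error = function(e) NA_real_)
  if (length(h) != 1 || !is.finite(h) || h <= 0) NA_real_ else h
}

# Simulated data (an assumption for illustration only)
set.seed(7)
x <- runif(250)
y <- cos(3 * x) + rnorm(250, sd = 0.1)

h <- safe_dpill(x, y)
if (is.na(h)) {
  message("No usable bandwidth; inspect x for outliers or sparse regions")
} else {
  fit <- locpoly(x, y, bandwidth = h)
}
```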
References
Ruppert, D., Sheather, S. J. and Wand, M. P. (1995).
An effective bandwidth selector for local least squares
regression.
Journal of the American Statistical Association,
90, 1257–1270.
Wand, M. P. and Jones, M. C. (1995).
Kernel Smoothing.
Chapman and Hall, London.
See Also
ksmooth, locpoly.
Examples
library(KernSmooth)
data(geyser, package = "MASS")
x <- geyser$duration
y <- geyser$waiting
plot(x, y)
# Select the bandwidth by direct plug-in, then overlay the local linear fit
h <- dpill(x, y)
fit <- locpoly(x, y, bandwidth = h)
lines(fit)