locoutPercent
R Documentation
Diagnostic plot for identifying local outliers with fixed size of neighborhood
Description
Computes global and pairwise Mahalanobis distances for visualizing global and
local multivariate outliers. The size of the neighborhood (number of neighbors) is
fixed, while the fraction of neighbors taken into account is varied.
Usage
locoutPercent(dat, X, Y, dist = NULL, k = 10, chisqqu = 0.975, sortup = 10, sortlow = 90,
nlinesup = 10, nlineslow = 10, indices = NULL, xlab = "(Sorted) Index",
ylab = "Distance to neighbor", col = gray(0.7), ...)
Arguments
dat
multivariate data set (without coordinates)
X
X coordinates of the data points
Y
Y coordinates of the data points
dist
maximum distance to search for neighbors; if nothing is provided, k for kNN is used
k
number of nearest neighbors to search; ignored if a value for dist is provided
chisqqu
quantile of the chi-square distribution used for splitting the plot
sortup
sort local outliers according to given percentage
sortlow
sort local inliers according to given percentage
nlinesup
number of lines to be plotted for the upper part
nlineslow
number of lines to be plotted for the lower part
indices
if not NULL, indices of the observations to be highlighted
xlab
x-axis label for plot
ylab
y-axis label for plot
col
color for lines
...
additional parameters for plotting
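The interplay of dist and k can be pictured with a small base R sketch. The helper find_neighbors below is hypothetical and not part of the package; it only illustrates that a radius search is used when dist is given and that the k nearest points are used otherwise. The simulated coordinates are made up for the illustration.

```r
# Hypothetical helper (not the package's code): neighbors of point i,
# either within radius `dist` or the k nearest points.
find_neighbors <- function(X, Y, i, dist = NULL, k = 10) {
  d <- sqrt((X - X[i])^2 + (Y - Y[i])^2)  # Euclidean distances in coordinate space
  d[i] <- Inf                              # exclude the point itself
  if (!is.null(dist)) {
    which(d <= dist)                       # radius search; k is ignored
  } else {
    order(d)[seq_len(k)]                   # indices of the k nearest neighbors
  }
}

set.seed(1)
X <- runif(50)
Y <- runif(50)
nb_k <- find_neighbors(X, Y, i = 1, k = 10)      # kNN neighborhood
nb_d <- find_neighbors(X, Y, i = 1, dist = 0.2)  # radius neighborhood
length(nb_k)  # 10
```

With a radius, the number of neighbors differs from point to point; with k, it is the same for every observation.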
Details
For this diagnostic tool, the number of neighbors is fixed, while the proportion of
neighbors, propneighb (called beta), is varied. For each observation, the degree of isolation
from a fraction 1-beta of its neighbors is computed. The neighborhood can be defined either
by a maximum Euclidean distance (dist) or by k nearest neighbors (k).
The critical value for outliers is the chisqqu quantile of the chi-square distribution.
Observations can also be selected via indices; the corresponding lines in
the plots are then highlighted.
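The quantities mentioned above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the data are simulated, the classical mean and covariance are used for simplicity (a robust estimate may be the more natural choice for outlier detection), the degrees of freedom are assumed to equal the number of variables, and the "neighbors" of observation 1 are an arbitrary index set.

```r
set.seed(42)
p <- 3
dat <- matrix(rnorm(100 * p), ncol = p)  # simulated data, 100 observations
dat[1, ] <- c(6, 6, 6)                   # plant one obvious global outlier

chisqqu <- 0.975
crit <- qchisq(chisqqu, df = p)          # cutoff for squared distances (assumes df = p)

# Global squared Mahalanobis distances to the data center.
# Assumption: classical mean and covariance instead of a robust estimate.
S <- cov(dat)
md2_global <- mahalanobis(dat, center = colMeans(dat), cov = S)
global_out <- which(md2_global > crit)   # candidates for global outliers

# Pairwise squared Mahalanobis distances from observation 1 to all observations
md2_pair <- mahalanobis(dat, center = dat[1, ], cov = S)

# Sorted distances from observation 1 to an arbitrary set of 10 "neighbors"
neighbors <- 2:11
sort(sqrt(md2_pair[neighbors]))
```

Observation 1 ends up in global_out because its squared distance is far above crit; the sorted pairwise distances are the kind of values the plot displays against the sorted index.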
Value
ret
list containing indices of regular and outlying observations
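A call might look like the following sketch. It assumes that the mvoutlier package is installed and that its bundled example objects dat, X, and Y can be loaded with data(); the highlighted indices 1:3 are arbitrary and chosen only for illustration. The names of the list elements are not given above, so the result is inspected with str() rather than accessed by name.

```r
# Run only if the package is available (assumption: mvoutlier provides
# locoutPercent and the example objects dat, X, Y).
if (requireNamespace("mvoutlier", quietly = TRUE)) {
  library(mvoutlier)
  data(dat)  # multivariate data without coordinates
  data(X)    # X coordinates
  data(Y)    # Y coordinates

  # Fixed neighborhood of k = 10 nearest neighbors; highlight three
  # arbitrarily chosen observations in the plots.
  res <- locoutPercent(dat, X, Y, k = 10, chisqqu = 0.975, indices = 1:3)
  str(res)   # list with indices of regular and outlying observations
}
```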