Last data update: 2014.03.03

R: Outlier detection
outlier                R Documentation

Outlier detection

Description

Identifies outliers based on the nearest neighbour criterion. The function first computes a distance matrix, using the correlation coefficient r converted to a distance, dr = (1 - r)/2. Variables whose nearest neighbour distance exceeds the parameter thresh are considered outliers.
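
The criterion described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name nn_outliers and the toy data are hypothetical, and the sketch assumes that releves (rows) are the units being compared.

```r
# Hypothetical helper illustrating the nearest neighbour criterion;
# not the package's own code.
nn_outliers <- function(veg, thresh = 0.2) {
  m <- as.matrix(veg)
  if (is.null(rownames(m))) rownames(m) <- seq_len(nrow(m))
  r <- cor(t(m))               # correlation between rows (assumed: releves)
  dr <- (1 - r) / 2            # correlation converted to a distance in [0, 1]
  diag(dr) <- Inf              # a row is not its own neighbour
  nn_idx <- unname(apply(dr, 1, which.min))
  nn_dist <- dr[cbind(seq_len(nrow(dr)), nn_idx)]
  data.frame(
    releve = rownames(dr),
    neighbour = rownames(dr)[nn_idx],
    dist = nn_dist,
    outlier = nn_dist > thresh   # flagged when the nearest neighbour is too far
  )
}

# Toy data: three similar releves and one with a reversed species pattern
veg <- data.frame(
  sp1 = c(5, 4, 5, 0),
  sp2 = c(3, 3, 4, 1),
  sp3 = c(0, 1, 0, 6),
  row.names = c("r1", "r2", "r3", "r4")
)
res <- nn_outliers(veg, thresh = 0.2)
print(res)
```

In this toy example only r4 is flagged, because even its closest neighbour is strongly negatively correlated with it.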

Usage

outlier(veg, thresh, y, ...)
outly(veg, thresh = 0.2, y = 0.5)

## Default S3 method:
outlier(veg, thresh, y, ...)
## S3 method for class 'outlier'
plot(x, ...)
## S3 method for class 'outlier'
print(x, ...)

Arguments

veg

A vegetation data frame with releves as rows and species as columns

thresh

Threshold nearest neighbour distance for outliers

y

Exponent for the transformation of species scores: x' = x^y (x raised to the power y)

x

An object of class "outlier"

...

Parameter out.seq, the plotting interval
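
As a small illustration of the argument y, the sketch below applies the score transformation to a toy data frame. It assumes that the notation x exp(y) denotes x raised to the power y, so that y = 0.5 is a square root transformation; the data and the name veg_t are hypothetical.

```r
# Toy species scores (hypothetical data)
veg <- data.frame(sp1 = c(0, 1, 4, 9), sp2 = c(16, 25, 0, 1))

y <- 0.5
veg_t <- veg^y   # assumed meaning of the transformation: x' = x^y
print(veg_t)
```

Values of y below 1 reduce the weight of large scores, while y = 1 leaves the data unchanged.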

Value

An object of class "outlier" with at least the following items:

threshold

Threshold nearest neighbour distance for considering outliers

y

Exponent used for the transformation of species scores: x' = x^y (x raised to the power y)

rel.names

All row names

neigh.names

Names of the corresponding nearest neighbours

neigh.dist

Distance to the nearest neighbour

olddim

Dimensions of data frame veg

newdim

Dimensions of data frame with outliers erased

new.data

Vegetation data frame without outliers

pco.points

The PCO ordination scores used for plotting

Author(s)

Otto Wildi

References

Wildi, O. 2013. Data Analysis in Vegetation Ecology. 2nd ed. Wiley-Blackwell, Chichester.

Examples

o.outlier <- outlier(nveg, thresh = 0.2, y = 0.5)
o.outlier                                    # a list of all variables
plot(o.outlier)                              # nearest neighbour histogram and
                                             # PCO ordination
