Last data update: 2014.03.03
dm.mahalanobis R Documentation
Distance measure using Mahalanobis distance for outlier detection
Description
Implements the Mahalanobis distance measure for outlier detection. In addition to the distances themselves, the function can draw boxplots that mark the potential outlier(s), which gives some insight at the early stage of a data-cleansing task.
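For readers unfamiliar with the measure, the following is a minimal base R sketch of a Mahalanobis distance computed from the column medians. It uses mahalanobis() from the stats package, which returns squared distances. Whether dm.mahalanobis computes its distances in exactly this way (squared or not, and with which covariance estimate) is an assumption, not something this page states. The data and the planted outlier are invented for illustration.

```r
# Minimal sketch: squared Mahalanobis distance of each row from the column medians.
set.seed(1)                                   # fixed seed so the sketch is repeatable
df <- data.frame(replicate(3, rnorm(30)))     # 30 rows, 3 numeric columns
df[30, ] <- c(8, -8, 8)                       # plant one obvious outlier in the last row

centre <- apply(df, 2, median)                # datum point: median of each column
S      <- cov(df)                             # sample covariance matrix

# Base R helper from the stats package; it returns the *squared* distance
d2 <- mahalanobis(df, center = centre, cov = S)

# The same quantity written out from the definition (x - m)' S^-1 (x - m)
X         <- sweep(as.matrix(df), 2, centre)  # subtract the centre from every row
d2_manual <- rowSums((X %*% solve(S)) * X)

which.max(d2)                                 # row with the largest distance
```

The manual computation is included only to show what the helper does; in practice the single call to mahalanobis() is enough.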
Usage
dm.mahalanobis(data, from="median", p=10, plot=FALSE, v.index=NULL, layout=NULL)
Arguments
data
Dataframe
from
Datum point from which the distance is measured
"mean"
Mean of each column
"median"
Median of each column (default)
p
Percentage of rows, ranked by distance, to be noted as potential outlier(s) (default of 10)
plot
Logical switch for drawing the boxplot(s) (default of FALSE)
v.index
Numeric vector indicating the column(s) to be shown in the boxplot(s).
The default value of NULL shows all columns.
layout
Numeric vector indicating the dimensions of the boxplot layout.
With the default value of NULL, the function finds an optimal layout.
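The roles of from and p can be sketched in base R as follows. The helper flag_suspects is hypothetical and is not part of the DJL package; the use of stats::mahalanobis() with the sample covariance and the ceiling() rounding rule for the cut-off are both assumptions made for illustration.

```r
# Hypothetical helper, not part of DJL: mimics the documented roles of `from` and `p`.
flag_suspects <- function(data, from = c("median", "mean"), p = 10) {
  from   <- match.arg(from)
  x      <- as.matrix(data)
  # Datum point: mean or median of each column
  centre <- if (from == "mean") colMeans(x) else apply(x, 2, median)
  # Squared Mahalanobis distance of every row from the datum point
  d      <- mahalanobis(x, center = centre, cov = cov(x))
  # Number of rows to flag: p percent of all rows (assumed rounding rule)
  k      <- ceiling(nrow(x) * p / 100)
  # Row numbers of the k most distant rows
  head(order(d, decreasing = TRUE), k)
}

set.seed(42)
df       <- data.frame(replicate(6, sample(0:100, 50)))
suspects <- flag_suspects(df, from = "median", p = 10)
length(suspects)   # 5 rows, i.e. 10% of 50
```

With 50 rows and p = 10, the sketch flags five rows, which matches the number of rows reported under $suspect in the example run on this page.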
Value
$dist
Mahalanobis distance of each row from the datum point specified by from
$excluded
Row number(s) of any excluded row(s)
$order
Row numbers ordered by decreasing distance
$suspect
Row number(s) of the potential outlier(s)
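How the returned components relate to one another can be checked against the example run shown under Results. The sketch below copies the printed $dist values into a plain vector and reorders them with base R; it does not call dm.mahalanobis, and the variable names d and ord are chosen here for illustration.

```r
# Distances copied from the $dist output of the example run on this page (rows 1 to 50)
d <- c( 5.115846,  5.885948,  6.246973,  5.147832, 10.989481,  2.296720,  1.793039,  4.939741,
        6.890873,  8.640631,  6.569473,  8.053194,  8.007438,  2.082752,  2.626381,  6.935367,
        5.757327,  3.002721,  3.371891,  6.894247,  7.080088, 11.803161,  9.301110,  4.610362,
        3.507237,  3.799064, 10.572960,  6.415286,  5.641336,  7.022880,  6.375719,  5.886766,
        7.386680, 10.258004,  6.662484,  8.858403,  4.408337,  3.029022,  6.560825,  4.704882,
        7.805392,  4.347984,  7.705506,  5.043657,  2.917270,  5.240843,  7.582324,  8.955989,
        3.999685,  2.338319)

# Row numbers sorted from the largest distance to the smallest, as in $order
ord <- order(d, decreasing = TRUE)

# The first 10% of that ordering (5 of 50 rows) reproduces $suspect
head(ord, 5)   # 22 5 27 34 23
```

In that run, $suspect is simply the leading part of $order, so a different value of p would be expected to lengthen or shorten the same list.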
Author(s)
Dong-Joon Lim, PhD
References
Hair, Joseph F., et al. Multivariate data analysis. Vol. 7. Upper Saddle River, NJ: Pearson Prentice Hall, 2006.
Examples
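# Load the package that provides dm.mahalanobis
library(DJL)
# Generate a sample dataframe: 50 rows, 6 columns of integers drawn from 0 to 100
df <- data.frame(replicate(6, sample(0:100, 50)))
# Compute the distances and draw the boxplots
dm.mahalanobis(df, plot = TRUE)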
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)
> library(DJL)
Loading required package: car
Loading required package: combinat
Attaching package: 'combinat'
The following object is masked from 'package:utils':
combn
Loading required package: lpSolveAPI
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/DJL/dm.mahalanobis.Rd_%03d_medium.png", width=480, height=480)
> # Generate a sample dataframe
> df<-data.frame(replicate(6,sample(0:100,50)))
>
> # go
> dm.mahalanobis(df,plot=TRUE)
$dist
        1         2         3         4         5         6         7         8
 5.115846  5.885948  6.246973  5.147832 10.989481  2.296720  1.793039  4.939741
        9        10        11        12        13        14        15        16
 6.890873  8.640631  6.569473  8.053194  8.007438  2.082752  2.626381  6.935367
       17        18        19        20        21        22        23        24
 5.757327  3.002721  3.371891  6.894247  7.080088 11.803161  9.301110  4.610362
       25        26        27        28        29        30        31        32
 3.507237  3.799064 10.572960  6.415286  5.641336  7.022880  6.375719  5.886766
       33        34        35        36        37        38        39        40
 7.386680 10.258004  6.662484  8.858403  4.408337  3.029022  6.560825  4.704882
       41        42        43        44        45        46        47        48
 7.805392  4.347984  7.705506  5.043657  2.917270  5.240843  7.582324  8.955989
       49        50
 3.999685  2.338319
$excluded
NULL
$order
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
22  5 27 34 23 48 36 10 12 13 41 43 47 33 21 30 16 20  9 35 11 39 28 31  3 32
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
 2 17 29 46  4  1 44  8 40 24 37 42 49 26 25 19 38 18 45 15 50  6 14  7
$suspect
 1  2  3  4  5
22  5 27 34 23
> dev.off()
null device
1
>