Last data update: 2014.03.03

dm.mahalanobis    R Documentation

Distance measure using Mahalanobis distance for outlier detection

Description

Computes the Mahalanobis distance of each row of a data frame for outlier detection. In addition to the distances, boxplots showing the potential outlier(s) can be drawn to support the early stage of a data-cleansing task.
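As a rough illustration of the measure described above, the following base-R sketch computes a Mahalanobis distance for every row from either the column medians or the column means. It is not the package's code: center_distance() is a hypothetical helper, and the use of the sample covariance of all rows is an assumption. Note that stats::mahalanobis() returns the squared distance.

```r
# Sketch only: base-R Mahalanobis distance from a chosen reference point.
# center_distance() is a hypothetical helper, not part of the DJL package.
center_distance <- function(data, from = c("median", "mean")) {
  from <- match.arg(from)
  x <- as.matrix(data)
  # Reference point: column medians or column means
  center <- if (from == "median") apply(x, 2, median) else colMeans(x)
  # stats::mahalanobis() returns the SQUARED distance of each row;
  # using the sample covariance of all rows is an assumption
  d2 <- mahalanobis(x, center = center, cov = cov(x))
  names(d2) <- seq_len(nrow(x))
  d2
}

set.seed(1)
df <- data.frame(replicate(3, sample(0:100, 20)))
d_median <- center_distance(df, "median")
d_mean <- center_distance(df, "mean")
```

Measuring from the median makes the reference point less sensitive to the extreme rows that the distance is meant to expose, which may be why it is the default here.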

Usage

dm.mahalanobis(data, from="median", p=10, plot=FALSE, v.index=NULL, layout=NULL)

Arguments

data

Data frame

from

Datum point from which the distance is measured
"mean" Mean of each column
"median" Median of each column (default)

p

Percentage of rows, taken from the largest distances, that are noted as potential outlier(s) (default of 10)

plot

Logical switch for drawing the boxplot(s) (default of FALSE)

v.index

Numeric vector indicating the column(s) to be shown in the boxplot(s). The default value of NULL shows all columns.

layout

Numeric vector indicating the dimensions of the boxplot layout. The default value of NULL lets the function choose a layout.
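The p argument can be read as a cut-off on the ranked distances. The sketch below shows one plausible reading: rank the rows by decreasing distance and note the top p percent. flag_suspects() is a hypothetical helper, the distances are made-up values, and the rounding rule (ceiling) is an assumption, not something this page specifies. It is at least consistent with the example run on this page, where 50 rows and p=10 yield five suspects.

```r
# Sketch only: flag the top p percent of rows by distance.
# flag_suspects() is a hypothetical helper; the ceiling() rounding is an assumption.
flag_suspects <- function(dist, p = 10) {
  ord <- order(dist, decreasing = TRUE)      # row numbers, largest distance first
  n_flag <- ceiling(length(dist) * p / 100)  # number of rows that p percent amounts to
  list(order = ord, suspect = head(ord, n_flag))
}

# Hypothetical distances for ten rows
d <- c(5.1, 5.9, 6.2, 5.0, 11.0, 2.3, 1.8, 4.9, 6.9, 8.6)
res <- flag_suspects(d, p = 20)
res$suspect
```

A percentage cut-off always flags some rows, even in clean data, so the flagged rows are best treated as candidates for inspection and not as confirmed outliers.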

Value

$dist

Mahalanobis distance of each row from the point specified by from

$excluded

Row number(s) of the excluded row(s)

$order

Row numbers in order of decreasing distance

$suspect

Row number(s) of the potential outlier(s)
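Because the returned components are row numbers, they can be used directly to split a data frame during cleansing. The sketch below assumes a list shaped like the return value described above. Here res is a stand-in built by hand, with placeholder row numbers copied from the sample output on this page, so that the snippet runs without the package.

```r
# Sketch only: separate flagged rows from the rest of a data frame.
# `res` is a hand-built stand-in for the list returned by dm.mahalanobis.
set.seed(1)
df <- data.frame(replicate(6, sample(0:100, 50)))
res <- list(suspect = c(22, 5, 27, 34, 23))  # placeholder row numbers

# setdiff() stays correct when nothing is flagged;
# df[-integer(0), ] would return zero rows instead of all rows
keep <- setdiff(seq_len(nrow(df)), res$suspect)
cleaned <- df[keep, , drop = FALSE]
flagged <- df[res$suspect, , drop = FALSE]
```

Keeping the flagged rows in their own data frame makes it easy to inspect them before deciding whether to discard them.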

Author(s)

Dong-Joon Lim, PhD

References

Hair, Joseph F., et al. Multivariate data analysis. Vol. 7. Upper Saddle River, NJ: Pearson Prentice Hall, 2006.

Examples

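# Load the package that provides dm.mahalanobis
library(DJL)

# Generate a sample data frame: 6 columns, each with 50 integers sampled from 0 to 100
df <- data.frame(replicate(6, sample(0:100, 50)))

# Compute the distances from the column medians and draw the boxplots
dm.mahalanobis(df, plot = TRUE)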

Results


> library(DJL)
Loading required package: car
Loading required package: combinat

Attaching package: 'combinat'

The following object is masked from 'package:utils':

    combn

Loading required package: lpSolveAPI
> # Generate a sample dataframe
> df<-data.frame(replicate(6,sample(0:100,50)))
> 
> # go
> dm.mahalanobis(df,plot=TRUE)
$dist
        1         2         3         4         5         6         7         8 
 5.115846  5.885948  6.246973  5.147832 10.989481  2.296720  1.793039  4.939741 
        9        10        11        12        13        14        15        16 
 6.890873  8.640631  6.569473  8.053194  8.007438  2.082752  2.626381  6.935367 
       17        18        19        20        21        22        23        24 
 5.757327  3.002721  3.371891  6.894247  7.080088 11.803161  9.301110  4.610362 
       25        26        27        28        29        30        31        32 
 3.507237  3.799064 10.572960  6.415286  5.641336  7.022880  6.375719  5.886766 
       33        34        35        36        37        38        39        40 
 7.386680 10.258004  6.662484  8.858403  4.408337  3.029022  6.560825  4.704882 
       41        42        43        44        45        46        47        48 
 7.805392  4.347984  7.705506  5.043657  2.917270  5.240843  7.582324  8.955989 
       49        50 
 3.999685  2.338319 

$excluded
NULL

$order
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
22  5 27 34 23 48 36 10 12 13 41 43 47 33 21 30 16 20  9 35 11 39 28 31  3 32 
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 
 2 17 29 46  4  1 44  8 40 24 37 42 49 26 25 19 38 18 45 15 50  6 14  7 

$suspect
 1  2  3  4  5 
22  5 27 34 23 
