Last data update: 2014.03.03

R: A Plot of SD Index Values for K-Means Clustering Solutions
SDIndex    R Documentation

Description

Provides a plot of SD cluster validation index values for different numbers of k-means clusters for a common underlying dataset. The number of clusters that has the lowest value of the SD index represents the "best" solution under the criteria used to construct the SD index.

Usage

SDIndex(x, minClust, maxClust, iter.max=10, num.seeds=10)

Arguments

x

A numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns).

minClust

The minimum number of clusters to be considered for a solution.

maxClust

The maximum number of clusters to be considered for a solution.

iter.max

The maximum number of iterations allowed for a solution.

num.seeds

The number of different starting random seeds to use for a solution with a given number of clusters.
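
As a rough illustration of how iter.max and num.seeds interact, the sketch below runs base R's kmeans once per seed and keeps the fit with the lowest total within-cluster sum of squares. The helper best_kmeans, the seeding scheme (seeds 1 to num.seeds), and the selection rule are assumptions made for illustration; SDIndex itself may handle seeds differently.

```r
# Hypothetical helper, not part of BCA: best k-means fit over several seeds.
best_kmeans <- function(x, k, iter.max = 10, num.seeds = 10) {
  x <- as.matrix(x)
  best <- NULL
  for (seed in seq_len(num.seeds)) {
    set.seed(seed)  # assumed seeding scheme: seeds 1..num.seeds
    # Non-convergence within iter.max only raises a warning, so silence it here.
    fit <- suppressWarnings(kmeans(x, centers = k, iter.max = iter.max))
    # Keep the fit with the smallest total within-cluster sum of squares.
    if (is.null(best) || fit$tot.withinss < best$tot.withinss) best <- fit
  }
  best
}

fit <- best_kmeans(iris[, 1:4], k = 3)
table(fit$cluster)  # cluster sizes of the retained solution
```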

Details

The SD index corresponds to the weighted sum of the average "scattering" of points within clusters and the inverse of the total separation between clusters. The average scattering measure is based on the average sum of the squared differences between a cluster's centroid and all the points in that cluster, while total separation is measured by the sum of the squared distances between cluster centroids. A solution with a low average scattering and a low value of the inverse total separation is considered to be better than a solution with higher levels of these two measures.
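
The verbal description above can be turned into a small base R sketch. It is not the BCA source code: the function name sd_like_index, the weight alpha, the lack of any normalization, and the reading of "average scattering" as the mean within-cluster sum of squares across clusters are all assumptions made for illustration.

```r
# Illustrative only: follows the verbal description, not BCA's implementation.
sd_like_index <- function(x, k, alpha = 1, iter.max = 10, nstart = 10) {
  x <- as.matrix(x)
  fit <- kmeans(x, centers = k, iter.max = iter.max, nstart = nstart)
  # Average scattering: mean over clusters of the within-cluster sum of squares.
  scatter <- mean(fit$withinss)
  # Total separation: sum of squared distances over all pairs of centroids.
  separation <- sum(dist(fit$centers)^2)
  # Weighted sum of scattering and inverse separation; alpha is an assumed weight.
  alpha * scatter + 1 / separation
}

set.seed(1)
ks <- 2:6
values <- sapply(ks, function(k) sd_like_index(iris[, 1:4], k))
ks[which.min(values)]  # candidate "best" number of clusters under this sketch
```

Because the two terms are on different scales, the choice of alpha (and any normalization the real index applies) strongly affects which solution comes out lowest, so the values from this sketch should not be expected to match the plot produced by SDIndex.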

Value

The function returns invisibly. It is called for its side effect, the plot it produces.

Author(s)

Dan Putler

References

M. Halkidi, Y. Batistakis, M. Vazirgiannis (2001), On Clustering Validation Techniques, Journal of Intelligent Information Systems, 17(2/3).

See Also

KMeans, SD.clv

Examples

  library(BCA)                 # package that provides SDIndex
  data(iris)
  iris.data <- iris[,1:4]      # keep only the four numeric measurement columns
  SDIndex(iris.data, minClust=2, maxClust=6, iter.max=10, num.seeds=10)

Results


> library(BCA)
Error in library(BCA) : there is no package called 'BCA'
Execution halted