Let the scatter of a set X, denoted sigma(X), be defined as the
vector of variances computed for each dimension.
The average scattering for clusters is defined as:
Scatt = (1/|C|) * sum{forall i in 1:|C|} ||sigma(Ci)||/||sigma(X)||
where:
|C| - number of clusters,
i - cluster id,
Ci - cluster with id 'i',
X - set with all objects,
||x|| - sqrt(x*x').
The standard deviation is defined as:
stdev = (1/|C|) * sqrt( sum{forall i in 1:|C|} ||sigma(Ci)|| )
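The two formulas above can be traced by hand with a small base R sketch. It is not the clv implementation: the helper names (scatter_vec, vec_norm, scatt_sketch) are hypothetical, and it assumes the population variance (dividing by n rather than n - 1), which may differ from the normalization the package uses.

```r
# Hypothetical helpers, base R only; not the clv implementation.

# sigma(X): vector of per-dimension variances.
# Assumption: population variance (divide by n, not n - 1).
scatter_vec <- function(x) {
  x <- as.matrix(x)
  colMeans(sweep(x, 2, colMeans(x))^2)
}

# ||x|| = sqrt(x * x')
vec_norm <- function(x) sqrt(sum(x^2))

scatt_sketch <- function(data, clusters) {
  data <- as.matrix(data)
  ids <- sort(unique(clusters))
  # ||sigma(Ci)|| for every cluster
  cluster_norms <- vapply(
    ids,
    function(i) vec_norm(scatter_vec(data[clusters == i, , drop = FALSE])),
    numeric(1)
  )
  list(
    # Scatt = (1/|C|) * sum ||sigma(Ci)|| / ||sigma(X)||
    Scatt = mean(cluster_norms / vec_norm(scatter_vec(data))),
    # stdev = (1/|C|) * sqrt( sum ||sigma(Ci)|| )
    stdev = sqrt(sum(cluster_norms)) / length(ids)
  )
}

# Toy data: two well-separated clusters in two dimensions
toy <- rbind(c(0, 0), c(0, 2), c(10, 0), c(10, 2))
res <- scatt_sketch(toy, c(1L, 1L, 2L, 2L))
```

For this toy data each cluster has sigma(Ci) = (0, 1) and the whole set has sigma(X) = (25, 1), so under the stated variance assumption res$Scatt equals 1/sqrt(626) and res$stdev equals sqrt(2)/2. A small Scatt value indicates clusters that are compact relative to the spread of the whole data set.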
Value
A list with three components is returned:
Scatt - average scattering for clusters,
stdev - standard deviation value,
cluster.center - numeric matrix where columns correspond to variables and rows to cluster centers.
Examples
# load and prepare data
library(clv)
data(iris)
iris.data <- iris[,1:4]
# cluster data
agnes.mod <- agnes(iris.data) # create cluster tree
v.pred <- as.integer(cutree(agnes.mod,5)) # "cut" the tree
# compute Scatt index
scatt <- clv.Scatt(iris.data, v.pred)