Last data update: 2014.03.03

R: Variation of Information Distance for Clusterings
vi.dist    R Documentation

Variation of Information Distance for Clusterings

Description

Computes the 'variation of information' distance of Meila (2007) between two clusterings/partitions of the same objects.

Usage

vi.dist(cl1, cl2, parts = FALSE, base = 2)

Arguments

cl1,cl2

vectors of cluster memberships (must have the same length).

parts

logical; should the two conditional entropies also be returned?

base

base of logarithm used for computation of entropy and mutual information.

Details

The variation of information distance is the sum of the two conditional entropies of one clustering given the other. For details see Meila (2007).
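In symbols, VI(cl1, cl2) = H(cl1 | cl2) + H(cl2 | cl1), and each conditional entropy equals the joint entropy minus a marginal entropy. The following is a minimal sketch of that definition computed from the contingency table of the two label vectors. It is not the package's own implementation: the function name vi_sketch and the names of the returned elements are hypothetical, chosen only for illustration.

```r
# Hypothetical helper (not part of the package): VI distance computed
# directly from the contingency table of two label vectors.
vi_sketch <- function(cl1, cl2, base = 2) {
  stopifnot(length(cl1) == length(cl2))

  # Joint distribution of the two labelings and its marginals.
  p12 <- table(cl1, cl2) / length(cl1)
  p1 <- rowSums(p12)
  p2 <- colSums(p12)

  # Shannon entropy, treating 0 * log(0) as 0.
  entropy <- function(p) {
    p <- p[p > 0]
    -sum(p * log(p, base = base))
  }

  h12 <- entropy(p12)               # joint entropy H(cl1, cl2)
  h1_given_2 <- h12 - entropy(p2)   # H(cl1 | cl2)
  h2_given_1 <- h12 - entropy(p1)   # H(cl2 | cl1)

  # Element names are an assumption made for this sketch.
  c(vi = h1_given_2 + h2_given_1,
    h1_given_2 = h1_given_2,
    h2_given_1 = h2_given_1)
}
```

Under this definition the distance is zero for identical partitions, symmetric in its arguments, and unchanged when cluster labels are renamed. If the sketch matches the definition used by vi.dist, vi_sketch(cl1, cl2)["vi"] should agree with vi.dist(cl1, cl2) up to floating-point error.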

Value

The VI distance. If parts = TRUE, the two conditional entropies are appended to the result.

Author(s)

Arno Fritsch, arno.fritsch@tu-dortmund.de

References

Meila, M. (2007) Comparing Clusterings - an Information Based Distance. Journal of Multivariate Analysis, 98, 873–895.

See Also

arandi

Examples

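 # two random clusterings of 10 objects that agree on the first 5
 cl1 <- sample(1:3, 10, replace = TRUE)
 cl2 <- c(cl1[1:5], sample(1:3, 5, replace = TRUE))
 vi.dist(cl1, cl2)                # VI distance only
 vi.dist(cl1, cl2, parts = TRUE)  # also return the two conditional entropies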
