hclust
R Documentation
Fast hierarchical, agglomerative clustering of dissimilarity data
Description
This function implements hierarchical clustering with the same interface as hclust from the stats package but with much faster algorithms.
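As a rough illustration of the shared interface, the sketch below runs the same call through stats::hclust and, if the package is installed, through fastcluster::hclust, then compares the merge heights. Single linkage is chosen here only because its merge heights do not depend on how ties are broken; the variable names are illustrative and not part of either package.

```r
# Dissimilarities for a built-in data set (50 observations).
d <- dist(USArrests)

# Reference result from the stats package.
hc_stats <- stats::hclust(d, method = "single")

# Same call through fastcluster, if it is installed (assumption: it may not be).
if (requireNamespace("fastcluster", quietly = TRUE)) {
  hc_fast <- fastcluster::hclust(d, method = "single")
  # Compare the sorted merge heights of the two dendrograms.
  heights_match <- isTRUE(all.equal(sort(hc_fast$height), sort(hc_stats$height)))
  print(heights_match)
}
```

Because the argument list is the same, switching between the two implementations should only require changing the namespace prefix, or loading fastcluster so that its hclust masks the stats version.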
Usage
hclust(d, method="complete", members=NULL)
Arguments
d
a dissimilarity structure as produced by dist.
method
the agglomeration method to be used. This must be (an unambiguous abbreviation of) one of "single", "complete", "average", "mcquitty", "ward.D", "ward.D2", "centroid" or "median".
members
NULL or a vector whose length equals the number of observations.
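The three arguments can be sketched as follows. This is a minimal example, not taken from the package: hclust_fn is a hypothetical helper that uses fastcluster::hclust when the package is installed and falls back to stats::hclust otherwise, and the all-ones members vector is an assumption used to mimic the default of one observation per initial cluster.

```r
# Hypothetical helper: prefer fastcluster, fall back to stats (same interface).
hclust_fn <- if (requireNamespace("fastcluster", quietly = TRUE)) {
  fastcluster::hclust
} else {
  stats::hclust
}

# d: a dissimilarity structure as produced by dist().
d <- dist(USArrests)

# method: an unambiguous abbreviation is accepted, so "ave" selects "average".
hc_abbrev <- hclust_fn(d, method = "ave")
hc_full   <- hclust_fn(d, method = "average")
same_merges <- identical(hc_abbrev$merge, hc_full$merge)

# members: NULL, or one entry per observation in d.
# Assumption: a vector of ones describes singleton clusters, like the default.
ones <- rep(1, attr(d, "Size"))
hc_members <- hclust_fn(d, method = "average", members = ones)
same_heights <- isTRUE(all.equal(hc_members$height, hc_full$height))
```

A members vector with other values is meant for restarting from an intermediate clustering, where each row of the dissimilarity input already represents a cluster of several observations.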
Details
See the documentation of the original function hclust in the stats package.
A comprehensive user's manual, fastcluster.pdf, is available as a vignette. Open it from the R command line with vignette('fastcluster').
Value
An object of class 'hclust'. It encodes a stepwise dendrogram.
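To make the returned structure more concrete, the sketch below inspects a few components of the result. The component names merge, height and order are those documented for stats::hclust objects; hclust_fn is a hypothetical helper that falls back to stats::hclust when fastcluster is not installed.

```r
# Hypothetical helper: prefer fastcluster, fall back to stats (same interface).
hclust_fn <- if (requireNamespace("fastcluster", quietly = TRUE)) {
  fastcluster::hclust
} else {
  stats::hclust
}

n  <- nrow(USArrests)                       # number of observations
hc <- hclust_fn(dist(USArrests), method = "average")

class(hc)            # the S3 class of the result
dim(hc$merge)        # one row per merge step, two columns
length(hc$height)    # one height per merge step
length(hc$order)     # a leaf ordering suitable for plotting

# In merge, negative entries denote single observations and positive
# entries refer to clusters formed in earlier steps, so the first step
# can only join two single observations.
first_step <- hc$merge[1, ]
all(first_step < 0)
```

With n observations there are n - 1 merge steps, which is why merge has n - 1 rows and height has n - 1 entries.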
Examples
# Taken and modified from stats::hclust
#
# hclust(...) # new method
# stats::hclust(...) # old method
require(fastcluster)
require(graphics)
hc <- hclust(dist(USArrests), "ave")
plot(hc)
plot(hc, hang = -1)
## Do the same with centroid clustering and squared Euclidean distance,
## cut the tree into ten clusters and reconstruct the upper part of the
## tree from the cluster centers.
hc <- hclust(dist(USArrests)^2, "cen")
memb <- cutree(hc, k = 10)
cent <- NULL
for (k in 1:10) {
  cent <- rbind(cent, colMeans(USArrests[memb == k, , drop = FALSE]))
}
hc1 <- hclust(dist(cent)^2, method = "cen", members = table(memb))
opar <- par(mfrow = c(1, 2))
plot(hc, labels = FALSE, hang = -1, main = "Original Tree")
plot(hc1, labels = FALSE, hang = -1, main = "Re-start from 10 clusters")
par(opar)