Last data update: 2014.03.03

R: kGmedian
kGmedian    R Documentation

kGmedian

Description

Fast k-medians clustering based on recursive averaged stochastic gradient algorithms. The procedure is similar to k-means clustering performed recursively with the MacQueen algorithm. The advantage of kGmedian over the MacQueen strategy is that it minimizes a sum of norms instead of a sum of squared norms, which makes it more robust to outlying values.
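The robustness claim can be illustrated with a small base R sketch. It does not call kGmedian; it only compares the minimizer of a sum of squared norms (the mean) with a numerical minimizer of a sum of norms (a geometric median) on data containing one gross outlier. The helper name sum_of_norms and the use of optim are choices made for this illustration, not part of the package.

```r
# Illustration only: base R, not the Gmedian implementation.
set.seed(1)
pts <- matrix(rnorm(200, sd = 0.3), ncol = 2)  # 100 points near the origin
pts <- rbind(pts, c(50, 50))                   # one gross outlier

# Minimizer of the sum of squared norms: the coordinate-wise mean.
center_mean <- colMeans(pts)

# Sum of Euclidean distances from candidate centre m to the rows of X
# (hypothetical helper for this sketch).
sum_of_norms <- function(m, X) sum(sqrt(rowSums(sweep(X, 2, m)^2)))

# Minimizer of the sum of norms, found numerically with optim().
center_median <- optim(center_mean, sum_of_norms, X = pts)$par

# Distance of each estimate from the origin, the centre of the bulk.
sqrt(sum(center_mean^2))
sqrt(sum(center_median^2))
```

The mean is pulled towards the outlier, while the sum-of-norms minimizer should stay close to the bulk of the data.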

Usage

kGmedian(X, ncenters=2, gamma=1, alpha=0.75, nstart = 10, nstartkmeans = 10)

Arguments

X

matrix, with n observations (rows) in dimension d (columns).

ncenters

Either the number of clusters, say k, or a set of initial (distinct) cluster centres. If a number, the initial centres are chosen as the output of the kmeans function computed with the MacQueen algorithm.

gamma

Value of the constant controlling the descent steps (see Details).

alpha

Rate of decrease of the descent steps.

nstart

Number of times the algorithm is run, with random sets of initial centers chosen among the observations.

nstartkmeans

Number of initialization points used in the kmeans function when choosing the starting point of kGmedian.
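As a rough sketch of the initialization described for ncenters and nstartkmeans, the starting centres can be reproduced by hand with the kmeans function from the stats package and then supplied as a matrix. That nstartkmeans corresponds to the nstart argument of kmeans is an assumption made here; the data, the seed and iter.max = 50 are arbitrary choices for the illustration.

```r
set.seed(123)
X <- rbind(matrix(rnorm(200, sd = 0.3), ncol = 2),
           matrix(rnorm(200, mean = 2, sd = 0.3), ncol = 2))

# k-means with the MacQueen algorithm, restarted 10 times
# (assumed here to play the role of nstartkmeans = 10).
init <- kmeans(X, centers = 2, algorithm = "MacQueen",
               nstart = 10, iter.max = 50)$centers

# Supply the centres explicitly instead of a number of clusters.
# Only runs if the Gmedian package is installed.
if (requireNamespace("Gmedian", quietly = TRUE)) {
  fit <- Gmedian::kGmedian(X, ncenters = init)
}
```

Passing a matrix is useful when the starting centres should be fixed across runs, for example to compare different values of gamma or alpha from the same starting point.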

Details

See Cardot, Cenac and Monnez (2012).
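For orientation, the following is a minimal base R sketch of an averaged stochastic gradient estimate of a single geometric median, in the spirit of the cited papers. It assumes descent steps of the form gamma / n^alpha, which is how gamma and alpha are read here; the function name asg_median is hypothetical and this is not the package's code. The k-medians procedure additionally assigns each observation to its nearest centre, which is not shown.

```r
# Sketch only: averaged stochastic gradient estimate of a geometric median.
# Assumes steps gamma / n^alpha; not the Gmedian implementation.
asg_median <- function(X, gamma = 1, alpha = 0.75) {
  z <- X[1, ]      # current stochastic gradient iterate
  zbar <- z        # running average of the iterates
  for (n in seq_len(nrow(X) - 1)) {
    diff <- X[n + 1, ] - z
    nrm <- sqrt(sum(diff^2))
    # Move a step of length gamma / n^alpha towards the new observation.
    if (nrm > 0) z <- z + (gamma / n^alpha) * diff / nrm
    # Update the average of the iterates seen so far.
    zbar <- zbar + (z - zbar) / (n + 1)
  }
  zbar
}

set.seed(42)
X <- matrix(rnorm(4000, mean = 3), ncol = 2)  # 2000 points centred at (3, 3)
m_hat <- asg_median(X)
m_hat
```

Each observation is used once, so the cost is linear in the number of rows. A larger gamma gives bigger early moves, and an alpha closer to 1 makes the steps shrink faster.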

Value

cluster

A vector of integers (from 1:k) indicating the cluster to which each point is allocated.

centers

A matrix of cluster centres.

withinsrs

Vector of within-cluster sum of norms, one component per cluster.

size

The number of points in each cluster.
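The relationship between these components can be sketched in base R. The example below uses a kmeans fit as a stand-in that provides cluster and centers, then computes a per-cluster sum of Euclidean norms and the cluster sizes by hand. Treating this quantity as the definition of withinsrs is an assumption based on the description above; the variable names are chosen for this illustration.

```r
set.seed(7)
X <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))

# Stand-in fit: any object with $cluster and $centers would do.
fit <- kmeans(X, 2)

# Euclidean distance from each point to the centre of its own cluster.
dists <- sqrt(rowSums((X - fit$centers[fit$cluster, , drop = FALSE])^2))

# Within-cluster sum of norms, one value per cluster.
within_norms <- tapply(dists, fit$cluster, sum)

# Number of points in each cluster.
sizes <- table(fit$cluster)
```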

References

Cardot, H., Cenac, P. and Monnez, J-M. (2012). A fast and recursive algorithm for clustering large datasets with k-medians. Computational Statistics and Data Analysis, 56, 1434-1449.

Cardot, H., Cenac, P. and Zitt, P-A. (2013). Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm. Bernoulli, 19, 18-43.

MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, eds L. M. Le Cam and J. Neyman, 1, pp. 281-297. Berkeley, CA: University of California Press.

See Also

See also Gmedian and kmeans.

Examples

library(Gmedian)

# a 2-dimensional example
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(x) <- c("x", "y")

cl.kmeans <- kmeans(x, 2)
cl.kmedian <- kGmedian(x)

par(mfrow=c(1,2))
plot(x, col = cl.kmeans$cluster, main="kmeans")
points(cl.kmeans$centers, col = 1:2, pch = 8, cex = 2)

plot(x, col = cl.kmedian$cluster, main="kmedian")
points(cl.kmedian$centers, col = 1:2, pch = 8, cex = 2)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)

> library(Gmedian)
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/Gmedian/kGmedian.Rd_%03d_medium.png", width=480, height=480)
> ### Name: kGmedian
> ### Title: kGmedian
> ### Aliases: kGmedian
> ### Keywords: Gmedian
> 
> ### ** Examples
> 
> # a 2-dimensional example 
> x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
+            matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
> colnames(x) <- c("x", "y")
> 
> cl.kmeans <- kmeans(x, 2)
> cl.kmedian <- kGmedian(x)
> 
> par(mfrow=c(1,2))
> plot(x, col = cl.kmeans$cluster, main="kmeans")
> points(cl.kmeans$centers, col = 1:2, pch = 8, cex = 2)
> 
> plot(x, col = cl.kmedian$cluster, main="kmedian")
> points(cl.kmedian$centers, col = 1:2, pch = 8, cex = 2)
> 
> dev.off()
null device 
          1 
>