Last data update: 2014.03.03

R: Weiszfeld
WeiszfeldR Documentation

Weiszfeld

Description

Computes the Geometric median (also named spatial median or L1-median) with Weiszfeld's algorithm.

Usage

Weiszfeld(X, weights = NULL, epsilon=1e-08, nitermax = 100) 

Arguments

X

Data matrix, with n (rows) observations in dimension d (columns).

weights

When NULL, all observations have the same weight, say 1/n. Else, the user can provide a size n vector of weights (such as sampling weights). These weights are used in the estimating equation (see details).

epsilon

Numerical tolerance. By defaut 1e-08.

nitermax

Maxium number of iterations of the algorithm. By default set to 100.

Details

Weizfeld's algorithm (see Vardi and Zhang, 2000) is fast and accurate and can deal with large samples of high dimension data. However it is not as fast as the recursive approach proposed in Gmedian, which may be preferred for very large samples in high dimension. Weights can be given for statistical units, allowing to deal with data drawn from unequal probability sampling designs (see Lardin-Puech, Cardot and Goga, 2014).

Value

median

Vector of the geometric median.

iter

Number of iterations

References

Lardin-Puech, P., Cardot, H. and Goga, C. (2014). Analysing large datasets of functional data: a survey sampling point of view, J. de la SFdS, 155(4), 70-94.

Vardi, Y. and Zhang, C.-H. (2000). The multivariate L1-median and associated data depth. Proc. Natl. Acad. Sci. USA, 97(4):1423-1426.

See Also

See also Gmedian and WeiszfeldCov.

Examples

## Robustness of the geometric median of n=3 points in dimension d=2.
a1 <- c(-1,0); a2 <- c(1,0); a3 <-c(0,1)
data.mat <- rbind(a1,a2,a3)
plot(data.mat,xlab="x",ylab="y")
med.est <- Weiszfeld(data.mat)
points(med.est$median,pch=19)

 ### weighted units
poids = c(3/2,1,1)
plot(data.mat,xlab="x",ylab="y")
med.est <- Weiszfeld(data.mat,weights=poids)
plot(data.mat,xlab="x",ylab="y")
points(med.est$median,pch=19)

## outlier
data.mat[3,] <- c(0,10) 
plot(data.mat,xlab="x",ylab="y")
med.est <- Weiszfeld(data.mat)
points(med.est$median,pch=19)

## Computation speed
## Simulated data - Brownian paths
n <- 1e4
d <- 50
x <- matrix(rnorm(n*d,sd=1/sqrt(d)), n, d)
x <- t(apply(x,1,cumsum))

system.time(replicate(10, {
  median.est = Weiszfeld(x)}))
system.time(replicate(10, {
  median.est = Gmedian(x)})) 
system.time(replicate(10, {
  mean.est = apply(x,2,mean)}))

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(Gmedian)
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/Gmedian/Weiszfeld.Rd_%03d_medium.png", width=480, height=480)
> ### Name: Weiszfeld
> ### Title: Weiszfeld
> ### Aliases: Weiszfeld
> ### Keywords: Weiszfeld
> 
> ### ** Examples
> 
> ## Robustness of the geometric median of n=3 points in dimension d=2.
> a1 <- c(-1,0); a2 <- c(1,0); a3 <-c(0,1)
> data.mat <- rbind(a1,a2,a3)
> plot(data.mat,xlab="x",ylab="y")
> med.est <- Weiszfeld(data.mat)
> points(med.est$median,pch=19)
> 
>  ### weighted units
> poids = c(3/2,1,1)
> plot(data.mat,xlab="x",ylab="y")
> med.est <- Weiszfeld(data.mat,weights=poids)
> plot(data.mat,xlab="x",ylab="y")
> points(med.est$median,pch=19)
> 
> ## outlier
> data.mat[3,] <- c(0,10) 
> plot(data.mat,xlab="x",ylab="y")
> med.est <- Weiszfeld(data.mat)
> points(med.est$median,pch=19)
> 
> ## Computation speed
> ## Simulated data - Brownian paths
> n <- 1e4
> d <- 50
> x <- matrix(rnorm(n*d,sd=1/sqrt(d)), n, d)
> x <- t(apply(x,1,cumsum))
> 
> system.time(replicate(10, {
+   median.est = Weiszfeld(x)}))
   user  system elapsed 
  0.200   0.000   0.199 
> system.time(replicate(10, {
+   median.est = Gmedian(x)})) 
   user  system elapsed 
  0.052   0.000   0.050 
> system.time(replicate(10, {
+   mean.est = apply(x,2,mean)}))
   user  system elapsed 
  0.052   0.008   0.060 
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>