Gaussian Kernel Distance Computation


Given an N by D numeric data matrix, this function computes the N by N distance matrix with the pairwise distances between the rows of the data matrix as measured by a Gaussian kernel.


gausskernel(X = NULL, sigma = NULL)



X: N by D numeric data matrix.


sigma: Positive scalar that specifies the bandwidth of the Gaussian kernel (see details).


Given two D-dimensional vectors x_i and x_j, the Gaussian kernel is defined as

k(x_i,x_j)=exp(-|| x_i - x_j ||^2 / sigma^2)

where ||x_i - x_j|| is the Euclidean distance given by

||x_i - x_j||=((x_i1-x_j1)^2 + (x_i2-x_j2)^2 + ... + (x_iD-x_jD)^2)^.5

and sigma^2 is the bandwidth of the kernel.
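As a worked illustration of the definition, the following sketch evaluates the kernel for a single pair of vectors using base R only. The vectors, the bandwidth value, and the variable names (`x_i`, `x_j`, `sigma2`) are chosen for this example and do not come from the package.

```r
# Two illustrative 2-dimensional vectors (values chosen for this example only)
x_i <- c(0, 0)
x_j <- c(1, 1)

# The bandwidth, i.e. the sigma^2 term in the formula (assumed value)
sigma2 <- 2

# Squared Euclidean distance: (0 - 1)^2 + (0 - 1)^2 = 2
sq_dist <- sum((x_i - x_j)^2)

# Kernel value: exp(-2 / 2) = exp(-1)
k_ij <- exp(-sq_dist / sigma2)
print(k_ij)
```

With these inputs the squared distance equals the bandwidth, so the kernel value is exp(-1), roughly 0.37.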

Note that the Gaussian kernel is a measure of similarity between x_i and x_j. It evaluates to 1 if x_i and x_j are identical, and approaches 0 as x_i and x_j move further apart.
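The similarity behaviour can be checked numerically. This small sketch applies the kernel formula to a few illustrative Euclidean distances with an assumed bandwidth of 1; the names `d`, `bandwidth`, and `k` are hypothetical.

```r
# Illustrative Euclidean distances between pairs of points (assumed values)
d <- c(0, 1, 2, 5)

# Assumed bandwidth (the sigma^2 term in the formula)
bandwidth <- 1

# Kernel values for each distance
k <- exp(-d^2 / bandwidth)

# The first value is exactly 1 (zero distance); the rest shrink toward 0
print(k)
```

The values decrease strictly as the distance grows, and at a distance of 5 the kernel is already below 1e-10.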

The function relies on the dist function in the stats package to compute the pairwise Euclidean distances.
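A minimal sketch of how such a computation could be built on `stats::dist` is shown below. It is not the package's source code: the helper name `gauss_kernel_sketch` and its `bandwidth` argument are hypothetical, and the sketch assumes that the bandwidth plays the role of sigma^2 in the formula above.

```r
# Hypothetical helper, NOT the package's own implementation.
# Assumption: `bandwidth` corresponds to sigma^2 in the kernel formula.
gauss_kernel_sketch <- function(X, bandwidth) {
  stopifnot(is.matrix(X), is.numeric(X),
            is.numeric(bandwidth), length(bandwidth) == 1, bandwidth > 0)
  # stats::dist returns the pairwise Euclidean distances between rows of X
  D <- as.matrix(stats::dist(X))
  # Square the distances, scale by the bandwidth, and exponentiate
  exp(-D^2 / bandwidth)
}

# Rows 1 and 2 are identical; row 3 lies at Euclidean distance 5 from both
X <- rbind(c(0, 0), c(0, 0), c(3, 4))
K <- gauss_kernel_sketch(X, bandwidth = 25)
print(K)
```

In this example the entry for the two identical rows is 1, and the entries pairing either of them with the third row are exp(-25 / 25) = exp(-1). The result is symmetric with ones on the diagonal, as expected for a kernel matrix.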


An N by N numeric distance matrix that contains the pairwise distances between the rows in X.


Jens Hainmueller (Stanford) and Chad Hazlett (MIT)

See Also

dist function in the stats package.


X <- matrix(rnorm(6),ncol=2)
gausskernel(X=X,sigma=1)


> X <- matrix(rnorm(6),ncol=2)
> gausskernel(X=X,sigma=1)
           1         2          3
1 1.00000000 0.4612798 0.08013023
2 0.46127984 1.0000000 0.38238943
3 0.08013023 0.3823894 1.00000000
