
get.knn {FNN}    R Documentation

Search Nearest Neighbors

Description

Fast k-nearest neighbor search algorithms, including a kd-tree, a cover tree, and the algorithm implemented in the class package.

Usage

  get.knn(data, k=10, algorithm=c("kd_tree", "cover_tree", "CR", "brute"))
  get.knnx(data, query, k=10, algorithm=c("kd_tree", "cover_tree", "CR", "brute"))

Arguments

data

an input data matrix.

query

a query data matrix.

algorithm

the nearest neighbor search algorithm to use.

k

the maximum number of nearest neighbors to search for. The default value is 10.

Details

The cover tree is an O(n)-space data structure that allows queries to be answered in the same O(log(n)) time as a kd-tree, given a fixed intrinsic dimensionality. The templated code from http://hunch.net/~jl/projects/cover_tree/cover_tree.html is used.
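
For illustration, a minimal sketch of selecting the cover-tree backend (the simulated data and the choice of k here are arbitrary and not part of the original examples):

  ## cover-tree search on simulated data (illustrative sketch)
  set.seed(1)
  x <- matrix(rnorm(100 * 3), ncol = 3)             # 100 points in 3 dimensions
  nn <- get.knn(x, k = 3, algorithm = "cover_tree")
  str(nn)                                           # a list with nn.index and nn.dist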

The kd-tree algorithm is implemented in the Approximate Nearest Neighbor (ANN) C++ library (see http://www.cs.umd.edu/~mount/ANN/). Although ANN supports approximate search, this package searches for the exact nearest neighbors.
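
Since every backend performs an exact search, the kd-tree and brute-force results should agree; a sketch of such a check (indices could in principle differ under distance ties, so only the distances are compared):

  ## exactness check: kd-tree vs. brute force (distances should match)
  set.seed(2)
  x <- matrix(runif(50 * 2), ncol = 2)
  nn_kd    <- get.knnx(x, x, k = 4, algorithm = "kd_tree")
  nn_brute <- get.knnx(x, x, k = 4, algorithm = "brute")
  all.equal(nn_kd$nn.dist, nn_brute$nn.dist)        # expected: TRUE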

The CR algorithm is the VR algorithm (as implemented in the class package) using the distance 1-x'y, assuming x and y are unit vectors. The brute algorithm performs a naive linear search.
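
Because the CR distance presumes unit vectors, a reasonable pattern is to normalize each row before the search; a sketch assuming no row is all zeros:

  ## scale rows to unit length, then search with the "CR" backend
  set.seed(3)
  x  <- matrix(rnorm(20 * 5), ncol = 5)
  xu <- x / sqrt(rowSums(x^2))                      # unit-length rows (assumes nonzero rows)
  get.knn(xu, k = 3, algorithm = "CR")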

Value

a list containing:

nn.index

an n x k matrix of the nearest neighbor indices.

nn.dist

an n x k matrix of the nearest neighbor Euclidean distances.
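
As a sketch of working with the returned list (the data and k below are arbitrary), the first column of nn.dist gives each point's distance to its single nearest neighbor:

  ## accessing the returned components
  set.seed(4)
  x  <- matrix(rnorm(30 * 2), ncol = 2)
  nn <- get.knn(x, k = 5)
  nn$nn.index[1, ]        # indices of the 5 nearest neighbors of point 1
  nn$nn.dist[, 1]         # distance from each point to its nearest neighbor
  mean(nn$nn.dist[, 1])   # average nearest-neighbor distance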

Author(s)

Shengqiao Li. To report any bugs or suggestions please email: shli@stat.wvu.edu.

References

Bentley J.L. (1975), “Multidimensional binary search trees used for associative searching,” Communications of the ACM, 18(9), 509-517.

Arya S. and Mount D.M. (1993), “Approximate nearest neighbor searching,” Proc. 4th Ann. ACM-SIAM Symposium on Discrete Algorithms (SODA'93), 271-280.

Arya S., Mount D.M., Netanyahu N.S., Silverman R. and Wu A.Y. (1998), “An optimal algorithm for approximate nearest neighbor searching,” Journal of the ACM, 45, 891-923.

Beygelzimer A., Kakade S. and Langford J. (2006), “Cover trees for nearest neighbor,” Proc. 23rd International Conference on Machine Learning (ICML), ACM, 97-104.

See Also

nn2 in RANN, ann in yaImpute and knn in class.

Examples

  data <- query <- cbind(1:10, 1:10)

  get.knn(data, k=5)
  get.knnx(data, query, k=5)
  get.knnx(data, query, k=5, algo="kd_tree")
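
  ## additional sketch (not in the original examples): a query set distinct
  ## from the data, with query points lying between the data points
  query2 <- cbind(c(2.5, 7.5), c(2.5, 7.5))
  get.knnx(data, query2, k = 3, algorithm = "kd_tree")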

Results


> library(FNN)
>   data<- query<- cbind(1:10, 1:10)
> 
>   get.knn(data, k=5)
$nn.index
      [,1] [,2] [,3] [,4] [,5]
 [1,]    2    3    4    5    6
 [2,]    1    3    4    5    6
 [3,]    2    4    1    5    6
 [4,]    3    5    2    6    1
 [5,]    4    6    3    7    2
 [6,]    7    5    8    4    9
 [7,]    8    6    9    5   10
 [8,]    9    7   10    6    5
 [9,]   10    8    7    6    5
[10,]    9    8    7    6    5

$nn.dist
          [,1]     [,2]     [,3]     [,4]     [,5]
 [1,] 1.414214 2.828427 4.242641 5.656854 7.071068
 [2,] 1.414214 1.414214 2.828427 4.242641 5.656854
 [3,] 1.414214 1.414214 2.828427 2.828427 4.242641
 [4,] 1.414214 1.414214 2.828427 2.828427 4.242641
 [5,] 1.414214 1.414214 2.828427 2.828427 4.242641
 [6,] 1.414214 1.414214 2.828427 2.828427 4.242641
 [7,] 1.414214 1.414214 2.828427 2.828427 4.242641
 [8,] 1.414214 1.414214 2.828427 2.828427 4.242641
 [9,] 1.414214 1.414214 2.828427 4.242641 5.656854
[10,] 1.414214 2.828427 4.242641 5.656854 7.071068

>   get.knnx(data, query, k=5)
$nn.index
      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    2    3    4    5
 [2,]    2    1    3    4    5
 [3,]    3    2    4    1    5
 [4,]    4    3    5    2    6
 [5,]    5    4    6    3    7
 [6,]    6    7    5    8    4
 [7,]    7    8    6    9    5
 [8,]    8    9    7   10    6
 [9,]    9   10    8    7    6
[10,]   10    9    8    7    6

$nn.dist
      [,1]     [,2]     [,3]     [,4]     [,5]
 [1,]    0 1.414214 2.828427 4.242641 5.656854
 [2,]    0 1.414214 1.414214 2.828427 4.242641
 [3,]    0 1.414214 1.414214 2.828427 2.828427
 [4,]    0 1.414214 1.414214 2.828427 2.828427
 [5,]    0 1.414214 1.414214 2.828427 2.828427
 [6,]    0 1.414214 1.414214 2.828427 2.828427
 [7,]    0 1.414214 1.414214 2.828427 2.828427
 [8,]    0 1.414214 1.414214 2.828427 2.828427
 [9,]    0 1.414214 1.414214 2.828427 4.242641
[10,]    0 1.414214 2.828427 4.242641 5.656854

>   get.knnx(data, query, k=5, algo="kd_tree")
$nn.index
      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    2    3    4    5
 [2,]    2    1    3    4    5
 [3,]    3    2    4    1    5
 [4,]    4    3    5    2    6
 [5,]    5    4    6    3    7
 [6,]    6    7    5    8    4
 [7,]    7    8    6    9    5
 [8,]    8    9    7   10    6
 [9,]    9   10    8    7    6
[10,]   10    9    8    7    6

$nn.dist
      [,1]     [,2]     [,3]     [,4]     [,5]
 [1,]    0 1.414214 2.828427 4.242641 5.656854
 [2,]    0 1.414214 1.414214 2.828427 4.242641
 [3,]    0 1.414214 1.414214 2.828427 2.828427
 [4,]    0 1.414214 1.414214 2.828427 2.828427
 [5,]    0 1.414214 1.414214 2.828427 2.828427
 [6,]    0 1.414214 1.414214 2.828427 2.828427
 [7,]    0 1.414214 1.414214 2.828427 2.828427
 [8,]    0 1.414214 1.414214 2.828427 2.828427
 [9,]    0 1.414214 1.414214 2.828427 4.242641
[10,]    0 1.414214 2.828427 4.242641 5.656854
