Two sets of high dimensional points: Find the nearest neighbour in the other set

FLANN

If your data do really lie in a high dimensional space, then you could use FLANN.

It actually builds a number of rotated kd-trees and queries (a bit) every single tree, keeping the best results found. It also rotates the data-set to avoid nasty cases.

In the publications section you can read more on how it works.

In Getting FLANN section you can also read the manual.

However, since you wish to perform Nearest Neighbour Searching(NNS) in a high dimensional space, you need to accept the trade-off between time and accuracy (more time comes with more accuracy). That's why FLANN performs approximate NNS (check this answer for more).

LSH

As an alternative, I would suggest the LSH algorithm. Here is E²LSH, which actually implements LSH algorithm. The manual can be found here.

The idea behind the algorithm is that we want the points that lie near to each other to be placed (with a high probability) in the same bucket. However, LSH is devoted in solving the R nearest neighbour problem.

By R-near neighbour data structure, the author probably means that given a query point q, we can answer this question: "Which points of the dataset lie inside radius R from q?".

However, the manual explains how can LSH be used to perform NN searching.

Note that this type of questions are not for this site. I answered to you because you are a new user. Next time make sure you don't forget that. :)

Recommended topics

Hot tags