how to find k-th nearest neighbor of a point in a set of point
Asked Answered
I

5

7

I have a set of point (x,y) on a 2d plane. Given a point (x0,y0), and the number k, how to find the k-th nearest neighbor of (x0,x0) in the point set. In detail, the point set are represented by two array: x and y. The point (x0,y0) is given by the index i0. It means x0=x(i0) and y0=y(i0).

Is there any function or something in Matlab helps me this problem. If Matlab doesn't have such kind of function, can you suggest any other effective ways.

EDIT: I have to calculate this kind of distance for every point (x0,y0) in the set. The size of the set is about 1000. The value of k should be about sqrt(1500). The worst thing is that I do this many times. At each iteration, the set changes, and I calculate the distances again. So, the running time is a critical problem.

Isla answered 22/2, 2012 at 21:43 Comment(0)
C
6

if you will do this check for many points you might want to construct a inter-point distance table first

squareform(pdist([x y]))
Cytoplast answered 23/2, 2012 at 1:47 Comment(5)
Yes. I will do this for every points in the set. So, some kind of table of distance would help saving the running time. I will find out how squareform function works in my problem. Thank you very much.Isla
the function is pdist actually, squareform just makes the a square matrix out of pdist's vector outputCytoplast
But only if you have the statistics toolbox.Asante
it works for me. Maybe it's not the best approach, but it's simple to implement and use. To be honest, I don't have much time to finish the code. Thank you very much.Isla
This is O(N^2), where N is the number of points, with each query then being O(N lg N). A kd-tree, mentioned below, and also used under some conditions by knnsearch, is typically much faster, taking O(N lg^2 N) to construct, O(lg N) for 1 nearest neighbour, and, I'm guessing, around O(k(lg k)(lg N)) for k nearest neighbours when k is small and with a favourable dataset (thinking: via binary search). (Which means, for example, that if you have 10000 points, it should be somewhere around 100 to 1000 times faster.) Therefore knnsearch is much preferred.Kenay
G
4

If you have the statistics toolbox, you can use the function knnsearch.

Gittens answered 22/2, 2012 at 21:57 Comment(3)
knnsearch seems to be a solution but I'm not sure about how to apply knnsearch to my problem exactly. I'll find it. Anyway, can you give me more detail about the way using knnsearch. Thank you very much.Isla
Have you looked at the Matlab help (added link to my answer above)?Gittens
I read the online document about knnsearch but it's a little bit complicate to me and I really don't have much time to understand and use it. I tried the simpler approach. It takes more time of running but I will try that approach first. Thank you for your help.Isla
S
2

A brute force algorithm would be something like this:

array x[n] = ()
array y[n] = () 
array d[n] = ()

... populate x and y arrays with your n points ...

/* step over each point and calculate its distance from (x0, y0) */
for i = 1 to n
do
  d[i] = distance((x0, y0), (x[i], y[i])
end 

/* sort the distances in increasing order */
sort(d)

/* the k'th element of d, is the k'th nearest point to (x0, y0) */
return d[k]
Scabies answered 22/2, 2012 at 21:59 Comment(0)
M
2

The brute force approach looks something like this:

%Create some points
n = 10;
x = randn(n,1);
y = randn(n,1);

%Choose x0
ix0 = randi(n);

%Get distances
d = sqrt(...
    (x - x(ix0) ).^2 + ...
    (y - y(ix0) ).^2 );

%Sort distances
[sorted_Dstances, ixSort] = sort(d);

%Get kth point
k = 3;
kth = [x(ixSort(k+1)); y(ixSort(k+1))]; %+1 since the first element will always be the x0 element.
Marvellamarvellous answered 22/2, 2012 at 22:0 Comment(4)
Shouldn't you remove the element itself? Think about the case k=1Asante
Good point. I generally don't like changing the sizes of match vectors like this. Maybe add a '+1' to the final indexing. EDIT: although that leaves a gap if there is a point identical to the initial point. If you want to guarantee the answer is a different point, even when some point could be equal, then more work is needed.Marvellamarvellous
@Marvellamarvellous it doesn't make sense to also remove the equal points. if you have 5 equal points to the point you're searching, the 3rd furthest distance should be 0Scabies
thank you very much! but I have to calculate this kind of distance with every point in the set. So, it seems that this approach is a little simpler than I expected. I'm sorry I didn't clarify earlier.Isla
D
2

The free and opensource VLFeat toolbox contains a kd-tree implementation, amongst other useful things.

Decanter answered 22/2, 2012 at 22:34 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.