finding nearest neighbor for python numpy.ndarray in 3d-space
Asked Answered
I

1

8

I have a numpy.ndarray of 3d-points, i.e. the np.shape of it is (4350,3) and such a second numpy.ndarray of 3d-points of np.shape (10510,3). Now I am trying to find the right python-package to calculate the nearest neighbors in the second array of the points in the first array as quickly as possible.

I've found a quite similar question here: find the k nearest neighbours of a point in 3d space with python numpy but I don't understand how to use the solution there for my problem.

I'd very, very much appreciate your help on this!

Illegible answered 9/1, 2019 at 16:42 Comment(7)
Is this a one time operation or are you going to be finding multiple nearest neighbors in the same set? I ask that because if it is a single operation and you are literally only looking for 1 point on new sets each time then a simple 1 time loop looking for the smallest squared distance will suffice and be the fastest.Mccue
@Victor'Chris'Cabral No, what I've implemented so far is finding the nearest neighbor through calculating the euclidean distance for each point of the first set for each point of the second one (4350*10510 = 45718500 times) and returning the points for the closest distances. But all this I do in a while-loop that runs ~20times and for several "first" pointsets, so this naive classical approach takes several hours.Illegible
I definitely misspoke. I see your problem now.Mccue
I would suggest making a kdtree out of the smaller set and then doing the loop. docs.scipy.org/doc/scipy-0.14.0/reference/generated/…Mccue
@Victor'Chris'Cabral Thanks for the link and the suggestion! I had read through that documention before, but I don't know how to apply it to my problem (sorry if this should be clear from the documentation, I'm not very good at programming). Could you give me an example of how this could look like for my problem (I mean, for two 3D-pointsets in general)?Illegible
I've recently been working on a kdtree problem myself. Here's a great primer.Lymphocyte
@lc3fr0g Thanks for the links! My question is answered, but I'll definitely watch the videos.Illegible
M
9

Here is the KDTree way :

from scipy.spatial import KDTree

data= np.random.rand(10510,3)
sample= np.random.rand(4350,3)
kdtree=KDTree(data)

Then dist,points=kdtree.query(sample,2) will give you the 2 best neighbors for the 4350 candidates in about one second.

Magnanimity answered 9/1, 2019 at 17:41 Comment(5)
Thank you so much, I will try it this way and then let you know if it worked!Illegible
I'm sorry, I don't get what the returns of kdtree.query(sample,2) do mean. I'd have expected that in dist there are the Euclidean (since parameter 2 stands for the Euclidean distance) distances to the nearest neighbors and points are the points from the pointset "data", which are nearest to "sample". But dist looks like this: print(dist) [[0.02731417 0.03267154] [0.02175954 0.04624616] ... [0.03183459 0.03818426] [0.01794547 0.03079906]] and points like this: print(points) [[ 262 5545] [3667 5619] ... [ 696 9467] [9617 1987]] so I'm clearly wrong.Illegible
Oh, I think kdtree.query(sample,2) is interpreted as kdtree.query(sample, k=2) instead of kdtree.query(sample, p=2), but what I need is the latter, right?Illegible
no p=2 is the default. k is for k-nearest points. if you want just the nearest kdtree.query(sample,1) or kdtree.query(sample) is suffisant since k=1 is the default.Magnanimity
Yes, that's what I'd meant. Thanks for your answer, very nice of you to help!Illegible

© 2022 - 2024 — McMap. All rights reserved.