K-d tree: nearest neighbor search algorithm with tractable pseudo code
Asked Answered
R

1

2

The pseudo-code for nearest neighbor (NN) search in Wikipedia is not tractable enough for me. Few more posts available with implementations, but they seem to be language specific. So I'm finding it hard to understand how NN search works. This diagram is taken from https://www.cs.cmu.edu/~ckingsf/bioinfo-lectures/kdtrees.pdf. And I'm trying to understand it with a specific case say query point Q = (52,52). Assume two dim are (x,y) and root level splits by x-dim. enter image description here

Searching NN:

First, I go down from root to a leaf as if I'm trying to insert the Q; and doing so, the leaf is (55,1). Update a (global) var current_best from INFINITY to (55-52)2 + (1-52)2 = 2610.

Next, I go up to (70,70) and updates current_best from 2610 to 182+182=648. Since this gives better distance, we must probe it's sub-tree: is this correct understanding?

Further, we see node (60,80) doesn't give better result (i.e. no update for current_best).

In the process of going up further, we find root (51,75) gives even better result (current_best gets set to 530). So, applying my understanding, we must check it's other sub-tree.

The (25,40) doesn't yield any better result. What I understand is that we still need to verify the sub-tree of (25,40). However, in this case, since this node uses y-dim, and since Q.y > 40, we need to check only right sub-tree (rooted at (35,90)): is this correct understanding?

In short, what I see is if a node provides better result for current_distance, we must probe both children nodes; and if a node does NOT provide better result, all we can do it ignore one of the sub-tree but must probe the other sub-tree based on the condition (splitting hyperplane by specific dimension). Is this correct understanding?

Finally, I'll really appreciate anyone providing a tractable pseudo-code for NN search for Kd-tree

Roar answered 14/8, 2019 at 7:5 Comment(0)
P
2

Imagine the target point and a disk around it, with radius equal to the shortest distance found so far (initially, infinity).

You are at the root of a tree that splits the plane in two half planes. Make the radius equal to the minimum of the current radius and the distance from the target to the root. Then recurse to the half-plane(s) that intersect(s) the disk, as long as the root has sons.

Make sure to keep a trace of which root achieved the minimum.

Visit(root):
    d= distance(target, root)
    if d < r:
        r= d
        closest= root

    if root.left and root.x - target.x < r:
        Visit(root.left)
    if root.right and target.x - root.x < r:
        Visit(root.right)

Caution: the half-plane test is on x or y depending on the axis selection policy that you use.

Protestation answered 14/8, 2019 at 8:8 Comment(6)
@ Yves Daoust - By this algorithm, we'll probe sub-tree rooted at (10,30) - this seems not needed. This is my trace for target (52,52): visit(51,75) will set r=24. Test (root.x - target.x < r i.e. 51-52 < 24) succeeds, and visit(25,40) is invoked. visit(25,40) doesn't update r, but succeeds the test (root.y - target.y < r i.e. 40-52 < 24). So it invokes visit(10,30). Am I missing something!Roar
@KGhatak: how do you know it is not needed ?Protestation
@ Yves Daoust - My bad, it's indeed needed. If there would have been a point around the corner (51,40), then that would be missed out otherwise. Thanks for pointing it out.Roar
@KGhatak: in fact you can associate to every root a rectangle that it "supervises". After a split, you have two new roots which supervise each one subrectangle. And you need to visit every rectangle that intersects the current disk. As an optimization, you can use the nearest-first heuristic.Protestation
@Roar This is OK, but I like to use a priority queue to follow the most promising path first, like: #41306622Foe
@ Matt Timmermans - I'm not able to figure out how a priority queue is used to our advantage. Do we ignore inserting a children when it's distance is greater than the current max. It might be useful to add that clarity (or am I missing something)Roar

© 2022 - 2024 — McMap. All rights reserved.