Is there any way to add points to KD tree implementation in Scipy
Asked Answered
C

2

24

I have a set of points for which I want to construct KD Tree. After some time I want to add few more points to this KDTree periodically. Is there any way to do this in scipy implementation

Cyrillus answered 23/7, 2013 at 18:13 Comment(2)
I've posted this as a feature request on the Scipy Github: github.com/scipy/scipy/issues/9029Squib
If you're interested: I've added a more complete exampleAngioma
P
23

The problem with k-d-trees is that they are not designed for updates.

While you can somewhat easily insert objects (if you use a pointer based representation, which needs substantially more memory than an array-based tree), and do deletions with tricks such as tombstone messages, doing such changes will degrate the performance of the tree.

I am not aware of a good method for incrementally rebalancing a k-d-tree. For 1-dimensional trees you have red-black-trees, B-trees, B*-trees, B+-trees and such things. These don't obviously work with k-d-trees because of the rotating axes and thus different sorting. So in the end, with a k-d-tree, it may be best to just collect changes, and from time to time do a full tree rebuild. Then at least this part of the tree will be quite good.

However, there exists a similar structure (that in my experiments often outperforms the k-d-tree!): the R*-tree. Instead of performing binary splits, it uses rectangular bounding boxes to collect objects, and a lot of thought was put into making the tree a dynamic data structure. This is also where the R*-tree performs much better than the R-tree: it has a much more clever split for kNN search, and it performs incremental rebalancing to improve its structure.

Phrasing answered 23/7, 2013 at 22:42 Comment(3)
Any implementation of R*-tree in Python/Scipy?Adora
Is a quad-tree better in updating?Subdivision
@Adora Yes, see github.com/Toblerity/rtreePrue
A
1

With https://github.com/stefankoegl/kdtree, it's possible to start with an empty KD-Tree and add nodes one after the other.

pip install kdtree

import kdtree
# Create an empty tree by specifying the number of
# dimensions its points will have
tree = kdtree.create(dimensions=3)
tree.add((1, 2, 3))

I have no idea if it's slower than the one provided by SciPy, but at least it worked fine for my needs (adding points to the Tree, unless there's a point which is too close already).

Here's a more complete example:

import random
import matplotlib.pyplot as plt
import kdtree

SIZE = 100
N = 10_000
MIN_DISTANCE = 5
MIN_DISTANCE_2 = MIN_DISTANCE * MIN_DISTANCE

x, y = random.uniform(0, SIZE), random.uniform(0, SIZE)

tree = kdtree.create(dimensions=2)
tree.add((x, y))

xs, ys = [], []
not_xs, not_ys = [], []

for _ in range(N):
    x, y = random.uniform(0, SIZE), random.uniform(0, SIZE)
    _neighbour, distance_2 = tree.search_nn((x, y))
    if distance_2 > MIN_DISTANCE_2:
        xs.append(x)
        ys.append(y)
        tree.add((x, y))
    else:
        not_xs.append(x)
        not_ys.append(y)

plt.axes().set_aspect(1)
plt.scatter(not_xs, not_ys, color='red', s=0.1)
plt.scatter(xs, ys)
plt.show()

It tries to place 10000 points randomly in a 100 * 100 square, but not closer than 5. Placed points are blue, rejected points are red.

many points in a square

Angioma answered 13/2 at 14:12 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.