I'm trying to figure out which structure would be better for doing several radius search of points, a kd-tree or an octree? It was already mentioned in this question but there was no answer. It seems to me that since octrees have fixed sizes for the leafs it can already be computed the branches that I need to visit while for kd-tree you have to iteratively visit branches until radius is covered.
I've implemented both personally and precisely for this purpose would vote for the octree. I found it much easier to get a more efficient results with the octree. I say easier because I think with these kinds of subtle distinctions, it's really more about the implementer more than the data structure. But I think for most people, you'd have an easier time optimizing the octree.
One of the reasons is because K-D trees are inherently deeper being binary trees splitting on one dimension at a time. That deeper nature can be helpful if you're looking for a precise matching element at the leaf like for a ray/triangle intersection with a single, unambiguous path down the tree. It's useful when a deep tree, carefully split, matches up with the idea of search quality.
It's not so helpful to have a deep, carefully-split tree if you're searching for the nearest point within a maximum radius where you end up spending most of the time just going up and down the tree, from leaf to parent to sibling to grandparent to parent sibling and so on. There it helps to be a bit flatter if you can access everything in a cache-friendly way, and you can easily make an octree cache-friendly like storing all 8 children contiguously, at which point you can just do this:
struct OctreeNode
{
// Index of first child node. To get to the 4th node,
// we just access nodes[first_child+3], e.g.
int first_child;
...
};
So anyway, I vote for the octree in this case if these are the two choices. Also for this type of proximity search, you don't necessarily want the octree to be too deep. Even if we have to look at more points than optimal with a shallower tree, that can be better than having to go up and down the tree a lot. It does help if the points you store in a leaf are contiguous. You can potentially achieve that with a post-process after you finish building your tree.
Note with both solutions that you do have to look at sibling nodes. The nearest point to a point is not necessarily the one residing in the same leaf node. There are also some cases where just a 3-dimensional grid can actually be quite optimal for this purpose, depending on the nature of your data, since with the 3D grid you never even have to bother to go from child to parent to sibling. 3D grids can seem explosive in memory use but they don't have to be necessarily if you reduce the memory overhead of a grid cell down to just a 32-bit index. In such a case, a 100x100x100 grid takes less than 4 megabytes.
I was interested in the same question before, and have done the following experiment for comparison.
- We take two point clouds (.pcd file, from a LiDAR sensor) as input. Target cloud size = 65536, source cloud size = 3000.
- For each point in the source cloud, find its neighbor point in the target cloud.
- The scale is roughly 10-50 meters, in a structured environment.
- Single threading on a 13-gen Intel i7 CPU laptop.
- The implementation of both data structures is taken from PCL and benchmark is conducted in a C++ program.
Operation K-D Tree Octree
Build Tree 7 ms 12 ms
K-NN Search (k=1) 124 ms 390 ms
K-NN Search (k=10) 124 ms 395 ms
K-NN Search (k=100) 134 ms 573 ms
Radius Search (r=0.01 m) 1789 ms 126 ms
Radius Search (r=0.1 m) 1799 ms 129 ms
Radius Search (r=1 m) 2011 ms 415 ms
Beside, I have done some ablation studies on this result.
- Time consumption of tree building is linear in target cloud size.
- Time consumption of each search method is also almost linear in source cloud size. (i.e. size = 30000 can roughly take 10x more time than size = 3000).
- The location and shape of source query cloud does not affect much on running time.
- PCL Point Type (PointXYZ vs. PointXYZRGBNormal) does not affect much on running time.
For 3D and fixed query radius, octrees are a good choice. If you need to work on disk, other data structures may be better, but the k-d-tree doesn't shine here either.
Why don't you try both and see which works better for your data?
In my project I am using an Octree for the range search and it works efficiently and is easy to implement. Never compared it to KD-Tree though. To my knowledge the worst case time complexity in kd trees for this operation is O(n^(2/3)) for three dimensional data, while Octree can only garantee O(n). So if you care about worst time complexity, choose KD Tree. (I dont care about worst time complexity, if I know in my data set this will never happen.)
© 2022 - 2024 — McMap. All rights reserved.