Efficient algorithm for finding spheres farthest apart in large collection
Asked Answered
I

4

18

I've got a collection of 10000 - 100000 spheres, and I need to find the ones farthest apart.

One simple way to do this is to simply compare all the spheres to each other and store the biggest distance, but this feels like a real resource hog of an algorithm.

The Spheres are stored in the following way:

Sphere (float x, float y, float z, float radius);

The method Sphere::distanceTo(Sphere &s) returns the distance between the two center points of the spheres.

Example:

Sphere *spheres;
float biggestDistance;

for (int i = 0; i < nOfSpheres; i++) {
    for (int j = 0; j < nOfSpheres; j++) {
        if (spheres[i].distanceTo(spheres[j]) > biggestDistance) {
            biggestDistance = spheres[i].distanceTo(spheres[j]) > biggestDistance;
        }
    }
}

What I'm looking for is an algorithm that somehow loops through all the possible combinations in a smarter way, if there is any.

The project is written in C++ (which it has to be), so any solutions that only work in languages other than C/C++ are of less interest.

Ilyse answered 16/2, 2010 at 21:28 Comment(9)
Just to clarify, you want something that is better than O(n^2) (which is bad). Hmmm.....Ehlers
Does distanceTo() obey the triangle inequality? is distanceTo() center to center or surface to surface? Do the spheres have coordinates in a vector space? Is it in a euclidean space or are there black holes around curving the metric? Where did you get all those spheres, anyway? Is there anything interesting inside the spheres that could be sold on eBay, because you seem to have a lot of them?Unit
@Ehlers If they were on the real line it would be n log(n) to sort but only O(n) to find the min and max coordinate. That assumes that radius or surface is not an issueUnit
A BFS search provides O(b^d) complexity or O(|E| + |V|)Causality
@Mark: Yes, it's homework. How did you guess that? (Besides the fact that CS teachers loooove big collections of data that gets out of hand.) @Craig: Yes, I know there are a few improvements that can be made - but I've got a feeling that I'm missing something. =P @Paul: Lol, I think the inside of the spheres are full of void. If you want to buy some of it I'm sure we can arrange something. ;-)Ilyse
@illuzive: It looks a lot like homework. It's OK to ask for help with programming related homework here, but a StackOverflow best-practice is to tag homework questions with the homework tag.Gapeworm
@Mark: Ahh,I see. I'm new here so I didn't know that. I'll make sure to tag it correctly in the future. Thanks for the info. =)Ilyse
This would be a WAY better homework question if Sphere::distanceTo(Sphere &s) returned the min distance between points on the two spheres rather than the distance between the centers. Why have spheres at all?Heronry
This is only a small part of a bigger assignment, where the spheres are used in different ways.Ilyse
M
33

The largest distance between any two points in a set S of points is called the diameter. Finding the diameter of a set of points is a well-known problem in computational geometry. In general, there are two steps here:

  1. Find the three-dimensional convex hull composed of the center of each sphere -- say, using the quickhull implementation in CGAL.

  2. Find the points on the hull that are farthest apart. (Two points on the interior of the hull cannot be part of the diameter, or otherwise they would be on the hull, which is a contradiction.)

With quickhull, you can do the first step in O(n log n) in the average case and O(n2) worst-case running time. (In practice, quickhull significantly outperforms all other known algorithms.) It is possible to guarantee a better worst-case bound if you can guarantee certain properties about the ordering of the spheres, but that is a different topic.

The second step can be done in Ω(h log h), where h is the number of points on the hull. In the worst case, h = n (every point is on the hull), but that's pretty unlikely if you have thousands of random spheres. In general, h will be much smaller than n. Here's an overview of this method.

Masseuse answered 16/2, 2010 at 21:45 Comment(4)
+1: I have deleted my post after seeing this one as this one is a superset and better answer.Peon
I would have been more descriptive about the steps, but this is homework. However, CGAL provides source code, so looking at that is a pretty good start to understanding what's happening here.Masseuse
After some brief Googling, I thought I would add this link as well: personal.kent.edu/~rmuhamma/Compgeometry/MyCG/ConvexHull/…Lotetgaronne
@JohnFeminella - that overview link appears to be dead. I'm working on a similar problem which led me here and I clicked that link. :-(Vesture
L
3

Could you perhaps store these spheres in a BSP Tree? If that's acceptable, then you could start by looking for nodes of the tree containing spheres which are furthest apart. Then you can continue down the tree until you get to individual spheres.

Lotetgaronne answered 16/2, 2010 at 21:34 Comment(1)
I've thought about a tree structure too, but I haven't looked into the BSP Tree yet, thanks for your reply. =)Ilyse
C
2

Your problem looks like something that could be solved using graphs. Since the distance from Sphere A to Sphere B is the same as the distance from Sphere B to Sphere A, you can minimize the number of comparisons you have to make.

I think what you're looking at here is known as an Adjacency List. You can either build one up, and then traverse that to find the longest distance.

Another approach you can use will still give you an O(n^2) but you can minimize the number of comparisons you have to make. You can store the result of your calculation into a hash table where the key is the name of the edge (so AB would hold the length from A to B). Before you perform your distance calculation, check to see if AB or BA exists in the hash table.

EDIT

Using the adjacency-list method (which is basically a Breadth-First Search) you get O(b^d) or worst-case O(|E| + |V|) complexity.

Causality answered 16/2, 2010 at 21:38 Comment(2)
Really, I had no idea that a BFS was such an inefficient algorithm. My eyes have been opened.Causality
Building the adjacency list takes as much time as the brute force solution.Laundes
E
2

Paul got my brain thinking and you can optimize a bit by changing

for (int j=0; j < nOfSpheres; j++) 

to

for (int j=i+1;  j < nOfSpheres; j++) 

You don't need to compare sphere A to B AND B to A. This will make the search O(n log n).

--- Addition -------

Another thing that makes this calculation expensive is the DistanceTo calulations.

distance = sqrt((x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2)

That's alot of math. You can trim that down by checking to see if

((x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2 > maxdist^2

Removes the sqrt until the end.

Ehlers answered 16/2, 2010 at 21:42 Comment(4)
No, this is still O(n^2), but it cuts the number of checks in half (was about to post this too).Dar
Yeah, I know. sqrt is expensive.Ilyse
I wondered if taking principal components helps if you have coordinates in some space of dim N>1.Unit
You shouldn't need to check against maxdist^2 since the squared distance will already be what's stored in maxdist.Lotetgaronne

© 2022 - 2024 — McMap. All rights reserved.