Removing duplicates of 3D points in a vector in C++

Asked 27/12, 2015 at 14:5 Answered 27/12, 2015 at 14:34

Solved c++point-cloud-library point-clouds

I am dealing with a point cloud, i.e. a vector of points, as a result of computation, which contains duplicate points (up to 10% of the size of the cloud).

My implementation was to sort those points according to x,y, and z value and then use the std::unique function. The resulting cloud however still contains duplicates, even though the sorting itself seems to be working.

Here's the crucial code

bool comparePoint(pcl::PointXYZINormal p1, pcl::PointXYZINormal p2){
if (p1.x != p2.x)
    return p1.x > p2.x;
else if (p1.y != p2.y)
    return  p1.y > p2.y;
else
    return p1.z > p2.z;
}

bool equalPoint(pcl::PointXYZINormal p1, pcl::PointXYZINormal p2){
    if (p1.x == p2.x && p1.y == p2.y && p1.z == p2.z)
        return true;
    return false;
}
void KDsearch::cullDuplePoints(){
    std::sort(points.begin(), points.end(), comparePoint);
    std::unique(points.begin(), points.end(), equalPoint);
}

And here is a partial extraction of the output pointcloud (x,y, and z coordinates):

1.96828 -535.09515 2794.8391
1.96627 -636.95264 2914.0366
1.96627 -636.95264 2914.0366
1.9651 108.77433 2350.9841
1.9651 108.77433 2350.9841
1.9642299 -206.19427 5618.4629
1.9642299 -206.19427 5618.4629
1.96386 -1880.3784 1346.0654

So is unique not working properly or is there a mistake in my equal condition?

The points itself also contain normal coordinates, but they are not important for culling, so I did not use them in the code.

Menstrual answered 27/12, 2015 at 14:5 Comment(5)

Can you show us the definition of PointXYZNormal please? – Whitsuntide 27/12, 2015 at 14:17

Note that the comparison function for std::sort should check that a is less than b. If you're using C++11, you can also use std::tuple to perform the lexicographical comparison in a tidy way: stackoverflow.com/questions/6218812 – Auburta 27/12, 2015 at 15:36

The comparison function doesn't need to check for less than, provided it is a strict weak ordering (which this is), it's fine. – Nievesniflheim 27/12, 2015 at 15:58

More than you probably wanted to know about comparing floating point values, but you should have a peek nonetheless as otherwise comparing floating point values obtained through computation is an exercise in frustration (it always gets subtly wrong in some situations). – Disaccredit 27/12, 2015 at 17:14

How did you compare the points so that equal points are next to each other? Rather what is your logic behind it? Never mind its pretty simple im just tired.=( – Horologist 17/9, 2017 at 4:40

std::unique doesn't remove anything, it only moves elements around and returns an iterator "past the end" of the unique interval in the modified collection.
(The actual contents of the collection past the returned iterator is unspecified.)

You need to remove the duplicates explicitly:

std::sort(points.begin(), points.end(), comparePoint);
auto unique_end = std::unique(points.begin(), points.end(), equalPoint);
points.erase(unique_end, points.end());

You also need to be careful about floating point comparisons.

Luteous answered 27/12, 2015 at 14:34 Comment(2)

Great solution. Also, std::unique only removes the last occurrence of a value in comparison to the first. This is why the sort function is used, so that unique can see the consecutive values. – Gailey 27/12, 2015 at 14:47

How do you compare points so that equal points are next to each other? A set might be a better alternative if you cant figure this out. – Horologist 17/9, 2017 at 4:37

Your problem is that comparing floating point numbers for equality is always a difficult exercise. You will probably find that your points are (for example) actually:

1.965100000001 108.77433 2350.9841
1.965099999999 108.77433 2350.9841

... and those aren't equal.

If you want to treat points as "equal" if they are within 0.00001 of each other, you get the problem that your "equality" condition is not transitive. (0.0000,0,0) is "close" to (0.000009999,0,0) and to (-0.00009999,0,0), but those latter two points are "far" from each other. This is a hard problem to solve in general. Good luck!

If you know something about the values of the coordinates (for example, that they are in millimetres, and values are accurate to 100 nanometres), you could round to the nearest 100 nm, and store as long long. So:

struct IntPoint {
   const static double ScaleFactor = 10000;
   long long x,y,z;
   IntPoint(const pcl::PointXYZINormal &p)
      : x(llround(p.x*ScaleFactor ))
      , y(llround(p.y*ScaleFactor ))
      , z(llround(p.z*ScaleFactor ))
    {}
};

Convert your point cloud to IntPoint, and then your sort + unique (+erase), should work.

Nievesniflheim answered 27/12, 2015 at 14:13 Comment(1)

Thanks to Peter Mortensen for "eg" -> "for example" (a good habit for English writers, and easier for non-native English readers too). – Nievesniflheim 27/12, 2015 at 18:3

To erase the duplicates: you could do:

sort(point.begin(), point.end());
point.erase(unique(point.begin(), point.end()), point.end());

or just create a set, which by definition contains only unique elements, out of the vector's elements:

std::set<type> all_unique(point.begin(), point.end());

To compare floating point numbers: considering the mathematical properties¹ of the floating point numbers, together with the problems² inherited by their machine binary representation you end up with only one solution when comparing floating point numbers, namely, comparing them within a value of accuracy, epsilon.

Thus, if you want to compare and order float x1 and float x2, which are your coordinates, you do it by:

x1 - x2 < epsilon

where epsilon is the accuracy you are looking for. In your case, just for illustration, the function equalPoint() could be modified to:

bool equalPoint(pcl::PointXYZINormal p1, pcl::PointXYZINormal p2){
    // equal up to the third digit after the decimal point
    float eps = 0.001;

    if ((p1.x -p2.x) < eps && (p1.y - p2.y) < eps && (p1.z - p2.z) < eps)
        return true;

    return false;
}

^{1. They can differ by a very small quantity, unlike the integers, that are rounded and which could be easily compared.}

^{2. The computers doesn't map perfectly the real floating point numbers, the result of this fact is expressed in truncation, rounding.}

Oversleep answered 27/12, 2015 at 14:30 Comment(2)

Err, uncountability refers to the cardinality of the set of reals. This is simply an issue of rounding IEEE754 floats. – Pentode 27/12, 2015 at 14:37

@Pentode You are perfectly right and that is why I've added the first clarification on the bottom, to show that they can differ by very small quantities. – Oversleep 27/12, 2015 at 14:42

Recommended topics

Hot tags