How to compare two shapes?
Asked Answered
P

3

12

Is there a way to compare two geometric shapes (or any two more generic data structures), without using the brute force when a tolerance is involved?

The brute force (that is comparing each value of each object against each value of the other object) works but it's slow, and I can't use it.

I tried sorting the data and comparing two sorted collections. It's fast, but it only works with zero tolerance. As soon as I add the tolerance I get lost. The problem is that two values can be identical when I compare and different when I sort.

Here are some details of my problem.

In my Excel VBA add-in I have a collection of Shape objects made by a collection of Line objects made by two Point objects each. The add-in scans a CAD drawing via COM and creates the collection of Shape objects.

An simplified version could generate this:

            Shape 1     Shape 2
Point 1    0.0  5.0    0.0  4.9
Point 2    4.9  0.0    5.1  0.0
Point 3    5.0  5.0    5.0  5.0

I need to find which shapes are identical to which shapes, where identical means has the same shape, size and orientation, but not the same position (so far it's trivial) plus or minus a tolerance (not so trivial now!)

The Point.IsIdenticalTo(OtherPoint) is defined as:

Function IsIdenticalTo(OtherPoint As Point) As Boolean
  IsIdenticalTo = Abs(X - OtherPoint.X) < Tolerance And Abs(Y - OtherPoint.Y) < Tolerance
End Function

The brute force implementation of the Shape.IsIdenticalTo(OtherShape) works but it's too slow: if each Line(I) has an identical OtherShape.Line(J) and viceversa, then the two shapes are identical. Sometimes I have hundreds of shapes with hundreds of lines each, so the brute force solution doesn't work for me.

I tried two approaches involving sorted collections. Both are fast because comparing two sorted collections is faster than the brute force way, but both fail in some conditions:

  1. A SortedValues collection contains all the X and Y values of all the lines. The values are sorted, so the info about whether a value is an X or a Y is lost. I have used this approach for months without problems, but it fails for example when the only difference between two shapes is between the points (10, 20) and (20, 10). I added the line angle to the list of values, things have improved, but there are still cases where this approach fails, because some info is lost when the values are sorted together. In the example above this approach would work with the following collections:

    Shape 1     Shape 2
      0.0         0.0
      0.0         0.0
      4.9         4.9
      5.0         5.0
      5.0         5.0
      5.0         5.1
    
  2. A SortedLines collection contains all the lines sorted counter-clockwise and starting from the point closest to the origin. This approach doesn't lose any info, but it fails in the example above because the sorting algorithm doesn't agree with the equality comparison. If the tolerance is 0.5 they should be identical, but the sorting algorithm produces the collections shown below. Things get more difficult because my shapes contain sub-shapes, so there are many starting points on each shape.

            Shape 1     Shape 2
    Point 1    4.9  0.0    0.0  4.9
    Point 2    5.0  5.0    5.1  0.0
    Point 3    0.0  5.0    5.0  5.0
    

EDIT:

Shapes are imported from an external graphical application via COM. A shape can be as simple as rectangle or as complex as any fancy outline with 10 circles inside, 20 internal shapes and 30 lines. They represent panels with holes and simple decorations, and sometimes they have a saw-tooth shape, which makes dozen of edges.

Pigskin answered 3/3, 2014 at 23:38 Comment(7)
It's not clear to me whether the "shapes" you're comparing are actual shapes on a worksheet, or "shape objects" instantiated from some class you have in your VBA project. What exactly are the objects you're comparing? You could probably speed up your comparison a little by only comparing y values if the x values are comparable.Insufficient
@TimWilliams Shapes are geometrical objects imported from a CAD drawing and instantiated from my own VBA classes. A shape can be a simple rectangle, or a 20 point star with 30 holes and 40 lines crossing it. If you know AutoCad think of any line inside a block (I'm not using AutoCad, but it's similar). They are my classes, so great flexibility, but I can't use external libraries... unless I can?Pigskin
Have you tried a nearest neighbor based algorithm?Brentonbrentt
Well, I don't know autoCAD, but it seems like there might be some avenues to improve performance, such as sorting: even sorting on a single property should allow you to quickly exclude a good number of possible comparison candidates. Also nesting your property comparison checks so you can "fail early" rather than check every pair of values in a single "and" statement.Insufficient
@NiklasB. I don't know how to use a nearest neighbor with a multi-dimensional problem. I tried to reduce it to a one-dimensional problem, but I lose some info and that's my problemPigskin
@stenci: If you can use manhattan distance (which would probably be a good idea, you can find for every point of shape 1 the nearest point in shape 2. A problem might be that you assign points multiple times, but I guess that's not a problem because the "shape" is the same (even if there is not a 1:1 assignment of vertices). If you want a 1:1 assignment, you need to preprocess shape 1 in the case that two vertices delta-environments intersectBrentonbrentt
Since they have the same orientation, I'd start by computing bounding boxes, find the center, and translate the center to the origin. You can easily compare bounding boxes with tolerance, which should discard most of the possible matches. Then you can use a more detailed approach to compare what's left. That doesn't reduce the theoretical worst case (all shapes are equal), but it probably makes a huge difference for real world data.Brianabriand
S
12
  1. handle shape as polygon

    convert your points (each line) to set of lines (length,angle) like on this image:

    polygon representation

    this ensures invariance on rotation/translation. If you see more lines with angle=PI join them together to avoid miss comparisons of the same shapes with different sampling also try to match the same CW/CCW polygon winding rule for both shapes.

  2. find start point

    Can be biggest or smallest angle, length ... or specific order of angles+lengths. So reorder lines of one polygon (cyclic shift) so your shapes are compared from the 'same point' if they can.

  3. comparison - for exact match

    • number of lines have to be the same
    • perimeters must be the same +/- some accuracy

    so for example:

    fabs (sum of all lengths of poly1 - sum of all lengths of poly2) <= 1e-3
    

    if not shapes are different. Then compare all lengths and angles. If any one value differs more then accuracy value then shapes are different.

  4. comparison - size does not matter

    compute perimeter of both polygons l1,l2 and resize all lengths of compared poly2 to match perimeter of poly1 so all lengths of poly2 are multiplied by value = l1/l2;. After this use comparison from bullet #3

  5. comparison - shape deviations can still do positive match (size must be the same)

    try to set the number of lines to the same value (join all lines with angle close to PI). Then perimeters should "match" ... fabs(l1-l2)<=1e-3*l1. You can use bullet #4 comparison

  6. comparison - size and shape deviations can still match

    just resize poly2 to match perimeter of poly1 as in bullet #4 and then use bullet #5

If you can not find the start point in booth polygons (bullet #2)

Then you have to check for all start point shifts so if your polygons have booth 5 lines:

    poly1: l11,l12,l13,l14,l15
    poly2: l21,l22,l23,l24,l25

Then you have to compare all 5 combinations (unless you found match sooner):

    cmp (l11,l12,l13,l14,l15),(l21,l22,l23,l24,l25)
    cmp (l11,l12,l13,l14,l15),(l22,l23,l24,l25,l21)
    cmp (l11,l12,l13,l14,l15),(l22,l23,l24,l25,l21)
    cmp (l11,l12,l13,l14,l15),(l23,l24,l25,l21,l22)
    cmp (l11,l12,l13,l14,l15),(l24,l25,l21,l22,l23)
    cmp (l11,l12,l13,l14,l15),(l25,l21,l22,l23,l24)

[Notes]

  1. There are also faster ways to compare but they can miss in some cases

    • you can compare histograms of lines, angles
    • you can use neural network (I do not like them but they are ideal for classifications like this)
  2. if your shapes have to be oriented in the same ways (no rotation invariance)

    then instead of vertex angle use the line direction angle

  3. if you can not ensure the same winding rule for both compared polygons

    then you have to check them booth:

    cmp (l11,l12,l13,l14,l15),(l21,l22,l23,l24,l25)
    cmp (l11,l12,l13,l14,l15),(l25,l24,l23,l22,l21)
    

I know it is a bit vague answer but still hope it helps at least a little ...

Superphysical answered 4/3, 2014 at 7:54 Comment(6)
My problem is bullet 2: finding the start point. Look at the triangle in my question: the two shapes are "identical", but after the sort they start from two different points. Comparing all the combinations could be very slow with hundreds of segments. I could try to compare only the combinations that start inside a certain range, but things would get complicated when the shape has sub-shapes (holes which can be closed and decorations which can be open), and each could have mismatching starting point/orientation. The problem derives from the fact that I use the tolerance to compare, not to sortPigskin
My current implementation (comparing three sorted vectors, one with all the X, one with all the Y and one with all the Angles) solves the problem of the starting point, but fails with some weird symmetric shapes, where points (1,2) and (2,1) are both used in both the shapes, but in different positions.Pigskin
do not sort your vertexes leave them in their original order !!! because order of vertexes also defines shape (after sorting points you loose this information). All combinations for comparison of two 100-vertex polygons is only 100 not 100x100 so that is not that slow... also the most slow part is length and angle computation which can be done just once per all combinations so even if you do not know the start point it should not be a big deal. single comparison is around O(N)...Superphysical
all reorder of vertexes means just cyclic shifting the order of all vertexes at once so next and previous vertexes are still the same. ... also CPU time spend for your tree creation and sorting is probably bigger then the comparison itselfSuperphysical
That's right, it's O(N), not O(N^2), so it should work. I will need to add some acrobatic trick to manage sub-shapes, but I can do it all in the good old VBA. Thanks!Pigskin
O(N) is maximal value (for the same object) and known start point ... usually the mismatch stops much sooner ...Superphysical
T
2

I am not sure how do you want to solve this problem. you want to go deep or you just want a solution. I can suggest you use an OpenCV function called "matchShapes". This function is based on Hu moments and has a good performance for rigid shapes. After you extract the target and the main contours then use the below code to compare them.

dif = cv.matchShapes(Contour1, Contour2, 1, 0, 0)

Smaller "dif" value means more similarity between contours.

Toxinantitoxin answered 24/5, 2020 at 9:57 Comment(7)
What do you mean by extract? Will this handle shapes with hundreds of holes and open contours? Can OpenCV be fed with shapes generated as cad drawings?Pigskin
Hi, No you have to prepare your final contours (preprocessing, noise reducing, hole removal, ..., you have to do them first) and your best contour (Target's contour) to check your inputs with it before using this function. You may have hundreds of contours on one side as inputs and your target's contour on the other side. Each time you can only compare two individual contours. I hope this explanation was more clear.Toxinantitoxin
I need to find out if two shapes with 50 holes with diameter 20mm differ in the diameter of one of the holes by +-0.01 mm, or if one of the corners is 83.5 in one shape and 83.49 in the other. The tolerance is +-0.001 mm and 0.001 degrees, if I understand your system is not precise enough.Pigskin
Sorry for my late reply. The method is not useful for your application. You do not need a comparison method. You have to do the basic contour measurements.Toxinantitoxin
Sorry for the late reply. The method isn't useful for your application. You don't need a comparison method. You have to do the basic contour measurements. This level of precision you need to have reliable input images. Your picture probably comes from a fixed camera and a fixed object. Therefore localization is not a problem for you. You need to do grab each hole and do the needed calculation. Challenge will be illumination and camera quality to get the perfect input to have an accurate result. I can not cover all aspects and you must do many tests to reach the best acquisition systemToxinantitoxin
There are no images, pictures or cameras involved here. There are CAD drawings (sorry if it wasn't clear, it was specified in the original question).Pigskin
In my mind, there is always a camera on the project. :) I am not sure that OpenCV can directly read and change the format to OpenCV format or not. It would not be a big issue while you can solve this issue with a format convertor APIs. You can handle any type or style of drawings by OpenCV, it is not an issue. If you share a few samples then we can discuss it in detail. :)Toxinantitoxin
O
1

I have the same problem. I compute the adjacent matrix of the vertex weighted with the distances. This compute all the sides length and diagonals. Then if the module of each row or column of the matrix are the same with the other matrix, then the two shapes are the same. For the tolerance just use the function round() before start. The complexity is O(n2 / 2), because you have to compute just an half of the adjacent matrix that is symmetric. The problem is that I cannot detect if a shape is flipped.

Oeildeboeuf answered 17/2, 2016 at 9:59 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.