Incomplete coordinate values for Google Vision OCR

Asked 7/9, 2016 at 20:55 Answered 7/9, 2016 at 21:2

I have a script that is iterating through images of different forms. When parsing the Google Vision Text detection response, I use the XY coordinates in the 'boundingPoly' for each text item to specifically look for data in different parts of the form.

The problem I'm having is that some of the responses come back with only an X coordinate. Example:

{u'description': u'sometext', u'boundingPoly': {u'vertices': [{u'x': 5595}, {u'x': 5717}, {u'y': 122, u'x': 5717}, {u'y': 122, u'x': 5595}

I've wrapped the lookup in a try/except (using Python 2.7) to catch this, and the error is always the same: KeyError: 'y'. I'm iterating through thousands of forms; so far it has happened on 10 rows out of 1000.

Has anyone had this issue before? Is there a fix other than re-submitting the request whenever this error occurs?

Flint answered 7/9, 2016 at 20:55 Comment(0)

From the docs:

boundingPoly

object(BoundingPoly)

The bounding polygon around the face. The coordinates of the bounding box are in the original image's scale, as returned in ImageParams. The bounding box is computed to "frame" the face in accordance with human expectations. It is based on the landmarker results. Note that one or more x and/or y coordinates may not be generated in the BoundingPoly (the polygon will be unbounded) if only a partial face appears in the image to be annotated.

I believe this implies that the missing 'y' value in this case is 0 or, more generally, an edge value. In other words, the API doesn't know where the bounding polygon truly ends: the text runs all the way to the edge of the image, so the image doesn't give enough information to know for sure that the text actually ends there. As far as the image shows, it ends at a 'y' of 0.
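If a missing coordinate really does mean 0, as suggested above, one way to avoid the KeyError is to fill in the default while reading the vertices instead of catching the exception. The sketch below is only an illustration under that assumption: `normalize_vertices` is a hypothetical helper name, not part of any Google client library, and the sample annotation copies the vertices from the question (with the closing brackets added so that it runs).

```python
def normalize_vertices(vertices, default=0):
    """Return (x, y) tuples, filling any missing coordinate with `default`.

    Assumption: an absent 'x' or 'y' key means the coordinate is 0.
    """
    return [(v.get('x', default), v.get('y', default)) for v in vertices]


# Annotation shaped like the one in the question.
annotation = {
    'description': 'sometext',
    'boundingPoly': {'vertices': [
        {'x': 5595},
        {'x': 5717},
        {'y': 122, 'x': 5717},
        {'y': 122, 'x': 5595},
    ]},
}

points = normalize_vertices(annotation['boundingPoly']['vertices'])
print(points)  # [(5595, 0), (5717, 0), (5717, 122), (5595, 122)]
```

Using `dict.get` with a default keeps the loop over thousands of forms free of try/except blocks, and it works the same way under Python 2.7 and Python 3.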

Thor answered 7/9, 2016 at 21:2 Comment(2)

This makes sense given where it occurs; the text that doesn't have a y coordinate is at the top of the image (which could have a y of 0). – Flint 7/9, 2016 at 21:24

How is one supposed to know what direction the polygon is unbounded in (i.e. what edge are the missing coordinates on)? – Islamite 19/3 at 14:42
