Limit Detection Area in Vision API

It seems I've found myself in the deep weeds of the Google Vision API for barcode scanning. Perhaps my mind is a bit fried after looking at all sorts of alternative libraries (ZBar, ZXing, and even some for-cost third party implementations), but I'm having some difficulty finding any information on where I can implement some sort of scan region limiting.

The use case is a pretty simple one: if I'm a user pointing my phone at a box with multiple barcodes of the same type (think shipping labels here), I want to explicitly point some little viewfinder or alignment straight-edge on the screen at exactly the thing I'm trying to capture, without having to worry about anything outside that area of interest giving me some scan results I don't want.

The above case is handled in most other Android libraries I've seen, taking in either a Rect with relative or absolute coordinates, and this is also a part of iOS' AVCapture metadata results system (it uses a relative CGRect, but really the same concept).

I've dug pretty deep into the sample app for the barcode-reader here, but the implementation is a tad opaque, and it's hard to get anything beyond the high-level details out of it.

It seems like an ugly patch to let detection run across the camera's entire preview frame and then simply no-op on any barcode outside the area of interest, since the device is still working hard to process the full frames.

Am I missing something very simple and obvious on this one? Any ideas on a way to implement this cleanly, otherwise?

Many thanks for your time in reading through this!

Intima answered 18/2, 2016 at 18:31 Comment(0)

The API currently does not have an option to limit the detection area. But you could crop the preview image before it gets passed into the barcode detector. See here for an outline of how to wrap a detector with your own class:

Mobile Vision API - concatenate new detector object to continue frame processing

You'd implement the "detect" method to take the frame received from the camera, create a cropped version of the frame, and pass that through to the underlying detector.
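A minimal sketch of the cropping step itself, assuming the camera frame is available as a row-major, one-byte-per-pixel luminance buffer (for example, the Y plane of an NV21 preview frame). The names `FrameCropper` and `cropLuma` are hypothetical and not part of the Vision API; inside the wrapper's "detect" method, the cropped data would then be turned into a new frame for the underlying detector.

```java
import java.util.Arrays;

final class FrameCropper {
    /**
     * Copies the region starting at (left, top) with size cropWidth x cropHeight
     * out of a row-major, one-byte-per-pixel luminance buffer.
     */
    static byte[] cropLuma(byte[] src, int srcWidth, int srcHeight,
                           int left, int top, int cropWidth, int cropHeight) {
        if (src.length < srcWidth * srcHeight) {
            throw new IllegalArgumentException("buffer smaller than frame size");
        }
        if (left < 0 || top < 0 || cropWidth <= 0 || cropHeight <= 0
                || left + cropWidth > srcWidth || top + cropHeight > srcHeight) {
            throw new IllegalArgumentException("crop region outside frame");
        }
        byte[] out = new byte[cropWidth * cropHeight];
        for (int row = 0; row < cropHeight; row++) {
            // Copy one row of the region at a time.
            System.arraycopy(src, (top + row) * srcWidth + left,
                             out, row * cropWidth, cropWidth);
        }
        return out;
    }

    public static void main(String[] args) {
        // Tiny 4x3 "frame" where each pixel value equals its index.
        byte[] frame = new byte[12];
        for (int i = 0; i < frame.length; i++) {
            frame[i] = (byte) i;
        }
        byte[] crop = cropLuma(frame, 4, 3, 1, 1, 2, 2);
        System.out.println(Arrays.toString(crop)); // prints [5, 6, 9, 10]
    }
}
```

Any coordinates the detector reports for a cropped frame are relative to the crop, so add `left` and `top` back before drawing overlays on the full preview.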

Crapulent answered 19/2, 2016 at 16:0 Comment(5)
Is the Google Mobile Vision API free? I need to use it for face detection. There is no pricing info nor a direct means to contact support on the website. - Kristikristian
Thanks for the update on this. Stinks to hear there's no direct implementation to hook into for this, but I took your advice and have a functioning version of a cropped scanning area. I've got two versions, as well: a true crop of a bitmap that gets sent to the detection client, and a simple "Does Rect X contain a vertex of Scanning Result Rect Y?" check in the callback of the sample graphic factory. - Intima
@Crapulent I am working on an OCR app. I need to capture text from the area that is Rect(0, 306 - 720, 489) and ignore the rest of the image. I tried modifying the frame in the ocr-reader app, at mDetector.receiveFrame(f) in CameraSource, but text is not detected. Can you please give a working example? - Betulaceous
@TikiMcFee Can you add your cropped solution here as an answer? It will help many people like me. - Grounds
@Grounds - I'm sorry this is 5 years late, but I keep getting little bumps that folks are looking for this. For you and the others, I cannot post the code - it's locked up in an enterprise app solution and not open source. The idea, however, is to essentially just capture the photo and, in any way you can, crop the result with a literal width/height limit of whatever you want. You could use the OS intrinsics for Bitmap manipulation or otherwise, but that'll do the trick. developer.android.com/reference/android/graphics/Bitmap Hopefully that's a good enough starting point! - Intima
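
For the second approach mentioned in the comments (checking whether the region of interest contains a vertex of a detected bounding box), here is a rough sketch in plain Java. `RegionFilter` and `Box` are hypothetical names; `Box` is only a stand-in for a rectangle type such as android.graphics.Rect, with right and bottom treated as exclusive. Note that this only filters results after detection, so the detector still processes the whole frame.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RegionFilter {
    /** Minimal rectangle stand-in; right and bottom are exclusive. */
    static final class Box {
        final int left, top, right, bottom;

        Box(int left, int top, int right, int bottom) {
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }

        boolean containsPoint(int x, int y) {
            return x >= left && x < right && y >= top && y < bottom;
        }

        /** True if at least one corner pixel of other lies inside this box. */
        boolean containsAnyCornerOf(Box other) {
            return containsPoint(other.left, other.top)
                    || containsPoint(other.right - 1, other.top)
                    || containsPoint(other.left, other.bottom - 1)
                    || containsPoint(other.right - 1, other.bottom - 1);
        }
    }

    /** Keeps only the detections with at least one corner inside the region. */
    static List<Box> keepInsideRegion(Box region, List<Box> detections) {
        List<Box> kept = new ArrayList<>();
        for (Box d : detections) {
            if (region.containsAnyCornerOf(d)) {
                kept.add(d);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Box viewfinder = new Box(100, 200, 500, 400);
        List<Box> detections = Arrays.asList(
                new Box(150, 250, 300, 300),   // fully inside the viewfinder
                new Box(0, 0, 50, 50),         // fully outside
                new Box(450, 350, 600, 450));  // overlaps one corner
        System.out.println(keepInsideRegion(viewfinder, detections).size()); // prints 2
    }
}
```

A stricter variant would require all four corners to fall inside the region, which avoids accepting a barcode that only grazes the edge of the viewfinder.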
