I'm using tess-two (https://github.com/rmtheis/tess-two), the popular fork of the Tesseract OCR engine for Android. I integrated all the stuff and it works, etc.
But I need to detect only digits. My code for now is:
TessBaseAPI baseApi = new TessBaseAPI();
baseApi.init(pathToLngFile, langName);
baseApi.setImage(bitmap);
String recognizedText = baseApi.getUTF8Text();
baseApi.end();
doSomething(recognizedText);
I looked at the FAQ here: https://code.google.com/p/tesseract-ocr/wiki/FAQ#How_do_I_recognize_only_digits?
I'm using version 3, and for that version the FAQ gives no code solution, only a command-line one, which isn't relevant for an Android project (I think...). So I tried to implement the solution for versions before 3 and added this line:
baseApi.setVariable("tessedit_char_whitelist", "0123456789");
My question is: what should I do about init()? I don't need any language, but I still need to initialize the API, and there is no init() method that takes no language...
EDIT: To be more specific:
My end goal is to process a plain document (not a pure Excel sheet) that looks like the attached picture (a header and 3 columns separated by white space).
My requirement is to make sense of the digits: to be able to separate them and determine which digits belong to which row and column.
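To illustrate what I mean by rows and columns, here is a rough sketch in plain Java of the grouping I have in mind. It assumes I can somehow get a bounding box for each recognized number; the `DigitGrid` and `Token` names and the two pixel tolerances are made up for this example and are not part of tess-two. Tokens whose vertical centers are close together are treated as one row, and tokens whose horizontal centers are close together as one column.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class DigitGrid {

    /** One recognized token plus its bounding box in image pixels (hypothetical type). */
    static final class Token {
        final String text;
        final int left, top, right, bottom;

        Token(String text, int left, int top, int right, int bottom) {
            this.text = text;
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }

        int centerX() { return (left + right) / 2; }
        int centerY() { return (top + bottom) / 2; }
    }

    /**
     * Sorts the values and starts a new cluster whenever the gap to the
     * previous sorted value exceeds the tolerance. Returns, for each input
     * position, the index of the cluster it fell into.
     */
    static int[] cluster(int[] values, int tolerance) {
        Integer[] order = new Integer[values.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingInt(i -> values[i]));

        int[] result = new int[values.length];
        int current = 0;
        for (int k = 0; k < order.length; k++) {
            if (k > 0 && values[order[k]] - values[order[k - 1]] > tolerance) current++;
            result[order[k]] = current;
        }
        return result;
    }

    /**
     * Places every token into a [row][column] cell. Cells with no token stay
     * empty; tokens sharing a cell are joined with a space in input order.
     */
    static String[][] toGrid(List<Token> tokens, int rowTolerance, int colTolerance) {
        int n = tokens.size();
        int[] xs = new int[n];
        int[] ys = new int[n];
        for (int i = 0; i < n; i++) {
            xs[i] = tokens.get(i).centerX();
            ys[i] = tokens.get(i).centerY();
        }
        int[] rows = cluster(ys, rowTolerance);
        int[] cols = cluster(xs, colTolerance);

        int rowCount = 0, colCount = 0;
        for (int i = 0; i < n; i++) {
            rowCount = Math.max(rowCount, rows[i] + 1);
            colCount = Math.max(colCount, cols[i] + 1);
        }

        String[][] grid = new String[rowCount][colCount];
        for (String[] row : grid) Arrays.fill(row, "");
        for (int i = 0; i < n; i++) {
            String cell = grid[rows[i]][cols[i]];
            String text = tokens.get(i).text;
            grid[rows[i]][cols[i]] = cell.isEmpty() ? text : cell + " " + text;
        }
        return grid;
    }
}
```

With this, a call like `DigitGrid.toGrid(tokens, 10, 30)` would give me a 2D array where a missing value in the middle column shows up as an empty string instead of shifting the rest of the row. The tolerances would have to be tuned to the font size and column spacing of the actual scan, and this simple gap-based approach would probably break on skewed images.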
Thanks,