TL;DR: How do I install tessdata_best to use with pytesseract inside conda on Ubuntu 18?
I have been using pytesseract inside a conda environment for quite some time, but I now need to improve the accuracy, and I found out that tessdata_best gives the best accuracy. How can I install and use that version? I am on Ubuntu 18 and have to work with pytesseract.
I have tesseract installed at /usr/share/tesseract-ocr/ and inside it there is only one tessdata directory.
Do I need to get tessdata_best from GitHub and copy it into /usr/share/tesseract-ocr/ alongside tessdata?
Even then, if I want to use tessdata_best, what do I have to pass? Do I need to change the config to one of --oem 0/1/2/3?
The third and last thing: my language traineddata files are at /home/deshwal/anaconda3/envs/py36/share/tessdata/eng.traineddata. Do I need to paste tessdata_best at this location too? I ask because when I try to change the language directory, I get this error:
/home/deshwal/anaconda3/envs/py36/share/tessdata/equ.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'equ'
Tesseract couldn't load any languages!
Could not initialize tesseract.
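
For context, here is a minimal sketch of how switching the language directory from pytesseract could look. It relies on the Tesseract command-line options --tessdata-dir and --oem, which pytesseract passes through its config argument. The helper names (build_config, missing_languages, ocr_with_best) are hypothetical, and using --oem 1 (LSTM only) is an assumption based on tessdata_best being LSTM models. The directory you pass in would be wherever the tessdata_best files were actually copied.

```python
import os


def build_config(tessdata_dir, oem=1):
    """Build a pytesseract config string that points Tesseract at tessdata_dir."""
    # --tessdata-dir and --oem are Tesseract CLI options; the quoting around
    # the path follows the pattern shown in the pytesseract README.
    # Assumption: oem=1 (LSTM only) suits the tessdata_best models.
    return '--tessdata-dir "{}" --oem {}'.format(tessdata_dir, oem)


def missing_languages(tessdata_dir, langs):
    """Return the entries of langs that have no <lang>.traineddata in tessdata_dir."""
    return [
        lang for lang in langs
        if not os.path.isfile(os.path.join(tessdata_dir, lang + ".traineddata"))
    ]


def ocr_with_best(image_path, tessdata_dir, lang="eng"):
    """Run OCR with the models found in tessdata_dir (hypothetical helper)."""
    # Imported here so the two helpers above work without pytesseract installed.
    import pytesseract
    from PIL import Image

    # Fail early with a clear message instead of Tesseract's loading error.
    missing = missing_languages(tessdata_dir, [lang])
    if missing:
        raise FileNotFoundError("No traineddata for: " + ", ".join(missing))
    return pytesseract.image_to_string(
        Image.open(image_path), lang=lang, config=build_config(tessdata_dir)
    )
```

Calling missing_languages on the target directory before running OCR shows which traineddata files (such as equ in the error above) are not present there.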