Python error when importing image_to_string from tesseract
S

5

14

I recently used Tesseract OCR with Python, and I kept getting an error when trying to import image_to_string from the tesseract module.

Code causing the problem:

# Perform OCR using tesseract-ocr library
from tesseract import image_to_string
image = Image.open('input-NEAREST.tif')
print image_to_string(image)

Error caused by above code:

Traceback (most recent call last):
  File "./captcha.py", line 52, in <module>
    from tesseract import image_to_string
ImportError: cannot import name image_to_string

I've verified that the tesseract module is installed:

digital_alchemy@roaming-gnome /home $ pydoc modules | grep 'tesseract'
Hdf5StubImagePlugin _tesseract          gzip                sipconfig
ORBit               cairo               mako                tesseract

I believe that I've grabbed all the required packages but unfortunately I'm just stuck at this point. It appears that the function is not in the module.

Any help greatly appreciated.

Sandarac answered 1/2, 2013 at 5:47 Comment(2)
Try "import tesseract.image_to_string", or even just "import tesseract". - Songful
I think you have the wrong Python bindings... What do you have in vars(tesseract)? - Leakage
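The check suggested in the comments can be sketched as a small diagnostic. This is only an illustration: the helper name describe_module is hypothetical, and it merely reports where a module was loaded from and whether it exposes a given name such as image_to_string.

```python
import importlib


def describe_module(module_name, attribute):
    """Report where a module was loaded from and whether it exposes `attribute`.

    Hypothetical helper for illustration; it is not part of any OCR library.
    """
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # a broken binding can fail in many ways on import
        return {"installed": False, "error": str(exc)}
    return {
        "installed": True,
        "file": getattr(module, "__file__", None),
        "has_attribute": hasattr(module, attribute),
        # Same information as vars(module), limited to public names.
        "public_names": sorted(n for n in vars(module) if not n.startswith("_")),
    }


if __name__ == "__main__":
    # Compare the two candidate bindings mentioned in this thread.
    for name in ("tesseract", "pytesseract"):
        print(name, describe_module(name, "image_to_string"))
```

If the tesseract entry reports has_attribute as False, the installed binding simply does not provide image_to_string, which would match the ImportError in the question.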
T
9

Another possibility that seems to have worked for me is to modify pytesseract so that it uses "from PIL import Image" instead of "import Image".

Code that works in PyCharm after modifying pytesseract:

from pytesseract import image_to_string
from PIL import Image

im = Image.open(r'C:\Users\<user>\Downloads\dashboard-test.jpeg')
print(im)

print(image_to_string(im))

I installed pytesseract via the package manager built into PyCharm.

Tootsie answered 18/6, 2014 at 17:20 Comment(3)
I get an error saying: OSError: [Errno 2] No such file or directory, in File "/usr/lib/python2.7/subprocess.py", line 679, in __init__ errread, errwrite) File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child - Andorra
@C.R.Sharat Yes, a long time ago. I don't remember what solved it. If it helps, I am using PIL==1.1.7, pytesseract==0.1.6 and Pillow==2.9.0, and I have also run sudo apt-get install python-opencv. - Andorra
apt-get install tesseract-ocr # This may solve this issue, @C.R.Sharat - Sprite
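The OSError in the comments above is consistent with pytesseract being unable to launch the tesseract executable. A minimal sketch for checking whether the binary is on the PATH, assuming Python 3.3+ (for shutil.which); the function name locate_binary is hypothetical:

```python
import shutil


def locate_binary(name="tesseract"):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = locate_binary()
    if path is None:
        # pytesseract shells out to this binary, so OCR calls cannot work without it.
        print("tesseract not found on PATH; try: apt-get install tesseract-ocr")
    else:
        print("tesseract found at", path)
```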
H
1

Is your syntax correct for the module you have installed? That image_to_string function looks like it is from PyTesser, per the usage example on this page: https://code.google.com/p/pytesser/

Your import looks like it is for python-tesseract, which has a more complicated usage example listed: https://code.google.com/p/python-tesseract/

Heddy answered 1/2, 2013 at 5:56 Comment(0)
M
1

On Windows, I followed the steps below:

pip3 install pytesseract 
pip3 install pillow

Installing tesseract-ocr is also required (https://github.com/tesseract-ocr/tesseract/wiki); otherwise you will get an error saying that Tesseract is not on the PATH.

Python code

from PIL import Image
from pytesseract import image_to_string

print(image_to_string(Image.open('test.tif'), lang='eng'))
Mccune answered 15/8, 2018 at 2:51 Comment(0)
M
0

What works for me:

After installing Tesseract from tesseract-ocr-setup-3.05.02-20180621.exe, I added the line pytesseract.pytesseract.tesseract_cmd="C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe" and used the code from the answer above. This is the full code:

import pytesseract
from PIL import Image

pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe"
im = Image.open("C:\\Users\\<user>\\Desktop\\ro\\capt.png")
print(pytesseract.image_to_string(im, lang='eng'))

I am using Windows 10 with PyCharm Community Edition 2018.2.3 x64.

Mediative answered 13/9, 2018 at 10:16 Comment(0)
C
0

This worked for me:

import pytesseract
from PIL import Image

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
im = Image.open(r"C:\Users\path\1_python-ocr.jpg")
print(pytesseract.image_to_string(im, lang='eng'))
Cabalism answered 16/2 at 4:52 Comment(0)
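The two Windows answers hard-code different install locations. As a rough sketch, the path can be chosen at runtime instead. The CANDIDATES list and the first_existing helper are hypothetical, the two paths are simply the ones quoted in this thread, and pytesseract.pytesseract.tesseract_cmd is the setting used in the answers above.

```python
import os

# Assumed candidate locations, copied from the answers in this thread; adjust as needed.
CANDIDATES = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
]


def first_existing(paths):
    """Return the first path that points to an existing file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    found = first_existing(CANDIDATES)
    if found is None:
        print("No Tesseract executable found in the candidate locations")
    else:
        try:
            import pytesseract
        except ImportError:
            print("pytesseract is not installed")
        else:
            # Point pytesseract at the executable that actually exists.
            pytesseract.pytesseract.tesseract_cmd = found
            print("Using", found)
```

This keeps the same script usable on machines where Tesseract was installed under either Program Files directory.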
