Use of UnstructuredPDFLoader: unstructured package not found, please install it with `pip install unstructured`

3

12

I have a newly created Anaconda environment (conda 22.9.0, Python 3.10.10). I installed langchain with pip install langchain (conda install langchain does not work). According to the quickstart guide I have to install a model provider, so I installed openai (pip install openai).

Then I open the Python console and try to load a PDF with the UnstructuredPDFLoader class, and I get the following error. What could the problem be?

(langchain) C:\Users\user>python
Python 3.10.10 | packaged by Anaconda, Inc. | (main, Mar 21 2023, 18:39:17) [MSC v.1916 64 bit (AMD64)] on win32
>>> from langchain.document_loaders import UnstructuredPDFLoader
>>> loader = UnstructuredPDFLoader("C:\\<path-to-data>\\data\\name-of-file.pdf")
Traceback (most recent call last):
  File "C:\<path-to-anaconda>\envs\langchain\lib\site-packages\langchain\document_loaders\unstructured.py", line 32, in __init__
    import unstructured  # noqa:F401
ModuleNotFoundError: No module named 'unstructured'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\<path-to-anaconda>\envs\langchain\lib\site-packages\langchain\document_loaders\unstructured.py", line 90, in __init__
    super().__init__(mode=mode, **unstructured_kwargs)
  File "C:\<path-to-anaconda>\envs\langchain\lib\site-packages\langchain\document_loaders\unstructured.py", line 34, in __init__
    raise ValueError(
ValueError: unstructured package not found, please install it with `pip install unstructured`
Loosetongued answered 28/3, 2023 at 8:44 Comment(2)
PDF Loaders from LangChain. If unstructured gives you a hard time, try PyPDFLoader: from langchain.document_loaders import UnstructuredPDFLoader, OnlinePDFLoader, PyPDFLoader - Tidings
I have the same problem with it. I installed everything they listed. Same for BS4. I think there's an issue with langchain. - Hbomb
8

Run `pip install unstructured`, or `pip install "unstructured[local-inference]"`.
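If the error persists after installing, it may be worth confirming that the package landed in the same environment the console is running from. The following is a minimal sketch using only the standard library; the helper name `is_installed` is made up for illustration and is not part of langchain or unstructured.

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if this interpreter can locate the top-level module."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Shows which Python is running, e.g. the one inside the conda env.
    print("Interpreter:", sys.executable)
    for name in ("langchain", "unstructured"):
        status = "installed" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If `unstructured` shows as missing here even though pip reported success, pip probably installed into a different interpreter. Running `python -m pip install unstructured` from the activated environment ties the install to the interpreter printed above.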

Samford answered 28/3, 2023 at 9:0 Comment(0)
4

Run this:

pip install unstructured==0.5.6
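To see which version actually ended up installed after pinning, a small check like the one below can help. It is only a sketch built on the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string if the package is installed, otherwise None.
    print("unstructured:", installed_version("unstructured"))
```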
Malapropos answered 3/7, 2023 at 17:48 Comment(1)
I installed it without any problem with pip install unstructured==0.5.6 --user - Venenose
1

The LangChain docs have a page on installing dependencies. The one the author mentioned was not the only one I was missing: https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/unstructured_file.html

Cornela answered 3/4, 2023 at 10:12 Comment(0)
