How to add a spaCy model to a requirements.txt file?
O

3

31

I have an app that uses the spaCy model "en_core_web_sm". I have tested the app on my local machine and it works fine.

However, when I deploy it to Heroku, it gives me this error:

"Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory."

My requirements file contains spacy==2.2.4.

I have been doing some research on this error and found that the model needs to be downloaded separately using this command: python -m spacy download en_core_web_sm

I have been looking for a way to add the model to my requirements.txt file but haven't found one that works.

I also tried adding the following line to the requirements file:

-e git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz#egg=en_core_web_sm==2.2.0

but it gave this error:

"Cloning git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz to /app/.heroku/src/en-core-web-sm

Running command git clone -q git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz /app/.heroku/src/en-core-web-sm

fatal: remote error: explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz is not a valid repository name"

Is there a way to install this spaCy model from the requirements file, or is there another fix?

Thank you.

Outsize answered 9/5, 2020 at 19:15 Comment(4)
You're getting that error because that's a URL to an archive file, not a repository. You need to pass the URL of a repository for git to be able to clone it. - Weariful
Thanks Swetank, I'm not able to figure out what that URL would be. Would you be able to help, please? Thank you so much in advance. - Outsize
The answer below has been edited to answer your question! :D - Weariful
Thanks Swetank, the edited answer still gives an error: "Cloning git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz to /tmp/pip-req-build-at911nv7 Running command git clone -q git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz /tmp/pip-req-build-at911nv7 fatal: remote error: explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz is not a valid repository name" - Outsize
O
7

OK, after some more searching I found a solution that worked:

I downloaded the tarball from the URL that @tausif shared in his answer to my local machine.

I saved it in the directory that contains my requirements.txt file.

Then I added this line to my requirements.txt file: ./en_core_web_sm-2.2.5.tar.gz

I then deployed to Heroku. The deployment succeeded and the app now works.
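
A local-path entry like this only installs if the tarball is actually shipped with the app. As a rough pre-deploy check, the sketch below lists any "./..." entries in a requirements file whose target file is missing. The function name missing_local_requirements is made up for illustration, and the sketch assumes pip is run from the directory that holds requirements.txt, so relative paths are resolved against that directory.

```python
from pathlib import Path
from typing import List


def missing_local_requirements(requirements_file: str) -> List[str]:
    """Return the './...' entries of a requirements file whose target is absent.

    Hypothetical helper: assumes relative paths are resolved against the
    directory that contains the requirements file.
    """
    req_path = Path(requirements_file)
    missing = []
    for raw in req_path.read_text().splitlines():
        line = raw.strip()
        # Only plain relative paths such as ./en_core_web_sm-2.2.5.tar.gz
        # are checked; version pins and URLs are ignored.
        if line.startswith("./") and not (req_path.parent / line).exists():
            missing.append(line)
    return missing


if __name__ == "__main__":
    if Path("requirements.txt").exists():
        for entry in missing_local_requirements("requirements.txt"):
            print(f"missing local requirement: {entry}")
```

An empty result means every local entry points at a file that exists next to requirements.txt, which is what the steps above set up.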

Outsize answered 10/5, 2020 at 7:8 Comment(2)
Check the edit to my answer; it may be a cleaner way to do this, if it works. - Rheinlander
Thank you so much Tausif, I will test the latest edit to your answer the next time I update the app and will report back here. - Outsize
R
29

Add this to your deployment step; if you are using Docker, add it to your Dockerfile:

pip3 install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz --user

EDIT

Add these two lines to requirements.txt:

spacy>=2.2.0,<3.0.0
https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz#egg=en_core_web_sm

See the "Downloading and requiring model dependencies" section of the spaCy docs.

For more detail on how to add a GitHub source, see this and follow YPCrumble's answer.
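
The release URLs quoted in this thread all follow the same pattern, so the requirements.txt entries can be generated rather than typed by hand. This is only a sketch: the helper names are invented here, and the URL pattern is inferred from the links above rather than taken from any documented API, so check that the release actually exists before relying on it.

```python
from typing import List

# Base of the release URLs quoted in this thread.
RELEASES = "https://github.com/explosion/spacy-models/releases/download"


def model_tarball_url(model: str, version: str) -> str:
    """Build a tarball URL following the pattern seen in this thread (assumed)."""
    return f"{RELEASES}/{model}-{version}/{model}-{version}.tar.gz"


def requirements_lines(model: str, version: str, spacy_spec: str) -> List[str]:
    """Return the spaCy pin and the model URL as two separate requirement lines."""
    return [
        f"spacy{spacy_spec}",
        f"{model_tarball_url(model, version)}#egg={model}",
    ]


if __name__ == "__main__":
    for line in requirements_lines("en_core_web_sm", "2.2.0", ">=2.2.0,<3.0.0"):
        print(line)
```

Note that the URL is a plain https:// link with no -e or git:// prefix; with those prefixes pip tries to clone the tarball as a repository, which is the failure reported in the question.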

Rheinlander answered 9/5, 2020 at 19:28 Comment(2)
Thanks Tausif, is there a way I can add this to the requirements file? I'm not using Docker. - Outsize
Thanks Tausif, it still gives an error: "Cloning git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz to /tmp/pip-req-build-at911nv7 Running command git clone -q git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz /tmp/pip-req-build-at911nv7 fatal: remote error: explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz is not a valid repository name" - Outsize
F
16

For en-core-web-sm==3.0.0, this worked for me.

In requirements.txt, replace the line "en-core-web-sm==3.0.0" with:

en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0-py3-none-any.whl
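
After a deploy, it can help to confirm at startup that the model distribution was really installed at the pinned version, instead of waiting for spacy.load to fail. The sketch below uses only the standard library's importlib.metadata (Python 3.8+). The helper names are hypothetical, and it assumes the distribution is registered under the name used in the requirements line, "en-core-web-sm".

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def check_model(distribution: str, expected: str) -> str:
    """Describe whether the model distribution matches the pinned version."""
    found = installed_version(distribution)
    if found is None:
        return f"{distribution} is not installed"
    if found != expected:
        return f"{distribution} is {found}, expected {expected}"
    return f"{distribution} {found} is installed"


if __name__ == "__main__":
    # Assumed name and version, taken from the requirements line above.
    print(check_model("en-core-web-sm", "3.0.0"))
```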
Flavorsome answered 9/6, 2021 at 19:39 Comment(1)
This is the accepted answer, works 100%. - Indelicacy

© 2022 - 2024 — McMap. All rights reserved.