spaCy needs a file that is not there: strings.json

I am running pytextrank, and in its second stage I get this error from spaCy:

File "C:\Anaconda3\lib\pathlib.py", line 371, in wrapped return strfunc(str(pathobj), *args)

FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Anaconda3\\lib\\site-packages\\spacy\\data\\en\\vocab\\strings.json'

I looked for strings.json, but no such file exists.

Interestingly, a similar error involving pathlib.py appeared when I installed spaCy, with the following message:

OSError: Symbolic link privilege not held

Does anyone have any idea? Thanks

Exhale answered 26/3, 2017 at 2:0 Comment(3)
Is spaCy broken? (Centering)
I have no idea. Apparently, 'spacy/vocab.pyx' looks for 'strings.json'. (Exhale)
FWIW, we're adding support in pytextrank for installing directly through Anaconda; currently it has PyPI support. (Boarhound)

Finally, I can answer a question on Stack Overflow. I ran into the same problem and eventually solved it. Here is my suggestion:

1. Download a spaCy model, either with python -m spacy or from GitHub

Both ways are convenient.

1). With python -m spacy:

python3 -m spacy download en

Assuming you are using Python 3, this downloads the model and installs it as a package automatically. You can then import it via import en or load it with spacy.load('en').

2). From GitHub

Follow the link, select the newest version, and download it.

2. Link your downloaded model (only needed if you did not use the python -m spacy download route and have to link the model manually)

This is the most important part. You must unpack the downloaded tar or gzip archive to get a folder. However, this is still not the path you want to link.

.
├── en_core_web_md-1.2.1
│   ├── deps
│   │   ├── config.json
│   │   └── model
│   ├── meta.json
│   ├── ner
│   │   ├── config.json
│   │   └── model
│   ├── pos
│   │   ├── config.json
│   │   └── model
│   └── vocab
│       ├── gazetteer.json
│       ├── lexemes.bin
│       ├── oov_prob
│       ├── serializer.json
│       ├── strings.json
│       └── vec.bin

You must link the folder that has this structure (here, en_core_web_md-1.2.1). spaCy then finds that folder through the shortcut name you give the link.

Here is the link script you need:
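Before linking, it can help to confirm that the folder really has this structure. The following is only a minimal sketch: missing_model_files is a hypothetical helper (not part of spaCy), and the file names are simply copied from the vocab/ folder in the listing above, so they may differ for other model versions.

```python
from pathlib import Path

# File names copied from the vocab/ folder in the listing above
# (an assumption: other model versions may ship different files).
EXPECTED_VOCAB_FILES = (
    "gazetteer.json",
    "lexemes.bin",
    "oov_prob",
    "serializer.json",
    "strings.json",
    "vec.bin",
)


def missing_model_files(model_dir):
    """Return the expected files that are absent under model_dir.

    Hypothetical helper for illustration; it is not a spaCy API.
    """
    root = Path(model_dir)
    expected = [Path("meta.json")]
    expected += [Path("vocab") / name for name in EXPECTED_VOCAB_FILES]
    # Report paths relative to model_dir, with forward slashes.
    return [p.as_posix() for p in expected if not (root / p).is_file()]
```

Calling missing_model_files("en_core_web_md-1.2.1") from the directory that contains the unpacked model should return an empty list if the folder is the right one to link. A result containing vocab/strings.json suggests you are pointing one level too high or too low.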

# Link the unpacked model folder under the shortcut name en_core_web
base_path="$(pwd)"
sudo python3 -m spacy link "${base_path}/en_core_web_md-1.2.1" en_core_web --force

You can save this as a .sh file next to that folder and run it.

That's it!

Baltic answered 29/3, 2017 at 14:24 Comment(0)

The Symbolic link privilege not held error usually occurs when you've installed spaCy and the models into a system directory, but your user does not have the required permissions to create symbolic links. To solve this, either run download or link again as administrator or, if that's not possible, use a virtualenv to install everything into a user directory instead (for more info on this, see the troubleshooting docs).

As of v1.7.0, spaCy creates symlinks (a.k.a. shortcut links) in the spacy/data directory. This makes it easier to store your models wherever you want, install them as Python packages and load them using custom names, e.g. spacy.load('my_model').

What likely happened in your case is that spaCy failed to set up this link because of the permissions error, and now can't find and load the model, including vocab/strings.json. (The way spaCy failed here is not ideal, though; this has since been fixed in v1.7.3.)
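To check whether your user is allowed to create symlinks at all, you can try creating a throwaway one. This is a rough sketch using only the standard library; can_create_symlinks is a hypothetical helper, not something spaCy provides, and it only approximates what spaCy does when it links a model.

```python
import os
import tempfile


def can_create_symlinks(directory=None):
    """Return True if a directory symlink can be created under `directory`.

    Hypothetical helper: creates a temporary folder (in the system temp
    location when `directory` is None), tries to symlink to a subfolder,
    and cleans up afterwards.
    """
    try:
        with tempfile.TemporaryDirectory(dir=directory) as tmp:
            target = os.path.join(tmp, "target")
            link = os.path.join(tmp, "link")
            os.mkdir(target)
            # On Windows this raises OSError when the privilege is missing.
            os.symlink(target, link, target_is_directory=True)
            return os.path.islink(link)
    except (OSError, NotImplementedError):
        # Covers missing privileges, unwritable or nonexistent directories.
        return False
```

If this returns False for the directory where spaCy keeps its data, running the link command as administrator or working inside a virtualenv, as described above, is the likely fix.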

Since the model is already installed, all you'd have to do is create a new symlink for it (either as admin, or in a virtualenv):

python -m spacy link en_core_web_sm en

(If you've downloaded a different model, simply replace en_core_web_sm with the name of that model. en is the shortcut to use and can be any name you want.)
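If you are not sure which shortcut name ended up linked, a small helper can try several names in turn. This is only a sketch: load_first_available is a hypothetical function, and it assumes a missing model surfaces as an OSError (as the FileNotFoundError in the question does), which may not hold for every spaCy version.

```python
def load_first_available(names, loader=None):
    """Try each model name in order and return (name, loaded_model).

    Hypothetical helper. `loader` defaults to spacy.load; it is a
    parameter so the logic can be exercised without spaCy installed.
    """
    if loader is None:
        import spacy  # imported lazily, only when really loading a model

        loader = spacy.load
    errors = {}
    for name in names:
        try:
            return name, loader(name)
        except OSError as exc:
            # Assumption: a missing or unlinked model raises OSError
            # (FileNotFoundError is a subclass of it).
            errors[name] = exc
    raise OSError("No model could be loaded: %r" % errors)
```

For example, name, nlp = load_first_available(['en', 'en_core_web_sm']) would fall back to the package name if the en shortcut is missing.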

Edit: In case you only want to use the tokenizer and don't care about the models, or want to use one of the supported languages that don't yet come with a statistical model, you can also just import the Language class in v1.7.3:

from spacy.fr import French
nlp = French()
Emaciation answered 28/3, 2017 at 14:51 Comment(1)
Thanks a lot, your description was really helpful. (Exhale)
