Error when loading FastText's French pre-trained model with gensim

I am trying to use FastText's French pre-trained binary model (downloaded from FastText's official GitHub page). I need the .bin model rather than the .vec word vectors so that I can approximate vectors for misspelled and out-of-vocabulary words.

However, when I try to load the model using:

from gensim.models import FastText
model = FastText.load_fasttext_format('french_bin_model_path')

I get the following error:

NotImplementedError: Supervised fastText models are not supported

Surprisingly, it works just fine when I load the English binary model.

I am running Python 3.6 and gensim 3.5.0.

Any ideas as to why it doesn't work with the French vectors are welcome!

Cindiecindra answered 23/7, 2018 at 14:43 Comment(0)

I ran into the same problem and ended up using Facebook's Python wrapper for FastText instead of gensim's implementation.

import fastText
# load_model is the wrapper's loader for .bin files
model = fastText.load_model(path_to_french_bin)

Then you can get word vectors for out-of-vocabulary words like so:

oov_vector = model.get_word_vector(oov_word)
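
For intuition about why the .bin model can do this while the .vec file cannot: the binary file also stores vectors for character n-grams, and an unseen word's vector is built from the n-grams it contains. The snippet below is only a rough sketch of that idea, not the library's actual code. The helper names (`char_ngrams`, `oov_vector`), the toy vectors, and the n-gram lengths 3 to 6 are assumptions for illustration, and a plain dict lookup stands in for the hashing that fastText uses to map n-grams to vectors.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, wrapped in '<' and '>' markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


def oov_vector(word, ngram_vectors, dim):
    """Average the vectors of the n-grams that are known; zeros if none are."""
    known = [ngram_vectors[g] for g in char_ngrams(word) if g in ngram_vectors]
    if not known:
        return [0.0] * dim
    return [sum(component) / len(known) for component in zip(*known)]


# Toy n-gram table (made-up values), just to show the mechanics.
toy_vectors = {"<ch": [1.0, 0.0], "at>": [3.0, 2.0]}
print(char_ngrams("chat", 3, 4))
print(oov_vector("chat", toy_vectors, dim=2))
```

A misspelled word shares most of its n-grams with the correct spelling, which is why its composed vector tends to land close to the original.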

As for why gensim's load_fasttext_format works for the English model but not the French one, I don't know!

Fitzwater answered 3/8, 2018 at 12:51 Comment(1)
That is indeed the only workaround I found. Thank you! Cindiecindra

I have never used FastText, but the problem might be the encoding of your file. Try changing it to UTF-8 if you are on macOS, or to Latin-1 if you are on Windows.

Channa answered 3/8, 2018 at 13:21 Comment(0)
