Using .save_word2vec_format() saves just the full-word vectors, in the simple format used by Google's original word2vec.c release. It doesn't save anything unique to a full FastText model, such as the subword n-gram vectors. Such files would be reloaded with the matching .load_word2vec_format().
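To make "simple format" concrete, here is a minimal sketch of the plain-text variant of the word2vec format: a header line with the vocabulary size and the vector dimensionality, then one line per word holding the word followed by its vector components. The helper names write_word2vec_text and read_word2vec_text are hypothetical and this is not gensim's implementation. It only illustrates why nothing beyond full-word vectors fits in such a file.

```python
import os
import tempfile


def write_word2vec_text(path, vectors):
    """Write {word: [floats]} in the plain-text word2vec layout (illustrative only)."""
    dim = len(next(iter(vectors.values())))
    with open(path, "w", encoding="utf-8") as f:
        # Header: "<number of words> <vector dimensionality>"
        f.write(f"{len(vectors)} {dim}\n")
        for word, vec in vectors.items():
            # One word per line, followed by its space-separated components.
            f.write(word + " " + " ".join(repr(float(x)) for x in vec) + "\n")


def read_word2vec_text(path):
    """Read the layout written above back into a {word: [floats]} dict."""
    with open(path, encoding="utf-8") as f:
        count, dim = map(int, f.readline().split())
        vectors = {}
        for _ in range(count):
            parts = f.readline().rstrip("\n").split(" ")
            vec = [float(x) for x in parts[1:]]
            if len(vec) != dim:
                raise ValueError(f"expected {dim} components for {parts[0]!r}")
            vectors[parts[0]] = vec
    return vectors


# Toy data: there is no slot here for n-gram buckets or training state.
toy = {"cat": [0.1, 0.2, 0.3], "dog": [0.4, 0.5, 0.6]}
path = os.path.join(tempfile.mkdtemp(), "toy.vec")
write_word2vec_text(path, toy)
restored = read_word2vec_text(path)
```

Anything FastText-specific, like the ability to synthesize vectors for out-of-vocabulary words, has nowhere to live in this layout.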
The .load_facebook_format() method loads files in the format saved by Facebook's original (non-Python) FastText code release. (The name of this method is pretty misguided, since 'facebook' could mean so many different things other than a specific data format.) Gensim doesn't have a matching method for saving to this same format, though it probably wouldn't be very hard to implement, and supporting this export option would make sense for symmetry.
Gensim's models typically implement gensim-native .save() and .load() options, which use a mix of Python 'pickle' serialization and raw large-array files. These are your best options if you want to save the full model state for later reloading back into Gensim.
(Such files can't be loaded by other FastText implementations.)
Be sure to keep the multiple related files written by this .save() (all with the same user-supplied prefix) together when moving the saved model to a new location.
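As a rough sketch of keeping those files together, the helper below moves every file whose name starts with the path originally passed to .save(). The function name move_saved_model is hypothetical, and it assumes all the side files really do share that path as a prefix. It uses only the standard library and knows nothing about gensim itself.

```python
import glob
import os
import shutil


def move_saved_model(prefix, dest_dir):
    """Move `prefix` and every sibling file whose name starts with it into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    # glob.escape guards against special characters in the directory or file name.
    for src in sorted(glob.glob(glob.escape(prefix) + "*")):
        dst = os.path.join(dest_dir, os.path.basename(src))
        shutil.move(src, dst)
        moved.append(dst)
    return moved
```

For example, move_saved_model("models/ft.model", "backup/") would carry along the main pickle plus any large-array files saved next to it (the paths here are made up). Note that a plain prefix match also picks up unrelated files that happen to start with the same name, so choose a distinctive prefix when saving.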
Update (May 2020): Recent versions of gensim, such as 3.8.3 and later, include a new contributed FastText.save_facebook_model() method which saves to the original Facebook FastText binary format.
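A minimal sketch of that export, under some assumptions: in the gensim releases I'm aware of, save_facebook_model and load_facebook_model are exposed as module-level functions in gensim.models.fasttext rather than as methods on the class, so check the import path against your installed version. The wrapper name export_to_facebook_bin is hypothetical.

```python
def export_to_facebook_bin(model, path):
    """Save a trained gensim FastText model in Facebook's .bin format, then reload it."""
    # Imported inside the function so the helper can be defined even where
    # gensim is not installed. Assumption: gensim 3.8.3 or later, where these
    # two names are module-level functions in gensim.models.fasttext.
    from gensim.models.fasttext import load_facebook_model, save_facebook_model

    save_facebook_model(model, path)
    # Reloading is a quick sanity check that the written file is readable.
    return load_facebook_model(path)
```

Typical use would be something like reloaded = export_to_facebook_bin(trained_model, "model.bin"), where trained_model is a FastText model you have already trained. A file written this way should also be loadable by Facebook's own FastText tooling, which is the point of this format over the gensim-native .save().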