Creating a word2vec model: files with the .syn1neg.npy extension are not created
When I create the model, the files ending with these extensions are no longer created:

.syn1neg.npy

.syn0.npy

My code is below:

import gensim
import nltk

# x and y are lists of UTF-8 encoded sentences (byte strings)
corpus = x + y
tok_corp = [nltk.word_tokenize(sent.decode('utf-8')) for sent in corpus]
model = gensim.models.Word2Vec(tok_corp, min_count=1, size=32)
model.save('/home/Desktop/test_model')

model = gensim.models.Word2Vec.load('/home/kafein/Desktop/chatbot/test_model')

There is only one model file:

test_model

What am I doing wrong?

Mustang asked 24/4, 2017 at 12:37

Gensim's native .save() only writes parts of the model to separate files (like test_model.syn1neg.npy etc.) if those arrays are larger than a size threshold, which the sep_limit argument of .save() controls. When they're small, they get "pickled" into the single model save file.

So there's no problem/error here. If you start training a larger model with more words, you may see those other files re-appear. (When you do, be sure to keep them alongside the main test_model file, if copying/moving them elsewhere – all the files together are needed to re-load() the model.)
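As a rough illustration of that threshold rule, here is a minimal sketch in plain Python. It does not call gensim. The function name `expected_save_files`, the example sizes, and the `10 * 1024 ** 2` default are assumptions made for illustration, and the answer does not say exactly how gensim measures array size, so treat the numbers as hypothetical.

```python
# Assumed threshold for illustration only; gensim's save() takes a
# `sep_limit` argument, but the value used here is an assumption.
ASSUMED_SEP_LIMIT = 10 * 1024 ** 2


def expected_save_files(base_path, array_sizes, sep_limit=ASSUMED_SEP_LIMIT):
    """Return the file names a threshold-based save would be expected to write.

    array_sizes maps an attribute name (e.g. "syn0") to the size of its array.
    Arrays at or above sep_limit get their own .npy file next to the main
    file; smaller arrays stay pickled inside the main file.
    """
    files = [base_path]
    for name, size in sorted(array_sizes.items()):
        if size >= sep_limit:
            files.append("{}.{}.npy".format(base_path, name))
    return files


# Hypothetical small model: 500 words x 32 dimensions per array.
small = expected_save_files(
    "test_model", {"syn0": 500 * 32, "syn1neg": 500 * 32}
)

# Hypothetical large model: 1,000,000 words x 32 dimensions per array.
large = expected_save_files(
    "test_model", {"syn0": 1000000 * 32, "syn1neg": 1000000 * 32}
)

print(small)
print(large)
```

With the small sizes only the main file is listed, which matches what the question describes. With the large sizes the companion .npy files appear as well, and those are the files that have to travel with the main file.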

Bevin answered 15/5, 2017 at 22:47 Comment(1)
Can the model be made to load faster, or be loaded once (even if that is slow) and kept loaded so that it can be called whenever a query arrives? My model was saved as three separate files: model.trainables.syn1neg.npy (414.6 MB), model (30 MB) and model.wv.vectors.npy (414.6 MB). Is that correct? Also, can I load them once and reuse them later for querying my inputs? Loading the whole model takes about 5-10 minutes, and I run the queries in a loop that only runs at normal speed after the model has been used once. – Chloromycetin
