AttributeError: 'Word2Vec' object has no attribute 'most_similar' (Word2Vec)

Asked 6/8, 2021 at 5:41 Answered 6/8, 2021 at 16:58

Solved python nlp gensim word2vec doc2vec

I am using Word2Vec with a wiki-trained model to get the most similar words. This worked when I ran it before, but now it gives me this error even after rerunning the whole program. I tried removing return_path=True, but I'm still getting the same error.

print(api.load('glove-wiki-gigaword-50', return_path=True))
model.most_similar("glass")

#ERROR:

/Users/me/gensim-data/glove-wiki-gigaword-50/glove-wiki-gigaword-50.gz
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-153-3bf32168d154> in <module>
      1 print(api.load('glove-wiki-gigaword-50', return_path=True))
----> 2 model.most_similar("glass") 

AttributeError: 'Word2Vec' object has no attribute 'most_similar'

#MODEL this is the model I used

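    # `info` is not defined anywhere in this post; presumably it comes from api.info()
    for model_name, model_data in sorted(info['models'].items()):
        print(
            '%s (%d records): %s' % (
                model_name,
                model_data.get('num_records', -1),
                model_data['description'][:40] + '...',
            )
        )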

Edit: here is my gensim download & output

!python -m pip install -U gensim

OUTPUT:

Requirement already satisfied: gensim in ./opt/anaconda3/lib/python3.8/site-packages (4.0.1)

Requirement already satisfied: numpy>=1.11.3 in ./opt/anaconda3/lib/python3.8/site-packages (from gensim) (1.20.1)

Requirement already satisfied: smart-open>=1.8.1 in ./opt/anaconda3/lib/python3.8/site-packages (from gensim) (5.1.0)

Requirement already satisfied: scipy>=0.18.1 in ./opt/anaconda3/lib/python3.8/site-packages (from gensim) (1.6.2)

Sexpartite answered 6/8, 2021 at 5:41 Comment(4)

Don't you mean just model.similar? – Menashem 6/8, 2021 at 5:55

@ewong it gives me this: AttributeError: 'Word2Vec' object has no attribute 'similar' – Sexpartite 6/8, 2021 at 6:22

Are there more lines to your code, or is that all? Where is model defined? – Menashem 6/8, 2021 at 7:51

@ewong there is this

    for model_name, model_data in sorted(info['models'].items()):
        print(
            '%s (%d records): %s' % (
                model_name,
                model_data.get('num_records', -1),
                model_data['description'][:40] + '...',
            )
        )

– Sexpartite 6/8, 2021 at 13:20

You are probably looking for <MODEL>.wv.most_similar, so please try:

model.wv.most_similar("glass")

Poisonous answered 6/8, 2021 at 13:0 Comment(8)

Hi! I tried this, but it gives me AttributeError: 'Word2Vec' object has no attribute 'vw'. I updated my post with the model I used – Sexpartite 6/8, 2021 at 13:24

Right. Interesting. Can you please post the version of the gensim library you are using too (as there were changes on the way)? – Poisonous 6/8, 2021 at 17:1

I used import gensim.models.word2vec as w2v and import gensim.downloader as api – Sexpartite 6/8, 2021 at 17:22

This is not what I asked for. Can you please run pip show gensim and post the output? – Poisonous 6/8, 2021 at 17:28

Hello, I just added them to my post at the end @Poisonous – Sexpartite 6/8, 2021 at 17:34

I just spotted it - it was a typo - wv instead of vw. Please check again! – Poisonous 6/8, 2021 at 18:45

Happy to hear! I would appreciate accepting the answer (gray tick mark on the left) and upvoting it. – Poisonous 6/8, 2021 at 20:19

Actually, I realized that answer pointed to another model, not the wiki one. However, when I took off return_path=True it worked! Thank you so much though – Sexpartite 7/8, 2021 at 20:3

Your shown code...

print(api.load('glove-wiki-gigaword-50', return_path=True))
model.most_similar("glass")

...doesn't assign anything into model. (Was it assigned earlier?)

And, using return_path=True there means the api.load() will only return a string path to the datafile. That'd only be interesting if you were going to use that string to then do your own loading of the data into a model.

That api.load() call without return_path=True likely returns an instance of KeyedVectors, which is a set of vectors. That's different from a full Word2Vec model, but would still support a .most_similar() method. However, if you're just print()ing that returned path, or returned model, it's not going to be in the model variable for your later .most_similar() operation.

So you may want:

kv_model = api.load('glove-wiki-gigaword-50')
similars = kv_model.most_similar('glass')
print(similars)
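To make the three cases above concrete (a path string, a plain set of vectors, a full model), here is a minimal sketch of a guard you could run on whatever ended up in your variable. The helper name resolve_vectors is hypothetical and not part of gensim; it relies only on duck typing, on the assumption that the vectors object exposes .most_similar() and that a full model keeps its vectors in .wv.

```python
import os


def resolve_vectors(obj):
    """Return the object that actually offers .most_similar(), or raise TypeError.

    Hypothetical helper for illustration only; it is not a gensim function.
    """
    # Case 1: a file path, which is what return_path=True hands back.
    if isinstance(obj, (str, bytes, os.PathLike)):
        raise TypeError(
            "got a file path, not loaded vectors; "
            "load the data without return_path=True"
        )
    # Case 2: a full model that keeps its word vectors in .wv.
    wv = getattr(obj, "wv", None)
    if wv is not None and hasattr(wv, "most_similar"):
        return wv
    # Case 3: the object is already a set of vectors.
    if hasattr(obj, "most_similar"):
        return obj
    raise TypeError(f"{type(obj).__name__} has no usable most_similar()")
```

With that in place, something like resolve_vectors(kv_model).most_similar('glass') should either work or fail with a message that says which of the three cases you hit.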

(Personally, I don't like the opaque magic, & running of new downloaded code, that api.load() does. I think it's a better habit to download the raw data files yourself, from a known source, so that you know what files have arrived, to which directories, on your own machine. Then use a dataset-specific load method to load that data, so that you learn what library methods work with which kinds of files.)
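If you do look at the raw file yourself, a GloVe text file is just one word per line followed by its vector components, separated by spaces. The sketch below uses only the standard library to parse a few such lines and rank neighbours by cosine similarity, the measure gensim documents for most_similar(). The function names and the three sample vectors are made up for illustration; this is not gensim's implementation, and for real data gensim's own loaders are the practical choice.

```python
import math


def parse_glove_lines(lines):
    """Parse 'word v1 v2 ...' lines into a dict of word -> list of floats."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(vectors, word, topn=3):
    """Return up to topn (word, similarity) pairs, most similar first."""
    target = vectors[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Made-up 3-dimensional vectors, standing in for lines of a real file.
sample = ["glass 1.0 0.0 0.2", "cup 0.9 0.1 0.3", "river 0.0 1.0 0.0"]
vectors = parse_glove_lines(sample)
print(nearest(vectors, "glass", topn=2))
```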

If your model variable does in fact include a full Word2Vec model, from some unshown other code, then it will also contain a set of vectors in its .wv (for word-vectors) property:

similars = model.wv.most_similar('glass')
print(similars)

Grimace answered 6/8, 2021 at 16:58 Comment(6)

This prints out similar words based on the training of my data. However, I would like to get the words that are trained by 'glove-wiki-gigaword-50' – Sexpartite 6/8, 2021 at 17:2

Have you tried assigning the results of your api.load() call into a variable instead of printing it? (You could assign it into model if you want to discard the Word2Vec model that's already there. Or you could assign it into a new variable like kv_model, to reflect that it's just a KeyedVectors.) – Grimace 6/8, 2021 at 17:6

I tried it and it gave me AttributeError: 'str' object has no attribute 'most_similar' – Sexpartite 6/8, 2021 at 17:39

What code did you try that gave that error? (That sounds like you assigned a string into the variable, not the results of api.load().) – Grimace 6/8, 2021 at 17:42

This is what I did:

    kv_model= (api.load('glove-wiki-gigaword-50', return_path=True))
    (kv_model.most_similar("glass"))

– Sexpartite 6/8, 2021 at 18:45

Aha, try it without the return_path=True argument. If you include that, it means you're asking for the string path to the dataset file, rather than the loaded model itself. I'll also add a note about this to my main answer. – Grimace 6/8, 2021 at 19:54
