I have a list of words like amazing, interesting, love, great, nice, and I want to check whether each word is an adjective or a verb; for example, "love" is a verb and "nice" is an adjective. How can I do this using Python or NLTK?
How to check whether a word is an adjective or a verb using Python NLTK?
Without any context, the only way to guess a word's part of speech is to look it up in WordNet, but this won't be 100% reliable since, for example, "love" can have different roles in a sentence.
from nltk.corpus import wordnet as wn

words = ['amazing', 'interesting', 'love', 'great', 'nice']
for w in words:
    # Take the part of speech of the first synset WordNet returns for the word
    tmp = wn.synsets(w)[0].pos()
    print(w, ":", tmp)
Will output:
amazing : v
interesting : v
love : n
great : n
nice : n
Also, since the question is tagged parsing, I am assuming there might be some cases where the token is not a word at all (just had this issue myself). In that case, make sure you check the output of wn.synsets(w) before you try to index into the list. – Purificator
If I put the word 'urgent' I get: s. What does 's' mean? – Nard
I think it's ADJECTIVE SATELLITE (wordnet.princeton.edu/documentation/wndb5wn) – Nard
Definitely some false positives here, too - 'interesting' is not a verb, 'run' is a verb, yet appears as a noun. – Geologize
@Geologize "He is interesting me to start my own business." Sure a queer way of phrasing it, but "to interest someone" is definitely also a verb. – Wasteful
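The two points raised in the comments above (an empty result from wn.synsets(w), and the 's' tag) could be handled roughly like this. It is only a sketch: the names POS_NAMES, first_pos and wordnet_first_pos are made up for illustration, and folding 's' (adjective satellite) into "adjective" is an assumption about what the asker wants. The lookup function is passed in as an argument so the logic can be tried without WordNet data installed.

```python
# WordNet POS codes; 's' (adjective satellite) is treated as an adjective here.
# This mapping is an illustrative choice, not part of NLTK.
POS_NAMES = {"n": "noun", "v": "verb", "a": "adjective", "s": "adjective", "r": "adverb"}


def first_pos(word, synsets):
    """Return a readable POS for the first synset of `word`, or None.

    `synsets` is any callable that returns a list of objects with a
    .pos() method, for example nltk.corpus.wordnet.synsets.
    """
    found = synsets(word)
    if not found:  # token unknown to WordNet: nothing to index into
        return None
    return POS_NAMES.get(found[0].pos())


def wordnet_first_pos(word):
    """Same lookup against the real WordNet corpus (needs nltk and its wordnet data)."""
    from nltk.corpus import wordnet as wn
    return first_pos(word, wn.synsets)
```

Calling wordnet_first_pos on a token that WordNet does not know returns None instead of raising an IndexError.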
An update to @Alex's solution:
- Only include synsets whose name matches the word w (rather than taking just the first synset)
- List all POS tags that the word w gets
Code:
from nltk.corpus import wordnet as wn

words = ['amazing', 'interesting', 'love', 'great', 'nice']
pos_all = dict()
for w in words:
    pos_l = set()
    for tmp in wn.synsets(w):
        # Keep only synsets named after the word itself, e.g. 'love.n.01'
        if tmp.name().split('.')[0] == w:
            pos_l.add(tmp.pos())
    pos_all[w] = pos_l
print(pos_all)
Output (reformatted for readability; the order of elements inside each set may vary):
{'amazing': {'s'},
 'interesting': {'a'},
 'love': {'n', 'v'},
 'great': {'n', 's'},
 'nice': {'a', 'n', 's'}}
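To get from these tag sets back to the original question (adjective or verb?), one possible follow-up step is sketched below. The function name adjective_or_verb and the decision to count both 'a' and 's' as adjective tags are assumptions for illustration; the pos_all values are copied from the output above.

```python
ADJECTIVE_TAGS = {"a", "s"}  # adjective and adjective satellite
VERB_TAGS = {"v"}


def adjective_or_verb(pos_tags):
    """Classify a set of WordNet POS tags as 'adjective', 'verb', 'both' or 'neither'."""
    is_adj = bool(pos_tags & ADJECTIVE_TAGS)
    is_verb = bool(pos_tags & VERB_TAGS)
    if is_adj and is_verb:
        return "both"
    if is_adj:
        return "adjective"
    if is_verb:
        return "verb"
    return "neither"


# Tag sets taken from the output shown above
pos_all = {
    "interesting": {"a"},
    "amazing": {"s"},
    "love": {"v", "n"},
    "great": {"s", "n"},
    "nice": {"a", "s", "n"},
}

for word, tags in pos_all.items():
    print(word, ":", adjective_or_verb(tags))
```

Noun tags are simply ignored here, so "love" comes out as a verb even though WordNet also lists it as a noun.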
from nltk.corpus import wordnet as wn; wn.synsets('amazing')[0].pos()
or
import nltk; nltk.pos_tag(['amazing'])
But as said in the previous comments, the outputs will not be conclusive. – Raynaraynah
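For the nltk.pos_tag route, the tagger returns Penn Treebank tags (JJ, VB, NN, ...) rather than WordNet's single letters, so a small mapping step is needed. The sketch below assumes that tags starting with JJ count as adjectives and tags starting with VB count as verbs; coarse_pos and tag_words are hypothetical helper names. Tagging single words without a sentence around them stays a guess, as the comment says.

```python
def coarse_pos(penn_tag):
    """Collapse a Penn Treebank tag into 'adjective', 'verb' or 'other'."""
    if penn_tag.startswith("JJ"):  # JJ, JJR, JJS
        return "adjective"
    if penn_tag.startswith("VB"):  # VB, VBD, VBG, VBN, VBP, VBZ
        return "verb"
    return "other"


def tag_words(words):
    """Tag each word on its own, with no context (needs nltk and a downloaded tagger model)."""
    import nltk
    # nltk.pos_tag takes a list of tokens and returns (token, tag) pairs
    return {w: coarse_pos(nltk.pos_tag([w])[0][1]) for w in words}
```

For example, tag_words(['amazing', 'love']) returns a dict mapping each word to one of the three coarse labels.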