How to check whether a word is an adjective or a verb using Python NLTK?
W

2

11

I have a list of words like amazing, interesting, love, great, nice, and I want to check whether each word is an adjective or a verb; for example, "love" is a verb and "nice" is an adjective. How can I do this using Python or NLTK?

Wendling answered 17/2, 2016 at 16:47 Comment(3)
Hmm... I don't think these categories have to be mutually exclusive like this. "To love" is the infinitive, but you can love something (verb), or be in love (now it's an adverb), or have a love bracelet or love affair (now it's an adjective). - Fouts
Without context, the POS of most non-noun words is not conclusive. - Raynaraynah
Without context, the closest you can get is to use the first POS from WordNet: from nltk.corpus import wordnet as wn; wn.synsets('amazing')[0].pos(). Alternatively: import nltk; nltk.pos_tag(['amazing']). But as said in the previous comments, the outputs will not be conclusive. - Raynaraynah
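To illustrate the point about context made in these comments, here is a minimal sketch that tags a whole sentence and collapses the Penn Treebank tags returned by nltk.pos_tag into coarse classes. The helper names coarse_pos and tag_in_context are hypothetical (not part of NLTK), and the sketch assumes NLTK and its tagger model are installed.

```python
def coarse_pos(penn_tag):
    """Collapse a Penn Treebank tag (as returned by nltk.pos_tag) into a coarse class."""
    if penn_tag.startswith('JJ'):   # JJ, JJR, JJS
        return 'adjective'
    if penn_tag.startswith('VB'):   # VB, VBD, VBG, VBN, VBP, VBZ
        return 'verb'
    if penn_tag.startswith('NN'):   # NN, NNS, NNP, NNPS
        return 'noun'
    if penn_tag.startswith('RB'):   # RB, RBR, RBS
        return 'adverb'
    return 'other'


def tag_in_context(tokens):
    """Tag an already-tokenized sentence (hypothetical helper).

    Assumes NLTK and its tagger model are installed; the import is done
    lazily so that coarse_pos can be used without NLTK.
    """
    import nltk
    return [(word, coarse_pos(tag)) for word, tag in nltk.pos_tag(tokens)]
```

Calling tag_in_context(['I', 'love', 'this', 'nice', 'book']) labels each token according to its role in that sentence, which is usually more informative than tagging a bare word.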
A
16

Without any context, about the only way to guess a word's part of speech is to look it up in WordNet, but it won't be 100% reliable since, for example, "love" can play different roles in a sentence.

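from nltk.corpus import wordnet as wn  # needs the corpus: nltk.download('wordnet')

words = ['amazing', 'interesting', 'love', 'great', 'nice']

for w in words:
    synsets = wn.synsets(w)
    # Tokens unknown to WordNet give an empty list, so guard before indexing.
    tmp = synsets[0].pos() if synsets else None
    print(w, ":", tmp)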

Will output:

amazing : v
interesting : v
love : n
great : n
nice : n
Anchises answered 20/2, 2016 at 7:24 Comment(5)
Also, since the question is tagged parsing, I am assuming there might be some cases where the token is not a word at all (I just had this issue myself). In that case, make sure you check the output of wn.synsets(w) before you try to index into the list. - Purificator
If I put in the word 'urgent' I get: s. What does 's' mean? - Nard
I think it's ADJECTIVE SATELLITE (wordnet.princeton.edu/documentation/wndb5wn) - Nard
Definitely some false positives here, too: 'interesting' is not a verb, and 'run' is a verb yet appears as a noun. - Geologize
@Geologize "He is interesting me to start my own business." Sure, a queer way of phrasing it, but "to interest someone" is definitely also a verb. - Wasteful
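For reference, the single-letter codes returned by pos() can be spelled out with a small lookup table. This is only a sketch: the names WORDNET_POS_NAMES and describe_pos are made up for illustration, while the codes themselves ('n', 'v', 'a', 's', 'r') are the ones WordNet uses.

```python
# WordNet part-of-speech codes, as returned by Synset.pos()
WORDNET_POS_NAMES = {
    'n': 'noun',
    'v': 'verb',
    'a': 'adjective',
    's': 'adjective satellite',
    'r': 'adverb',
}


def describe_pos(code):
    """Return a readable name for a WordNet POS code (hypothetical helper)."""
    return WORDNET_POS_NAMES.get(code, 'unknown')
```

So the 's' reported for 'urgent' maps to 'adjective satellite', which for most practical purposes can be treated as an adjective.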
N
3

An update to @Alex's solution:

  1. Only include synsets whose name matches the word w itself (rather than just taking the first synset)
  2. List all POS tags that the word w receives

Code:

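from nltk.corpus import wordnet as wn  # needs the corpus: nltk.download('wordnet')

words = ['amazing', 'interesting', 'love', 'great', 'nice']
pos_all = dict()
for w in words:
    pos_l = set()
    for tmp in wn.synsets(w):
        # Synset names look like 'love.n.01'; keep only synsets named after w itself.
        if tmp.name().split('.')[0] == w:
            pos_l.add(tmp.pos())
    pos_all[w] = pos_l
print(pos_all)

Output (from the original Python 2 run; Python 3 prints sets as {'a'} and the ordering may differ):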

{'interesting': set([u'a']), 
 'amazing': set([u's']), 
 'love': set([u'v', u'n']), 
 'great': set([u's', u'n']),
 'nice': set([u'a', u's', u'n'])}
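To get from these POS sets back to the original adjective-or-verb question, one possible sketch is below. It assumes that both 'a' and 's' (adjective satellite) should count as adjectives; the function names adjective_or_verb and wordnet_pos_tags are hypothetical, and the WordNet lookup simply repeats the filtering logic used in this answer.

```python
ADJECTIVE_TAGS = {'a', 's'}  # assumption: treat adjective satellites as adjectives
VERB_TAGS = {'v'}


def adjective_or_verb(pos_tags):
    """Classify a collection of WordNet POS codes as adjective, verb, both or neither."""
    tags = set(pos_tags)
    is_adjective = bool(tags & ADJECTIVE_TAGS)
    is_verb = bool(tags & VERB_TAGS)
    if is_adjective and is_verb:
        return 'both'
    if is_adjective:
        return 'adjective'
    if is_verb:
        return 'verb'
    return 'neither'


def wordnet_pos_tags(word):
    """Collect POS codes of the synsets named after the word itself.

    Needs NLTK and the WordNet corpus; imported lazily so that
    adjective_or_verb can be used on its own.
    """
    from nltk.corpus import wordnet as wn
    return {s.pos() for s in wn.synsets(word) if s.name().split('.')[0] == word}
```

With the sets shown above, adjective_or_verb({'v', 'n'}) gives 'verb' for love and adjective_or_verb({'a', 's', 'n'}) gives 'adjective' for nice. Words that also have a noun sense are still ambiguous without context, so treat the result as a guess.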
Nataline answered 24/9, 2018 at 18:26 Comment(0)
