Use tree2conlltags from nltk.chunk. Note that ne_chunk expects POS-tagged input, and pos_tag in turn expects a list of word tokens, so the sentence has to go through word_tokenize first.
from nltk import word_tokenize, pos_tag, ne_chunk
from nltk.chunk import tree2conlltags
sentence = "Mark and John are working at Google."
print(tree2conlltags(ne_chunk(pos_tag(word_tokenize(sentence)))))
"""[('Mark', 'NNP', 'B-PERSON'),
('and', 'CC', 'O'), ('John', 'NNP', 'B-PERSON'),
('are', 'VBP', 'O'), ('working', 'VBG', 'O'),
('at', 'IN', 'O'), ('Google', 'NNP', 'B-ORGANIZATION'),
('.', '.', 'O')] """
This will give you a list of tuples: [(token, pos_tag, named_entity_tag)].
If this list is not exactly what you want, it is certainly easier to parse the list you want out of this list than out of an nltk tree.
Code and details from this link; check it out for more information
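As a rough illustration of parsing that list (this is my own sketch, not code from the link), the IOB tags can be collapsed into whole entities. The helper name group_entities is hypothetical, and the sample data is simply the output shown above pasted in, so nothing here needs the NLTK models:

```python
def group_entities(conll_tags):
    """Collapse (token, pos_tag, iob_tag) triples into (entity_text, label) pairs.

    Hypothetical helper, not part of NLTK.
    """
    entities = []
    current_tokens, current_label = [], None
    for token, _pos, iob in conll_tags:
        # 'B-PERSON' -> ('B', '-', 'PERSON'); 'O' -> ('O', '', '')
        prefix, _, label = iob.partition("-")
        if prefix == "I" and current_tokens and label == current_label:
            # Continuation of the entity that is currently open
            current_tokens.append(token)
            continue
        # Any other tag closes the open entity, if there is one
        if current_tokens:
            entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
        # 'B-' starts a new entity; a stray 'I-' is treated the same way
        if prefix in ("B", "I"):
            current_tokens, current_label = [token], label
    if current_tokens:
        entities.append((" ".join(current_tokens), current_label))
    return entities


# The tree2conlltags output from the example above
sample = [('Mark', 'NNP', 'B-PERSON'), ('and', 'CC', 'O'),
          ('John', 'NNP', 'B-PERSON'), ('are', 'VBP', 'O'),
          ('working', 'VBG', 'O'), ('at', 'IN', 'O'),
          ('Google', 'NNP', 'B-ORGANIZATION'), ('.', '.', 'O')]

print(group_entities(sample))
# [('Mark', 'PERSON'), ('John', 'PERSON'), ('Google', 'ORGANIZATION')]
```

Unlike a per-token filter, this keeps multi-token names together, so a B-PERSON token followed by an I-PERSON token comes back as one entry.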
You can also continue by only extracting the words, with the following function:
def wordextractor(tuple1):
    # each item is a (token, pos_tag, named_entity_tag) triple
    c = []
    for word, pos, ne_tag in tuple1:
        # keep the words whose entity tag is B-PERSON or I-PERSON
        if ne_tag in ('B-PERSON', 'I-PERSON'):
            c.append(word)
    return c

print(wordextractor(tree2conlltags(ne_chunk(pos_tag(word_tokenize(sentence))))))
Edit: Added output docstring
Edit: Added output only for B-PERSON
ne_chunk() return instead? What exactly are you stuck at? – Dunkin
nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize("Welcome to Barbados, Tobdy!"))) – Fulminant