Person name detection using spaCy in English text

I am using spaCy and trying to detect person names in text. For example:

text = 'Keras is a good package. Adam Smith uses a car of black colour. I hope Katrina is doing well in her job.'

The answer should look like this: Adam Smith and Katrina.

Can anyone recommend

Sphenic answered 6/7, 2018 at 15:59 Comment(2)
What have you attempted so far? And have you had a look at the docs for pipelines? - Heteropolar
I faced a similar task recently, and the other answers are right: spaCy's NER is a good starting point. If you are interested, here are a couple of repositories that may help with the next steps: github.com/philipperemy/name-dataset and github.com/leoli51/Names-Oracle - Particiaparticipant

spaCy's named entity recognizer gives each entity a label_ attribute, and person names get the label PERSON. You have several options for the model: small, medium, or large. The large model uses more resources to run.

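import spacy

# Load an English model once; en_core_web_sm is the small one
nlp = spacy.load('en_core_web_sm')

def find_persons(text):
    # Create Doc object
    doc = nlp(text)

    # Identify the persons
    persons = [ent.text for ent in doc.ents if ent.label_ == 'PERSON']

    # Return persons
    return persons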
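
To make the model choice and the label filter concrete, here is a minimal sketch. The three package names are spaCy's English web models; the `unique_persons` helper and the stub Doc are my own additions for illustration (the stub only mimics the `.ents`, `.text` and `.label_` attributes so the sketch runs without a model installed).

```python
from types import SimpleNamespace

# spaCy's English pipeline packages for the three sizes mentioned above.
MODEL_NAMES = {
    "small": "en_core_web_sm",
    "medium": "en_core_web_md",
    "large": "en_core_web_lg",
}


def unique_persons(doc):
    """Return PERSON entity texts from a Doc-like object, in order, without repeats."""
    seen = []
    for ent in doc.ents:
        if ent.label_ == "PERSON" and ent.text not in seen:
            seen.append(ent.text)
    return seen


# Hypothetical stand-in for a spaCy Doc; the labels here are hand-written,
# not the output of a real model.
stub_doc = SimpleNamespace(ents=[
    SimpleNamespace(text="Adam Smith", label_="PERSON"),
    SimpleNamespace(text="Keras", label_="ORG"),
    SimpleNamespace(text="Katrina", label_="PERSON"),
    SimpleNamespace(text="Adam Smith", label_="PERSON"),
])

persons = unique_persons(stub_doc)
print(persons)
```

With a real pipeline, the same helper would be called as unique_persons(spacy.load(MODEL_NAMES["small"])(text)), assuming that model package has been downloaded.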

Alternatively, try NLTK to find the proper nouns and then pattern-match them against valid names:

import nltk

tokenized_sent = nltk.word_tokenize(sentence)  # sentence: the input string
tagged_sent = nltk.pos_tag(tokenized_sent)  # list of (word, tag) pairs
The tagger marks nouns, pronouns, adjectives, verbs and so on. Some of the tags:

NNP - proper noun, singular
PRP - personal pronoun
VB - verb, base form
DT - determiner
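
The pattern-matching step could look something like the sketch below: join runs of consecutive NNP tokens into candidate names, then keep only those whose first word is in a list of known given names. Both helper names and the tiny name set are hypothetical, and the tagged list is hand-written in the shape nltk.pos_tag returns rather than real tagger output.

```python
def group_proper_nouns(tagged_tokens):
    """Join runs of consecutive NNP tokens into candidate names."""
    candidates, current = [], []
    for word, tag in tagged_tokens:
        if tag == "NNP":
            current.append(word)
        elif current:
            candidates.append(" ".join(current))
            current = []
    if current:
        candidates.append(" ".join(current))
    return candidates


def keep_known_names(candidates, known_first_names):
    """Keep candidates whose first word appears in a set of known given names."""
    return [c for c in candidates if c.split()[0] in known_first_names]


# Hand-written (word, tag) pairs, not real tagger output.
tagged = [
    ("Keras", "NNP"), ("is", "VBZ"), ("a", "DT"), ("good", "JJ"),
    ("package", "NN"), (".", "."),
    ("Adam", "NNP"), ("Smith", "NNP"), ("uses", "VBZ"), ("a", "DT"),
    ("car", "NN"), (".", "."),
    ("I", "PRP"), ("hope", "VBP"), ("Katrina", "NNP"), ("is", "VBZ"),
    ("doing", "VBG"), ("well", "RB"), (".", "."),
]

# Hypothetical stand-in for a real name list.
KNOWN_FIRST_NAMES = {"Adam", "Katrina"}

candidates = group_proper_nouns(tagged)
names = keep_known_names(candidates, KNOWN_FIRST_NAMES)
print(candidates)  # ['Keras', 'Adam Smith', 'Katrina']
print(names)       # ['Adam Smith', 'Katrina']
```

The name list does the real work here: without it, any capitalised proper noun such as Keras passes as a candidate.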
Spaniel answered 4/12, 2020 at 17:16 Comment(1)
It fails on this example: "Hi Mori, thank you for your response. Please get back to us if you have any other issues. Stay safe. - Syed Azeem" - Naughton

This is a typical Named Entity Recognition (NER) problem. spaCy ships pre-trained models for this, which should be reasonably accurate at detecting person names.

Take a look at this code sample.

According to spaCy's annotation scheme, names are marked as PERSON.

Rosalynrosalynd answered 9/7, 2018 at 10:57 Comment(1)
Thanks, Pradip. I suppose NLTK can also serve the purpose. I found a solution. I will post it soon. - Sphenic