Named entity recognition in Spacy

Asked 11/1, 2018 at 5:48 Answered 31/12, 2020 at 19:14

Solved python named-entity-recognition spacy

I am trying to find named entities in a sentence, as below:

import spacy.lang.en
parser = spacy.lang.en.English()
ParsedSentence = parser(u"Alphabet is a new startup in China")
for Entity in  ParsedSentence.ents:  
    print (Entity.label, Entity.label_, ' '.join(t.orth_ for t in Entity))

I expect to get the result "Alphabet", "China", but I am getting an empty set instead. What am I doing wrong here?

Kallman answered 11/1, 2018 at 5:48 Comment(1)

NER is based on training input data. Therefore, for your example, it might not know from the limited context that "Alphabet" is a named entity. Try more examples. – Caramel 11/1, 2018 at 16:4

As per the spaCy documentation for named entity recognition, here is how to extract named entities:

import spacy
nlp = spacy.load('en')  # install the 'en' model first (python3 -m spacy download en); spaCy v3 dropped this shortcut, so use 'en_core_web_sm' there
doc = nlp("Alphabet is a new startup in China")
print('Name Entity: {0}'.format(doc.ents))

Result
Name Entity: (China,)

To make the model treat "Alphabet" as a noun, prepend "The" to it.

doc = nlp("The Alphabet is a new startup in China")
print('Name Entity: {0}'.format(doc.ents))

Name Entity: (Alphabet, China)
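For context on why the snippet in the question printed nothing: as far as I can tell, a bare `English()` object is a blank pipeline that only tokenizes, so there is no `ner` component to fill `doc.ents`. A minimal sketch that checks this, assuming spaCy v2 or v3 is installed (no model download needed):

```python
from spacy.lang.en import English

# A bare English() object only tokenizes; it ships no trained components.
nlp = English()
doc = nlp("Alphabet is a new startup in China")

# List the pipeline components, then the entities they produced.
print(nlp.pipe_names)  # [] : no tagger, parser, or ner
print(doc.ents)        # () : nothing can set entities
```

Loading a trained pipeline with `spacy.load(...)`, as above, is what adds the `ner` component.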

Egotist answered 11/1, 2018 at 7:10 Comment(1)

But if the input sentence was "I love biscuits, chocolate and bicycles.", shouldn't the PRODUCT entity be identified (for biscuits, chocolate and bicycles)? The doc suggests PRODUCT is for food, vehicles, etc. However, doc.ents doesn't identify any entity. – Quillen 3/8, 2018 at 15:43
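When the statistical model misses terms you already know, one option is to add them with spaCy's rule-based entity ruler. This is only a sketch, assuming the spaCy v3 API (`nlp.add_pipe("entity_ruler")`); the patterns and the `entities` helper are hypothetical and chosen purely for illustration. It runs on a blank pipeline, so it shows the rules alone rather than what a trained model would predict.

```python
import spacy

# Blank English pipeline: tokenizer only, no trained model required.
nlp = spacy.blank("en")

# Assumption: spaCy v3, where built-in components are added by name.
ruler = nlp.add_pipe("entity_ruler")

# Hypothetical patterns, for illustration only.
ruler.add_patterns([
    {"label": "ORG", "pattern": "Alphabet"},
    {"label": "GPE", "pattern": "China"},
    {"label": "PRODUCT", "pattern": [{"LOWER": "biscuits"}]},
])


def entities(text):
    """Return (text, label) pairs for every entity the ruler matched."""
    return [(ent.text, ent.label_) for ent in nlp(text).ents]


print(entities("Alphabet is a new startup in China"))
print(entities("I love biscuits, chocolate and bicycles."))
```

Only the listed terms are matched, so "chocolate" and "bicycles" stay unlabelled here. In a trained pipeline the ruler can be placed before or after `ner`, depending on which should take precedence.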

In spaCy version 3, Transformer models from Hugging Face are fine-tuned for the same operations that spaCy provided in previous versions (named entity recognition included), but with better results.

Transformers are currently (2020) the state of the art in natural language processing. Roughly, the field went from word representations (one-hot encoding -> word2vec -> GloVe | fastText), to recurrent architectures (recurrent neural networks, recursive neural networks, gated recurrent units, long short-term memory, bidirectional long short-term memory, etc.), and now to Transformers with attention (BERT, RoBERTa, XLNet, XLM, CTRL, ALBERT, T5, BART, GPT, GPT-2, GPT-3). This is just to give context for why you should consider Transformers; I know there is a lot I didn't mention, like Fuzz, knowledge graphs and so on.

Install the dependencies:

sudo apt install libncurses5

pip install spacy-transformers --pre -f https://download.pytorch.org/whl/torch_stable.html

pip install spacy-nightly # I'm using 3.0.0rc2

Download a model:

python -m spacy download en_core_web_trf # English Transformer pipeline, RoBERTa base

Here's a list of available models.

And then use it as you would normally do:

import spacy

text = 'Type something here which can be related to something, e.g Stack Over Flow organization'
nlp = spacy.load('en_core_web_trf')
document = nlp(text)
print(document.ents)
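Printing `document.ents` only shows the entity text. To also see each entity's label and character offsets, something like the sketch below may help. The `describe_entities` and `main` helpers are hypothetical names introduced here, and the sketch assumes `en_core_web_trf` was downloaded as described above; if it was not, it prints a hint instead of failing.

```python
import spacy


def describe_entities(doc):
    """Return (text, label, start_char, end_char) for each entity in doc."""
    return [(ent.text, ent.label_, ent.start_char, ent.end_char) for ent in doc.ents]


def main(model_name="en_core_web_trf"):
    try:
        nlp = spacy.load(model_name)
    except OSError:
        # spacy.load raises OSError when the pipeline package is not installed.
        print(f"Model {model_name!r} not found; run: python -m spacy download {model_name}")
        return
    doc = nlp("Alphabet is a new startup in China")
    for text, label, start, end in describe_entities(doc):
        print(f"{text!r} -> {label} (chars {start}-{end})")


if __name__ == "__main__":
    main()
```

Which entities and labels come back depends on the model and its version, so no output is shown here.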

References:

Learn about Transformers and Attention.

Read a summary of the different Transformer architectures.

Learn about the Transformer fine-tuning done by spaCy.

Diminuendo answered 31/12, 2020 at 19:14 Comment(0)
