Displacy Custom Colors for custom entities using Displacy

I have a list of words, noun-verb phrases and I want to:

  • search for dependency patterns and words in a corpus of text
  • identify the paragraph each match appears in
  • extract the paragraph
  • highlight the matched words in the paragraph
  • create a snip/JPEG of the paragraph with the matched words highlighted
  • save the image in an Excel file.

The MWE below pertains to highlighting the matched words and displaying them using displacy. I have mentioned the rest of my task just to provide the context. The output isn't coloring the custom entities with custom colors.

import spacy
from spacy.matcher import PhraseMatcher
from spacy.tokens import Span

good = ['bacon', 'chicken', 'lamb','hot dog']
bad = [ 'apple', 'carrot']

nlp = spacy.load('en_core_web_sm')  
patterns1 = [nlp(good) for good in good]
patterns2 = [nlp(bad) for bad in bad]
matcher = PhraseMatcher(nlp.vocab)
matcher.add('good', None, *patterns1)
matcher.add('bad', None, *patterns2)

doc = nlp("I like bacon and chicken but unfortunately I only had an apple and a carrot in the fridge")
matches = matcher(doc)

for match_id, start, end in matches:
    
    span = Span(doc, start, end, label=match_id)
    doc.ents = list(doc.ents) + [span]  # add span to doc.ents

print([(ent.text, ent.label_) for ent in doc.ents])  

The code above produces this output:

[('bacon', 'good'), ('chicken', 'good'), ('apple', 'bad'), ('carrot', 'bad')]

But when I try to custom color the entities, it doesn't seem to be working.

from spacy import displacy
colors = {'good': "#85C1E9", "bad": "#ff6961"}
options = {"ents": ['good', 'bad'], "colors": colors}

displacy.serve(doc, style='ent',options=options)

This is the output I get:

[Screenshot: displaCy output in which the custom entities are not colored]

Radiculitis answered 12/8, 2021 at 23:18 Comment(3)
Please note your spaCy version. - Broughton
@Broughton My spaCy version is 2.3.5. - Radiculitis
Can you upgrade spaCy? That's probably the easiest option. - Broughton

I just copy/pasted your code and it works fine here. I'm using spaCy v3.1.1.

[Screenshot: displaCy output with the entities shown in the custom colors]

What does the HTML output source look like?


I was able to reproduce your issue on spaCy 2.3.5, and I was able to fix it by making the labels upper-case (GOOD and BAD). I can't find a bug report about this, but since the models normally only use upper-case labels, I guess this is an issue with older versions.
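As a minimal sketch of that workaround, the snippet below upper-cases the labels both when registering the patterns and when building the displaCy options. The helper names `upper_case_options` and `render_matches` are hypothetical, and `spacy.blank("en")` is used here only as a lightweight stand-in for the `en_core_web_sm` model from the question; I have only checked the upper-case fix itself against 2.3.5, so treat the rest as an assumption.

```python
def upper_case_options(colors):
    """Build displaCy 'ent' options with every label upper-cased."""
    upper = {label.upper(): color for label, color in colors.items()}
    return {"ents": list(upper), "colors": upper}


def render_matches(text, phrases_by_label, colors):
    """Tag phrase matches with upper-case labels and return displaCy HTML."""
    # Imported here so the helper above stays usable without spaCy installed.
    import spacy
    from spacy import displacy
    from spacy.matcher import PhraseMatcher
    from spacy.tokens import Span
    from spacy.util import filter_spans

    # Assumption: a blank English pipeline (tokenizer only) instead of
    # the en_core_web_sm model used in the question.
    nlp = spacy.blank("en")
    matcher = PhraseMatcher(nlp.vocab)
    for label, phrases in phrases_by_label.items():
        # Register the patterns under the upper-cased label, e.g. 'GOOD'.
        matcher.add(label.upper(), [nlp.make_doc(p) for p in phrases])

    doc = nlp(text)
    spans = [Span(doc, start, end, label=match_id)
             for match_id, start, end in matcher(doc)]
    # filter_spans drops overlapping spans, which doc.ents does not allow.
    doc.ents = filter_spans(list(doc.ents) + spans)
    return displacy.render(doc, style="ent",
                           options=upper_case_options(colors), jupyter=False)
```

Calling `render_matches(text, {"good": [...], "bad": [...]}, {"good": "#85C1E9", "bad": "#ff6961"})` should return an HTML string that you can write to a file and open in a browser.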

Broughton answered 13/8, 2021 at 3:55 Comment(4)
How can I see the HTML output source? I'm sorry, I don't actually know what that means. - Radiculitis
You can right click and select "Inspect". In Chrome it might be F12 or Ctrl+Shift+I. - Broughton
Amazing! Thanks for your help! - Radiculitis
Googlers, the answer is: store some colours, e.g. the default colours from spaCy, as colors = {'your_class1': "#85C1E9", "your_class2": "#ff6961", ... }, set options = {"ents": list(colors), "colors": colors}, then run displacy.render(doc, style='ent', options=options). - Agler

I am sharing my own, quite customized, example, which uses spaCy to visualize FrameNet annotations:

import nltk
nltk.download('framenet_v17')
from nltk.corpus import framenet as fn
from spacy import displacy
import matplotlib
import matplotlib.pyplot as plt

FRAME_NAME = "Expectation"  # choose your own from fn.frames()!
FRAME_ELEMENTS = [e.name for _, e in fn.frame_by_name(FRAME_NAME)['FE'].items() if e['coreType'] == 'Core']
print(f"Frame={FRAME_NAME},CoreElements={'+'.join(FRAME_ELEMENTS)}")

COLORS = plt.cycler("color", plt.cm.Pastel2.colors)  # choose your colors!
COLORS = [matplotlib.colors.to_hex(c['color']) for c in COLORS]  # convert to hex; RGB tuples don't seem to work
COLORS = dict(zip([FRAME_NAME] + FRAME_ELEMENTS, COLORS))

for s in fn.exemplars(frame=FRAME_NAME):

    span_labels = [dict(s)['Target'][0]+(FRAME_NAME,)]+dict(s)['FE'][0]
    span_labels = [{"start":t[0],"end":t[1],"label":t[2]} for t in span_labels]

    args = {
        "text": s['text'],
        "ents": span_labels,
    }

    displacy.render(args,style="ent",manual=True,options={"colors":COLORS})

[Screenshot: displaCy rendering of the annotated exemplar sentences]
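The same manual mode can also highlight plain word lists without FrameNet. The sketch below is only an illustration under my own assumptions: `manual_ents` and `render_manual` are hypothetical helper names, the phrases are found with a simple case-insensitive regex instead of a spaCy matcher, and the labels are upper-cased to stay on the safe side with older spaCy versions.

```python
import re


def manual_ents(text, phrases_by_label):
    """Return a displaCy manual-mode dict with character offsets per phrase."""
    ents = []
    for label, phrases in phrases_by_label.items():
        for phrase in phrases:
            # Whole-word, case-insensitive search for the literal phrase.
            pattern = r"\b" + re.escape(phrase) + r"\b"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                ents.append({"start": m.start(), "end": m.end(),
                             "label": label.upper()})
    # Keep the spans in reading order.
    ents.sort(key=lambda e: (e["start"], e["end"]))
    return {"text": text, "ents": ents, "title": None}


def render_manual(text, phrases_by_label, colors):
    """Render the highlighted text as HTML with custom colors."""
    # Imported here so manual_ents can be used without spaCy installed.
    from spacy import displacy

    options = {"colors": {k.upper(): v for k, v in colors.items()}}
    return displacy.render(manual_ents(text, phrases_by_label), style="ent",
                           manual=True, options=options, jupyter=False)
```

Because the offsets are character positions, overlapping phrases are not resolved here; if your word lists can overlap, you would need to filter them first.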

Specialist answered 13/2, 2023 at 22:15 Comment(0)