How do I fix ValueError when doing nlp.add_pipe(LanguageDetector(), name='language_detector', last=True) with spacy 3
Every time I run the following code, which I found on Kaggle, I get a ValueError. I believe this is caused by the new v3 release of spaCy:

import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector

nlp = en_core_sci_lg.load(disable=["tagger", "ner"])
nlp.max_length = 2000000
nlp.add_pipe(LanguageDetector(), name='language_detector', last=True)

ValueError: [E966] nlp.add_pipe now takes the string name of the registered component factory, not a callable component. Expected string, but got <spacy_langdetect.spacy_langdetect.LanguageDetector object at 0x00000216BB4C8D30> (name: 'language_detector').

  • If you created your component with nlp.create_pipe('name'): remove nlp.create_pipe and call nlp.add_pipe('name') instead.

  • If you passed in a component like TextCategorizer(): call nlp.add_pipe with the string name instead, e.g. nlp.add_pipe('textcat').

  • If you're using a custom component: Add the decorator @Language.component (for function components) or @Language.factory (for class components / factories) to your custom component and assign it a name, e.g. @Language.component('your_name'). You can then run nlp.add_pipe('your_name') to add it to the pipeline.

I have installed these versions:

python_version : 3.8.5
spacy.version  : '3.0.3'
scispacy.version  :  '0.4.0'
en_core_sci_lg.version  :  '0.4.0'
Urdar answered 2/3, 2021 at 4:48 Comment(1)
Does this still break if you don't disable the tagger and the ner? - Shortsighted
R
6

The way add_pipe works changed in v3: components have to be registered first, and can then be added to a pipeline by their string name. In this case you have to wrap the LanguageDetector in a factory function, like so:

import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector

from spacy.language import Language

def create_lang_detector(nlp, name):
    return LanguageDetector()

Language.factory("language_detector", func=create_lang_detector)

nlp = en_core_sci_lg.load(disable=["tagger", "ner"])
nlp.max_length = 2000000
nlp.add_pipe('language_detector', last=True)

You can read more about how this works in the spaCy docs.
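
If you want to try the registration pattern without installing scispacy or spacy_langdetect, here is a minimal sketch on a blank English pipeline. It assumes spaCy v3 is installed. TokenCounter and the factory name token_counter_demo are hypothetical stand-ins made up for illustration; they are not part of spaCy or spacy_langdetect.

```python
import spacy
from spacy.language import Language


class TokenCounter:
    """Hypothetical stand-in for a third-party component such as LanguageDetector."""

    def __init__(self):
        self.docs_seen = 0

    def __call__(self, doc):
        # A pipeline component receives a Doc and must return it.
        self.docs_seen += 1
        return doc


# Register a factory under a string name; spaCy calls it with (nlp, name).
@Language.factory("token_counter_demo")
def create_token_counter(nlp, name):
    return TokenCounter()


nlp = spacy.blank("en")  # blank pipeline, so no model download is needed
component = nlp.add_pipe("token_counter_demo", last=True)
doc = nlp("This is a short test.")
print(nlp.pipe_names)
```

In v3, add_pipe returns the component instance it created, so you can keep a handle on it if you need to inspect its state later.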

Retral answered 2/3, 2021 at 5:4 Comment(3)
Glad to hear it helped. If that solved your problem, you can accept my answer by clicking the big gray checkmark to the left of it. - Retral
This gives an error AttributeError: type object 'Language' has no attribute 'factory' - Inhibitory
Sounds like maybe you're using spaCy v2, Amanda? - Retral
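
A quick way to check the point raised in the comments is to look at the installed major version before calling Language.factory. This is only a sketch: the helper names major_version and supports_string_add_pipe are hypothetical, and the "3 or later" cutoff is taken from this thread rather than from a full changelog review.

```python
def major_version(version: str) -> int:
    """Return the leading integer of a dotted version string such as '3.0.3'."""
    return int(version.split(".")[0])


def supports_string_add_pipe(version: str) -> bool:
    # Assumption from this thread: the registered-factory API arrived in v3.
    return major_version(version) >= 3


try:
    import spacy

    print(spacy.__version__, supports_string_add_pipe(spacy.__version__))
except ImportError:
    print("spaCy is not installed in this environment")
```

If this reports a 2.x version, the v2 call from the question (passing the component object directly to add_pipe) is the one that applies.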
G
11

You can also use the @Language.factory decorator to achieve the same result with less code:

import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector
from spacy.language import Language

@Language.factory('language_detector')
def language_detector(nlp, name):
    return LanguageDetector()

nlp = en_core_sci_lg.load(disable=["tagger", "ner"])
nlp.max_length = 2000000
nlp.add_pipe('language_detector', last=True)
Glovsky answered 25/3, 2021 at 13:41 Comment(0)