Disabling part of the nlp pipeline

I am running spaCy v2.x on a Windows box with Python 3. I do not have admin privileges, so I have to load the pipeline as:

nlp = en_core_web_sm.load()

When I run my same script on a *nix box, I can load the pipeline as:

nlp = spacy.load('en', disable = ['ner', 'tagger', 'parser', 'textcat'])

All I am doing is tokenizing, so I do not need the entire pipeline. On the Windows box, if I load the pipeline like:

nlp = en_core_web_sm.load(disable = ['ner', 'tagger', 'parser', 'textcat'])

Does that actually disable the components?

spaCy information on the nlp pipeline

Sverre answered 20/12, 2018 at 14:28 Comment(0)

You can check the current pipeline components by

print(nlp.pipe_names)

If you are not convinced by the output, you can check manually by trying to use a disabled component and printing its output. For example, disable the parser and print the dependency tags; they should come back empty.
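
One way to act on this check is a small helper that reports which of the components you asked to disable are still present. This is a minimal sketch: `still_active` is a hypothetical name, and it relies only on the `pipe_names` attribute shown above. The commented usage assumes the `en_core_web_sm` package from the question is installed.

```python
def still_active(nlp, names):
    """Return the requested component names that are still in the pipeline."""
    # nlp.pipe_names lists the components that will run on each document
    return [name for name in names if name in nlp.pipe_names]


# Usage with the load call from the question (assumes en_core_web_sm is installed):
#   import en_core_web_sm
#   wanted_off = ['ner', 'tagger', 'parser', 'textcat']
#   nlp = en_core_web_sm.load(disable=wanted_off)
#   print(still_active(nlp, wanted_off))
```

An empty list would mean that none of the requested components are left in the pipeline.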

Stratocracy answered 21/12, 2018 at 12:43 Comment(1)
This is only the first step toward disabling components. The other answer came later but covers all of the steps (though note that it applies only to spaCy v2; check its first comment for spaCy v3). - Triturate

As the documentation says, you can exclude parts of the pipeline so that they are not loaded. The default en_core_web_sm model has the following pipes:

    import spacy
    nlp = spacy.load('en_core_web_sm')
    print(nlp.pipe_names)
    ['tagger', 'parser', 'ner']

So instead of:

    nlp = spacy.load('en_core_web_sm', disable = ['ner', 'tagger', 'parser'])
    print(nlp.pipe_names)
    []

You can do:

    nlp = spacy.load('en_core_web_sm')
    nlp.disable_pipes('ner', 'tagger', 'parser')
    print(nlp.pipe_names)
    []

Or if you need to remove only one pipe:

    nlp = spacy.load('en_core_web_sm')
    nlp.remove_pipe('ner')
    print(nlp.pipe_names)
    ['tagger', 'parser']
Purveyor answered 5/10, 2020 at 10:16 Comment(2)
As of spaCy 3.0, disable_pipes has been renamed to select_pipes. nlp.disable_pipes('ner', 'tagger', 'parser') would now be written as nlp.select_pipes(disable=["ner", "tagger", "parser"]). - Cribwork
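
If the same script has to run under both spaCy v2 and v3, the rename described in the comment above can be hidden behind a small wrapper. This is only a sketch: `disable_components` is a hypothetical helper, and it assumes that checking for a `select_pipes` attribute is a good enough way to tell the two versions apart.

```python
def disable_components(nlp, names):
    """Disable the named components using whichever API the pipeline offers."""
    if hasattr(nlp, "select_pipes"):
        # spaCy v3 spelling: keyword argument with a list of names
        return nlp.select_pipes(disable=list(names))
    # spaCy v2 spelling: names passed as positional arguments
    return nlp.disable_pipes(*names)


# Usage (assumes en_core_web_sm is installed):
#   import spacy
#   nlp = spacy.load('en_core_web_sm')
#   disable_components(nlp, ['ner', 'parser'])
#   print(nlp.pipe_names)
```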
It's worth noting that disabling a pipe is not the same as removing it. Disabled pipes can be restored, but you can't add a pipe that is already there and merely disabled. So the best option will depend on the context in which you are using it. - Layamon
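
To illustrate the "can be restored" part: in spaCy v2, `disable_pipes` can also be used as a context manager, so the components are switched off only inside the `with` block and come back afterwards. The sketch below wraps that in a hypothetical helper, `tokens_with_pipes_disabled`; it assumes every name passed in is actually present in the pipeline.

```python
def tokens_with_pipes_disabled(nlp, text, names):
    """Tokenize `text` while the named components are temporarily disabled."""
    # The components are disabled on entry and restored when the block exits
    with nlp.disable_pipes(*names):
        return [token.text for token in nlp(text)]


# Usage (assumes spaCy v2 and en_core_web_sm are installed):
#   import spacy
#   nlp = spacy.load('en_core_web_sm')
#   tokens = tokens_with_pipes_disabled(nlp, 'Hello world', ['ner', 'tagger', 'parser'])
#   print(nlp.pipe_names)  # the disabled components are listed again here
```

By contrast, `remove_pipe` takes the component out of the pipeline for good, so it would have to be added again explicitly.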
