spaCy has no pretrained pipeline for Arabic, but can I use spaCy with the pretrained AraBERT model?
Is it possible to modify this code so it can accept bert-large-arabertv02 instead of en_core_web_lg?
!python -m spacy download en_core_web_lg
import spacy
nlp = spacy.load("en_core_web_lg")
Here is how we can load AraBERT v0.2:
from arabert.preprocess import ArabertPreprocessor
from transformers import AutoTokenizer, AutoModelForMaskedLM
model_name = "aubmindlab/bert-large-arabertv02"
arabert_prep = ArabertPreprocessor(model_name=model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
nlp.create_pipe with a custom component name that's not registered on the current language class. If you're using a Transformer, make sure to install 'spacy-transformers'. If you're using a custom component, make sure you've added the decorator @Language.component (for function components) or @Language.factory (for class components). – Unlive
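Following the hint in the comment above, one possible route is to install spacy-transformers and add its "transformer" component to a blank Arabic pipeline. spaCy does include a basic Arabic language class (tokenizer only), which `spacy.blank("ar")` loads. The sketch below is an untested outline, not a confirmed solution: the helper names `arabert_transformer_config` and `build_arabic_pipeline` are hypothetical, and it assumes `pip install spacy spacy-transformers` has been run and that the Hugging Face weights can be downloaded.

```python
# Sketch: wrap AraBERT in a blank Arabic spaCy pipeline via spacy-transformers.
# Helper names are illustrative; they are not part of spaCy or AraBERT.

ARABERT_NAME = "aubmindlab/bert-large-arabertv02"


def arabert_transformer_config(model_name=ARABERT_NAME):
    """Config override for the "transformer" factory.

    Only the Hugging Face model name is set; every other setting is
    left at the spacy-transformers default.
    """
    return {"model": {"name": model_name}}


def build_arabic_pipeline(model_name=ARABERT_NAME):
    """Return a blank Arabic pipeline whose only component wraps AraBERT."""
    # Imported lazily so the config helper above works without spaCy installed.
    import spacy
    import spacy_transformers  # noqa: F401  (ensures the "transformer" factory is registered)

    nlp = spacy.blank("ar")  # tokenizer-only Arabic pipeline, used instead of en_core_web_lg
    nlp.add_pipe("transformer", config=arabert_transformer_config(model_name))
    nlp.initialize()  # loads (and on first use downloads) the Hugging Face weights
    return nlp
```

With this in place, `nlp = build_arabic_pipeline()` followed by `doc = nlp(arabert_prep.preprocess(text))` should leave the transformer output on `doc._.trf_data`. A blank pipeline has no tagger, parser, or NER, so those components would still have to be trained on Arabic data on top of the transformer.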