spacy Questions

1

Solved

I am trying to train a text categorization pipe in SpaCy: import spacy nlp = spacy.load("en_core_web_sm") nlp.add_pipe("textcat", last=True) other_pipes = [pipe for pipe in nlp...
Furunculosis asked 25/2, 2021 at 11:17

2

Solved

I have an HTML document and I'd like to tokenize it using spaCy while keeping HTML tags as a single token. Here's my code: import spacy from spacy.symbols import ORTH nlp = spacy.load('en', vector...
Interplanetary asked 29/11, 2017 at 9:58

1

Solved

I know there are a lot of resources out there for this problem, but I could not get spaCy to do exactly what I want. I would like to add rules to my spaCy tokenizer so that HTML tags (such as <b...
Cadi asked 26/6, 2020 at 18:42

1

I'm using spaCy 3.0.1 together with the transformer model (en_core_web_trf). When I previously used spaCy transformers, it was possible to get the transformer vectors from a Token or Span. In Sp...
Embroidery asked 11/2, 2021 at 7:31

4

Solved

The code below breaks the sentence into individual tokens, and the output is as follows: "cloud" "computing" "is" "benefiting" " major" "manufacturing" "companies" import en_core_web_sm nlp = en_c...
Anderson asked 3/12, 2018 at 16:50

2

Solved

I have the following code: import spacy from spacy import displacy from pathlib import Path nlp = spacy.load('en_core_web_sm', parse=True, tag=True, entity=True) sentence_nlp = nlp("John go home...
Catheterize asked 17/5, 2019 at 7:47

3

Solved

I have a spaCy doc that I would like to lemmatize. For example: import spacy nlp = spacy.load('en_core_web_lg') my_str = 'Python is the greatest language in the world' doc = nlp(my_str) How ca...
Fighter asked 2/8, 2018 at 16:16

4

Solved

I was trying to create an executable file using PyInstaller. I got the issue below while executing it. File "test_env2_live\main.py", line 2, in <module> File "C:\Users\rajesh.das\AppDa...
Klockau asked 8/1, 2020 at 11:44

2

Solved

I am trying to find named entities for a sentence, as below: import spacy.lang.en parser = spacy.lang.en.English() ParsedSentence = parser(u"Alphabet is a new startup in China") for Entity in Parse...
Kallman asked 11/1, 2018 at 5:48

2

Solved

SpaCy Version: 2.0.11 Python Version: 3.6.5 OS: Ubuntu 16.04 My Sentence Samples: Marketing-Representative- won't die in car accident. or Out-of-box implementation Expected Tokens: ["Market...
Shortterm asked 12/9, 2018 at 11:16

1

I'm currently working on a project involving sentence vectors (from a RoBERTa pretrained model). These vectors are lower quality when sentences are long, and my corpus contains many long sentences ...
Bucephalus asked 10/12, 2020 at 1:04

1

The speed is less than 10 KB/s when I run python -m spacy download en. For example: Looking in indexes: https://pypi.mirrors.ustc.edu.cn/simple/ Collecting en_core_web_sm==2.1.0 from https://github.com/e...
Tarrah asked 22/7, 2019 at 9:01

2

Solved

I have a German text that I want to apply lemmatization to. If lemmatization is not possible, then I can live with stemming too. Data: This is my German text: mails=['Hallo. Ich spielte am frühen M...
Pedant asked 9/9, 2019 at 15:43

2

I am using Spacy and trying to detect names in the text. For example, text = 'Keras is a good package. Adam Smith uses a car of black colour. I hope Katrina is doing well in her job.' The answer s...
Sphenic asked 6/7, 2018 at 15:59

1

Is it possible to leave the token text true cased, but force the lemmas to be lowercased? I am interested in this because I want to use the PhraseMatcher where I run an input text through the piple...
Babineaux asked 9/11, 2020 at 20:23

3

Solved

With Gensim, after I've trained my own model, I can use model.wv.most_similar('cat', topn=5) and get a list of the 5 words that are closest to cat in the vector space. For example: from gensim.mode...
Aparicio asked 28/8, 2019 at 17:27

1

I am attempting to update the pre-trained BERT model using an in-house corpus. I have looked at the Hugging Face Transformers docs and I am a little stuck, as you will see below. My goal is to compute ...
Hulton asked 30/10, 2019 at 7:19

1

Solved

It would be really helpful if you could help me understand some underlying concepts about spaCy. I understand that some spaCy models have predefined static vectors, for example, for the Span...
Frasch asked 7/10, 2020 at 23:18

2

Solved

I am running spaCy v2.x on a Windows box with Python 3. I do not have admin privileges, so I have to call the pipeline as: nlp = en_core_web_sm.load() When I run the same script on a *nix box, I ca...
Sverre asked 20/12, 2018 at 14:28

4

I am trying to customize spaCy's NER to identify Indian names. I am following this guide, https://spacy.io/usage/training, and this is the dataset I am using: https://gist.githubusercontent.com/mbejda/9b93...
Mandler asked 26/3, 2018 at 4:25

2

I am trying to install neuralcoref, following the instructions given here. I created a Jupyter notebook and tried to run the following code. # Load your usual SpaCy model (one of SpaCy English m...
Ulent asked 12/7, 2019 at 14:04

1

Solved

I am trying to use spaCy to extract the required custom entities from text. import spacy from spacy_lookup import Entity data = {0:["count"],1:["unique count","unique"]} ...
Housekeeping asked 27/8, 2020 at 16:49

4

I have downloaded the spaCy English model and am finding lemmas using this code. import spacy nlp = spacy.load('en') doc = nlp(u'Two apples') for token in doc: print(token, token.lemma, token.lemma_) O...
Epigene asked 4/2, 2019 at 8:15

1

Solved

I'm using some domain-specific language which has a lot of OOV words as well as some typos. I have noticed spaCy will just assign an all-zero vector for these OOV words, so I'm wondering what's th...
Biocatalyst asked 28/7, 2020 at 23:28

2

How can I make spaCy case-insensitive when finding entity names? Is there a code snippet that I should add, or something similar, because the questions could mention entities that are not in uppercase...
Carolyn asked 16/6, 2018 at 12:17
