stop-words Questions

7

Solved

I'm building a search for a site that uses fulltext search. The search itself works great; that's not my problem. I string together user-provided keywords (MATCH... AGAINST...) with AND's s...
Ferromagnetism asked 1/10, 2012 at 18:30

14

I have a dataset from which I would like to remove stop words. I used NLTK to get a list of stop words: from nltk.corpus import stopwords stopwords.words('english') Exactly how do I compare the d...
Haldas asked 30/3, 2011 at 12:36
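
One common way to approach this kind of comparison is a set membership test. The sketch below is only an illustration: STOP_WORDS is a small hand-written stand-in (the list returned by stopwords.words('english') could be substituted), and remove_stop_words and dataset are hypothetical names that do not come from the question.

```python
# Hypothetical stand-in for stopwords.words('english'); swap in the NLTK list if available.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not stop words (case-insensitive comparison)."""
    return [tok for tok in tokens if tok.lower() not in stop_words]


# Made-up sample data for illustration only.
dataset = ["The", "quick", "fox", "is", "in", "the", "garden"]
filtered = remove_stop_words(dataset)
print(filtered)
```

Keeping the stop words in a set rather than a list makes each lookup constant-time on average, which matters once the dataset grows.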

1

In the HuggingFaceHub model usage in langchain, there is a part where the author notes they don't know how to stop the generation: https://github.com/hwchase17/langchain/blob/master/langchain/llms/...

9

Solved

Here is my code: for (int i = 0; i < myarraylist.size(); i++) { for (int j = 0; j < stopwords.size(); j++) { if (stopwords.get(j).equals(myarraylist.get(i))) { myarraylist.remove(i); id....
Oppilate asked 15/4, 2015 at 16:45
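
A likely pitfall with a loop of this shape is that removing element i while counting upward shifts the following elements left, so the item right after each removal is skipped. The excerpt is Java and is truncated, so the following is only a Python sketch of one possible workaround (walking the list backwards); remove_in_place and the sample data are hypothetical.

```python
def remove_in_place(words, stop_words):
    """Delete stop words from `words` in place.

    Iterating from the last index down to 0 means a deletion never
    changes the position of an element that has not been visited yet.
    """
    for i in range(len(words) - 1, -1, -1):
        if words[i] in stop_words:
            del words[i]
    return words


# Two adjacent stop words: the case a forward index loop tends to miss.
words = ["the", "the", "cat", "a", "sat"]
remove_in_place(words, {"the", "a"})
print(words)
```

Building a new filtered list instead of mutating the original is usually simpler still, if in-place modification is not required.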

4

Solved

I am following this document clustering tutorial. As an input I give a txt file which can be downloaded here. It's a combined file of 3 other txt files divided with a use of \n. After creating a tf...
Behistun asked 3/8, 2019 at 16:23

8

Solved

What is the best way to add/remove stop words with spacy? I am using token.is_stop function and would like to make some custom changes to the set. I was looking at the documentation but could not f...
Fairchild asked 15/12, 2016 at 18:11

11

Solved

I have a string with lots of words and I have a text file which contains some Stopwords which I need to remove from my String. Let's say I have a String s="I love this phone, its super fast and ...
Halley asked 29/12, 2014 at 8:48

6

Solved

I am trying to remove stopwords from a string of text: from nltk.corpus import stopwords text = 'hello bye the the hi' text = ' '.join([word for word in text.split() if word not in (stopwords.word...
Weinshienk asked 24/10, 2013 at 8:13

2

I already imported stopwords from nltk.corpus, but I get STOPWORDS is not defined error. Below is my code: import nltk from nltk.corpus import stopwords #Create stopword list: stopwords = set(STOPW...
Danelaw asked 13/1, 2022 at 15:18

4

Solved

I would like to add certain words to the default stopwords list used in wordcloud. Current code: all_text = " ".join(rev for rev in twitter_clean.text) stop_words = ["https", "co", "RT"] wordcloud...
Crash asked 1/1, 2019 at 17:20

6

Solved

Where can I find a list of Hebrew stop words?
Oram asked 2/9, 2009 at 1:49

7

Solved

I am trying to start a sentiment analysis project and I will use the stop words method. I did some research and found that nltk has stopwords, but when I execute the command there is an error...
Anton asked 1/11, 2014 at 22:05

10

I have some code that removes stop words from my data set. As the stop list doesn't seem to remove a majority of the words I would like it to, I'm looking to add words to this stop list so that it...
Clepsydra asked 1/4, 2011 at 9:49

6

Solved

I am trying to process user-entered text by removing stopwords using the nltk toolkit, but with stopword removal, words like 'and', 'or', 'not' get removed. I want these words to be present after...
Anoint asked 2/10, 2013 at 5:29
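
One way to keep selected words is to subtract them from the stop list before filtering. The sketch below assumes a small hand-written list in place of NLTK's English list; BASE_STOP_WORDS, KEEP, and filter_text are hypothetical names used only for illustration.

```python
# Hypothetical small stop list standing in for NLTK's English list.
BASE_STOP_WORDS = {"a", "the", "is", "and", "or", "not", "to"}

# Words that should survive stop word removal.
KEEP = {"and", "or", "not"}

# Set difference: everything in the base list except the words to keep.
custom_stop_words = BASE_STOP_WORDS - KEEP


def filter_text(text, stop_words):
    """Drop whitespace-separated words that appear in `stop_words`."""
    return " ".join(w for w in text.split() if w.lower() not in stop_words)


result = filter_text("the cat is not hungry and not tired", custom_stop_words)
print(result)
```

The same pattern works in the other direction: a set union (BASE_STOP_WORDS | extra_words) adds custom words to the list.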

6

Solved

I have a Corpus in R using the tm package. I am applying the removeWords function to remove stopwords tm_map(abs, removeWords, stopwords("english")) Is there a way to add my own custom stop wor...
Grosbeak asked 26/8, 2013 at 14:22

3

Solved

I have a data frame with strings that I'd like to remove stop words from. I'm trying to avoid using the tm package as it's a large data set and tm seems to run a bit slowly. I am using the tm stopw...
Derive asked 6/3, 2013 at 17:15

3

I am trying to tokenize and remove stop words from a txt file with Lucene. I have this: public String removeStopWords(String string) throws IOException { Set<String> stopWords = new HashSet...
Blackcap asked 12/7, 2013 at 23:17

0

I remove stop words from a String and return the remaining words with the original upper/lower case afterwards, using Apache's Lucene (8.6.3) and the following Java 8 code (this is a shortened vers...
Gunyah asked 16/10, 2020 at 9:00

1

Solved

I'm removing stop words from a String, using Apache's Lucene (8.6.3) and the following Java 8 code: private static final String CONTENTS = "contents"; final String text = "This is a ...
Baluchi asked 12/10, 2020 at 16:41

3

Solved

Is there a way to get the StopWord list that my SQL Server 2008 FullText Catalog is using, and use it in my C# codebehind? I want to use it in an ASP.NET page that I use to search terms and highli...
Elyse asked 11/2, 2011 at 19:06

2

I want to parse a document using Stanford NLP and remove stopwords from it. How can I remove stopwords using Stanford NLP? Is there an API for that? I found the StopWords class ...
Tina asked 25/7, 2013 at 3:56

3

I want to add a few more words to stop_words in TfidfVectorizer. I followed the solution in Adding words to scikit-learn's CountVectorizer's stop list. My stop word list now contains both ...

3

Solved

I'm struggling with NLTK stopwords. Here's my bit of code. Could someone tell me what's wrong? from nltk.corpus import stopwords def removeStopwords( palabras ): return [ word for word in palab...
Marathon asked 4/4, 2011 at 16:53

3

Solved

I'm wondering where I can find the full list of supported languages (and their keys) for the NLTK stopwords. I found a list at https://pypi.org/project/stop-words/ but it does not contain the keys for ...
Annecorinne asked 7/2, 2019 at 12:55

2

Solved

I have managed to evaluate the tf-idf function for a given corpus. How can I find the stopwords and the best words for each document? I understand that a low tf-idf for a given word and document me...
Abruzzi asked 4/6, 2013 at 21:08
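
To make the tf-idf reasoning concrete, here is a minimal sketch under stated assumptions: it uses one common variant (term frequency normalized by document length, idf = log(N / df)), whitespace tokenization, and a made-up three-document corpus. The function name tf_idf and all data are hypothetical and do not come from the question.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document, using tf * log(N / df)."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return scores


# Made-up corpus for illustration only.
docs = ["the cat sat on the mat", "the dog chased the cat", "the bird sang"]
scores = tf_idf(docs)

# Terms scoring exactly 0 occur in every document: stop word candidates.
candidates = sorted(t for t, s in scores[0].items() if s == 0.0)
print(candidates)

# Terms sorted by descending score: the most distinctive words come first.
ranked = sorted(scores[0].items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

Under this variant, a term that appears in every document gets idf = log(1) = 0, so it is a natural stop word candidate, while terms unique to one document score highest there.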
