stemming Questions
7
Solved
I am running the following code on IDLE(Python) and I want to enter Arabic string and get the stemming for it but actually it doesn't work
>>> from nltk.stem.isri import ISRIStemmer
>&g...
3
Solved
I would like to lemmatize some Italian text in order to perform some frequency counting of words and further investigations on the output of this lemmatized content.
I am preferring lemmatizing th...
Ashien asked 30/7, 2017 at 18:41
22
I've tried PorterStemmer and Snowball but both don't work on all words, missing some very common ones.
My test words are: "cats running ran cactus cactuses cacti community communities", and both ...
Heaven asked 21/4, 2009 at 10:7
4
Solved
I am following this document clustering tutorial. As an input I give a txt file which can be downloaded here. It's a combined file of 3 other txt files divided with a use of \n. After creating a tf...
Behistun asked 3/8, 2019 at 16:23
7
Solved
I tried all the nltk methods for stemming but it gives me weird results with some words.
Examples
It often cut end of words when it shouldn't do it :
poodle => poodl
article articl
or ...
2
Solved
I want to import wordnet dictionary but when i import Dictionary form wordnet i see this error :
for l in open(WNSEARCHDIR+'/lexnames').readlines():
IOError: [Errno 2] No such file or directory: ...
Dentil asked 12/7, 2011 at 8:0
7
Solved
I'm processing some English texts in a Java application, and I need to stem them.
For example, from the text "amenities/amenity" I need to get "amenit".
The function looks like:
String stemTerm(S...
Kcal asked 22/3, 2011 at 13:14
4
Solved
Natural Language Processing (NLP), especially for English, has evolved into the stage where stemming would become an archaic technology if "perfect" lemmatizers exist. It's because stemmers change ...
Mathews asked 26/6, 2013 at 10:19
3
Solved
I'm looking for a Java library to extract keywords from a block of text.
The process should be as follows:
stop word cleaning -> stemming -> searching for keywords based on English linguist...
2
Solved
I'm developing a text analysis project using apache lucene. I need to lemmatize some text (transform the words to their canonical forms). I've already written the code that makes stemming. Using it...
Guthrie asked 9/12, 2017 at 3:29
3
Solved
I am trying to preprocess a string using lemmatizer and then remove the punctuation and digits. I am using the code below to do this. I am not getting any error but the text is not preprocessed app...
Locust asked 16/10, 2017 at 21:7
3
Solved
I have ~1,4m documents with average of characters per document of(Median:250 and Mean:470).
I want to perform spell check and stemming, before classifying them.
Simulated document:
sentence <...
Nagual asked 20/2, 2020 at 12:8
5
Solved
I'm building a search engine for a website where users can be of many different countries and post text content.
I'll consider that:
- A french generates content in french and english
- A german g...
Germanism asked 14/6, 2012 at 22:16
1
Solved
Does keras tokenizer provide the functions such as stemming and lemmetization? If it does, then how is it done? Need an intuitive understanding. Also, what does text_to_sequence do in that?
Flotage asked 12/6, 2019 at 7:33
3
Solved
Do anyone know how to resolve this file reading error in TreeTagger that is a common Natural Language Processing tool used to POS tag, lemmatize and chunk sentences?
alvas@ikoma:~/treetagger$ echo...
Decencies asked 19/3, 2013 at 15:17
3
I'm writing a text classification system in Python. This is what I'm doing to canonicalize each token:
lem, stem = WordNetLemmatizer(), PorterStemmer()
for doc in corpus:
for word in doc:
lemma ...
Unshroud asked 19/3, 2018 at 1:44
4
Solved
I use a lucene snowball analyzer to perform stemming . The results are not meaningful words . I referred this question .
One of the solution is to use a database that contains a map between the s...
Etam asked 28/2, 2012 at 11:30
1
I am looking for a good stemmer for Hebrew - I found nothing at all using Google...
On the HebMorph site it says that:
Stem and Lemma originally have different meanings, but for Semitic languages...
Moen asked 6/1, 2014 at 15:39
2
I am trying to get the basic english word for an english word which is modified from its base form. This question had been asked here, but I didnt see a proper answer, so I am trying to put it this...
Lapland asked 7/11, 2014 at 7:1
3
Solved
How can I get my regular expression to match against just one condition exactly?
For example I have the following regular expression:
(\w+)(?=ly|es|s|y)
Matching the expression against the word...
3
Solved
I have txt files that look like this:
word, 23
Words, 2
test, 1
tests, 4
And I want them to look like this:
word, 23
word, 2
test, 1
test, 4
I want to be able to take a txt file in Python and...
2
Solved
I try to make lemmatization, ie identifying the lemma and possibly the Arabic root of a verb, for example:
يتصل ==> lemma (infinitive of the verb) ==> اتصل ==> root (triliteral root / Jidr thoulath...
Seldun asked 19/3, 2015 at 17:33
1
Solved
I'm thinking about use word n-grams techniques on a raw text. But I have a doubt:
does it have sense use word n-grams after applying lemma/stemming on text? If not, why should I use word n-grams o...
Olivares asked 10/11, 2017 at 9:22
1
Solved
So, I am new to using Python and NLTK. I have a file called reviews.csv which consists of comments extracted from amazon. I have tokenized the contents of this csv file and written it to a file cal...
5
Solved
I need a good python module for stemming text documents in the pre-processing stage.
I found this one
http://pypi.python.org/pypi/PyStemmer/1.0.1
but i cannot find the documentation int the lin...
Rubble asked 29/4, 2012 at 3:11
1 Next >
© 2022 - 2025 — McMap. All rights reserved.