text-analysis Questions

3

I am using the Twitter API to do sentiment analysis, and I am trying to generate a word cloud based on the tweets. Here is my code to generate a word cloud: wordcloud(clean.tweets, random.order=F,max.words=80,...
Melodize asked 28/11, 2017 at 5:31

3

I am trying to get the score of the best match using difflib.get_close_matches: import difflib best_match = difflib.get_close_matches(str,str_list,1)[0] I know of the option to add 'cutoff' par...
Coleen asked 29/3, 2016 at 11:47
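
get_close_matches returns only the matching strings, not their scores. A minimal sketch of one way to recover the score, recomputing the same ratio with difflib.SequenceMatcher (the query string and candidate list below are made up):

    import difflib

    def best_match_with_score(query, candidates, cutoff=0.6):
        """Return (best_match, score), or (None, 0.0) if nothing clears the cutoff."""
        matches = difflib.get_close_matches(query, candidates, n=1, cutoff=cutoff)
        if not matches:
            return None, 0.0
        best = matches[0]
        # SequenceMatcher.ratio() is the same similarity measure that
        # get_close_matches uses internally to rank candidates.
        score = difflib.SequenceMatcher(None, query, best).ratio()
        return best, score

    print(best_match_with_score("appel", ["ape", "apple", "peach"]))  # ('apple', 0.8)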

4

If I have a text containing, for example, a newspaper article in Catalan, how could I find all the cities mentioned in that text? I have been looking at the nltk package for Python and I have dow...
Groupie asked 10/5, 2015 at 10:0
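
For reference, the basic nltk pattern for pulling place names out of a text is sketched below. Note that nltk's bundled tokenizer, tagger and NE chunker are trained on English, so for Catalan text this is only a starting point; a gazetteer of Catalan cities or a language-specific NER model would do better.

    import nltk

    # One-time downloads: punkt, averaged_perceptron_tagger, maxent_ne_chunker, words.

    def extract_places(text):
        """Return the entities that nltk's chunker labels GPE (geo-political entity)."""
        places = []
        for sentence in nltk.sent_tokenize(text):
            tree = nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(sentence)))
            for subtree in tree:
                if hasattr(subtree, 'label') and subtree.label() == 'GPE':
                    places.append(' '.join(token for token, tag in subtree))
        return places

    print(extract_places("The mayor of Barcelona met officials from Girona and Tarragona."))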

5

Hey, I have a CSV with multilingual text. All I want is a column appended with the detected language. So I coded as below: from langdetect import detect import csv with open('C:\\Users\\dell\\Do...
Mackle asked 24/11, 2016 at 10:6
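
A minimal sketch of the usual shape of that script, assuming the text sits in the first column and using made-up file names; langdetect raises an exception on empty cells, so that case is caught explicitly:

    import csv
    from langdetect import detect, DetectorFactory
    from langdetect.lang_detect_exception import LangDetectException

    DetectorFactory.seed = 0  # langdetect is non-deterministic without a fixed seed

    with open('input.csv', newline='', encoding='utf-8') as src, \
         open('output.csv', 'w', newline='', encoding='utf-8') as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        for row in reader:
            try:
                language = detect(row[0])
            except LangDetectException:  # empty or undetectable text
                language = 'unknown'
            writer.writerow(row + [language])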

4

I don't understand type conversion. I know this isn't right; all I get is a bunch of hieroglyphs. f, _ := os.Open("test.pdf") defer f.Close() io.Copy(os.Stdout, f) I want to work with the strings...
Janijania asked 2/10, 2016 at 4:33

2

Solved

I have a classic NLP problem: I have to classify a news article as fake or real. I have created two sets of features: A) Bigram Term Frequency-Inverse Document Frequency B) Approximately 20 features ass...
Vindicate asked 1/2, 2018 at 23:2
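
The usual way to combine a sparse bigram TF-IDF matrix with a small block of hand-crafted features is to stack them column-wise with scipy.sparse.hstack and feed the result to a single classifier. A sketch with placeholder documents and features (LogisticRegression is just one possible classifier, not necessarily the asker's):

    import numpy as np
    from scipy.sparse import hstack, csr_matrix
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    texts = ["the president signed the bill", "aliens built the pyramids"]  # placeholders
    extra_features = np.array([[0.1, 3.0, 12.0], [0.9, 7.0, 4.0]])          # the ~20 hand-crafted features
    labels = [0, 1]

    tfidf = TfidfVectorizer(ngram_range=(2, 2))   # feature set A: bigram TF-IDF
    X_text = tfidf.fit_transform(texts)

    # Column-wise concatenation of the sparse text matrix and the dense feature block.
    X = hstack([X_text, csr_matrix(extra_features)])

    clf = LogisticRegression().fit(X, labels)

Scaling the hand-crafted block so it is on a comparable footing with the TF-IDF weights usually helps.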

4

Solved

Natural Language Processing (NLP), especially for English, has evolved to the point where stemming would become an archaic technology if "perfect" lemmatizers existed. It's because stemmers change ...
Mathews asked 26/6, 2013 at 10:19
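
The difference the question is pointing at can be seen directly in nltk, where the Porter stemmer strips suffixes blindly while the WordNet lemmatizer looks words up (and needs a part-of-speech hint to do well):

    from nltk.stem import PorterStemmer, WordNetLemmatizer
    # one-time: nltk.download('wordnet')

    stemmer = PorterStemmer()
    lemmatizer = WordNetLemmatizer()

    for word, pos in [("studies", "v"), ("studying", "v"), ("was", "v"), ("better", "a")]:
        print(word, "->", stemmer.stem(word), "|", lemmatizer.lemmatize(word, pos=pos))

    # The stemmer yields truncated stems such as "studi" and "wa", while the
    # lemmatizer returns dictionary forms such as "study", "be" and "good",
    # but only when it is given the right part of speech.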

2

I am facing the below error while working with the tm package in R. library("tm") Loading required package: NLP Warning messages: 1: package ‘tm’ was built under R version 3.4.2 2: package ‘NLP’...
Sphenoid asked 21/11, 2017 at 6:27

3

Solved

I'm doing text analysis over reddit comments, and I want to calculate the TF-IDF within BigQuery.
Ferous asked 31/10, 2017 at 5:42

2

Solved

I have a huge dataset which is similar to the columns posted below NameofEmployee <- c(x, y, z, a) Region <- c("Pune", "Orissa", "Orisa", "Poone") As you can see, in the Region colum...
Rapture asked 24/7, 2018 at 6:28

2

I have been using JJ Allaire's guide to using word embeddings in neural network model for text processing (https://jjallaire.github.io/deep-learning-with-r-notebooks/notebooks/6.1-using-word-embedd...
Watcher asked 4/5, 2018 at 23:32

3

Solved

I'm trying to do some text analysis to determine if a given string is... talking about politics. I'm thinking I could create a neural network where the input is either a string or a list of words (...
Blinders asked 5/5, 2016 at 6:10
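
A hedged baseline sketch (not the asker's architecture): turn each string into a TF-IDF vector and feed it to a small feed-forward network, here scikit-learn's MLPClassifier; the toy texts and labels below are invented, and a real model would need far more labelled data.

    from sklearn.pipeline import make_pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.neural_network import MLPClassifier

    texts = [
        "the senate passed the budget bill",
        "the prime minister called an election",
        "our team won the match last night",
        "this recipe needs two cups of flour",
    ]
    labels = [1, 1, 0, 0]  # 1 = politics, 0 = not politics

    # TF-IDF gives each string a fixed-length vector, which the network can classify.
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
    )
    model.fit(texts, labels)
    print(model.predict(["parliament debates the new tax law"]))  # prints the predicted label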

2

Solved

If I use the TfidfVectorizer from sklearn to generate feature vectors as: features = TfidfVectorizer(min_df=0.2, ngram_range=(1,3)).fit_transform(myDocuments) How would I then generate feature ve...
Lionel asked 18/10, 2016 at 15:32
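
The key is to keep the fitted vectorizer object and call transform() (not fit_transform()) on the new documents, so they are projected onto the same vocabulary and IDF weights; a small sketch with placeholder documents:

    from sklearn.feature_extraction.text import TfidfVectorizer

    train_docs = ["the cat sat on the mat", "dogs and cats living together"]  # placeholders
    new_docs = ["a new document about cats"]

    vectorizer = TfidfVectorizer(ngram_range=(1, 3))
    X_train = vectorizer.fit_transform(train_docs)  # learns vocabulary and IDF weights

    # Reuse the fitted vectorizer: unseen terms in the new documents are ignored,
    # and the columns line up with X_train.
    X_new = vectorizer.transform(new_docs)
    print(X_train.shape[1] == X_new.shape[1])  # True

Persisting the fitted vectorizer (for example with joblib) is what makes this work across sessions.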

1

Solved

I'm thinking about using word n-gram techniques on raw text. But I have a doubt: does it make sense to use word n-grams after applying lemmatization/stemming to the text? If not, why should I use word n-grams o...

1

I'm a newbie to AI and want to perform the exercise below. Can you please suggest a way to achieve it using Python? Scenario: I have a list of businesses of some companies, as below: 1. AI...

2

Solved

I am using the MALLET topic modelling sample code and, though it runs fine, I would like to know what the parameters of this statement actually mean: instances.addThruPipe(new CsvIterator(new FileReade...

3

I'm trying to model Twitter stream data with topic models. Gensim, being an easy-to-use solution, is impressive in its simplicity. It has a truly online implementation for LSI, but not for LDA. Fo...
Quinine asked 18/3, 2014 at 2:52
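
For what it's worth, gensim's LdaModel can at least be fed new batches through update() (its online variational Bayes mode); the harder part for a stream is the fixed vocabulary, since doc2bow silently drops words the original Dictionary has never seen. A sketch with made-up tweet tokens:

    from gensim import corpora
    from gensim.models import LdaModel

    batch1 = [["obama", "election", "vote"], ["pizza", "dinner", "cheese"]]
    dictionary = corpora.Dictionary(batch1)
    corpus1 = [dictionary.doc2bow(doc) for doc in batch1]

    lda = LdaModel(corpus1, id2word=dictionary, num_topics=2)

    # Later, when the next batch arrives from the stream: words outside the
    # original vocabulary are dropped by doc2bow, the rest update the topics.
    batch2 = [["senate", "vote", "law"], ["goal", "match", "football"]]
    corpus2 = [dictionary.doc2bow(doc) for doc in batch2]
    lda.update(corpus2)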

1

Solved

I want to convert this matrix into a pandas DataFrame. csc_matrix The first number in the bracket should be the index, the second number the column, and the last number the data. I ...
Apnea asked 13/4, 2016 at 2:53
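
A sketch of one way to do it, using a tiny csc_matrix as a stand-in: converting to COO format exposes exactly those three pieces, the row index, the column index and the value.

    import numpy as np
    import pandas as pd
    from scipy.sparse import csc_matrix

    m = csc_matrix(np.array([[0, 2], [3, 0]]))  # stand-in for the real matrix

    # COO format holds the (row, column, value) triples directly.
    coo = m.tocoo()
    df_long = pd.DataFrame({"index": coo.row, "column": coo.col, "data": coo.data})
    print(df_long)

    # Alternatively, a dense DataFrame with rows and columns laid out as in the matrix:
    df_dense = pd.DataFrame(m.toarray())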

1

Solved

Here is my code: from sklearn.svm import SVC from sklearn.grid_search import GridSearchCV from sklearn.cross_validation import KFold from sklearn.feature_extraction.text import TfidfVectorizer fro...
Corridor asked 13/2, 2016 at 11:18
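
In current scikit-learn those pieces live in sklearn.model_selection rather than sklearn.grid_search / sklearn.cross_validation; a hedged sketch of the usual arrangement, with the vectorizer inside a Pipeline so each cross-validation fold is vectorised only from its own training documents (the toy texts and parameter grid are placeholders):

    from sklearn.svm import SVC
    from sklearn.pipeline import Pipeline
    from sklearn.model_selection import GridSearchCV
    from sklearn.feature_extraction.text import TfidfVectorizer

    texts = ["good movie", "bad movie", "great film", "terrible film"]  # placeholders
    labels = [1, 0, 1, 0]

    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("svc", SVC()),
    ])
    param_grid = {
        "svc__C": [0.1, 1, 10],
        "svc__kernel": ["linear", "rbf"],
    }
    search = GridSearchCV(pipeline, param_grid, cv=2)  # stratified 2-fold CV for classifiers
    search.fit(texts, labels)
    print(search.best_params_, search.best_score_)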

3

Solved

Here's an appeal for a better way to do something that I can already do inefficiently: filter a series of n-gram tokens using "stop words" so that the occurrence of any stop word term in an n-gram ...
Rakes asked 12/10, 2015 at 0:9
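
In Python terms (the original may well be using a different toolkit), the straightforward version is to hold the stop words in a set and drop every n-gram that contains any of them; a sketch:

    from nltk.corpus import stopwords
    from nltk.util import ngrams

    # one-time: nltk.download('stopwords')
    stop = set(stopwords.words('english'))
    tokens = "the quick brown fox jumps over the lazy dog".split()

    def filtered_ngrams(tokens, n, stop):
        """Keep only the n-grams in which no term is a stop word."""
        return [g for g in ngrams(tokens, n) if not any(t in stop for t in g)]

    print(filtered_ngrams(tokens, 2, stop))
    # keeps ('quick', 'brown'), drops ('over', 'the'), ('the', 'lazy'), ...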

1

Solved

I am doing some text analysis work in Python. Unfortunately, I need to switch to R in order to use a particular package (unfortunately, the package cannot be replicated in Python easily). Current...
Suppletory asked 5/6, 2015 at 21:15
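
One hedged option, rather than switching wholesale, is to call the R package from Python through rpy2; a minimal sketch (the R expression and the imported package are placeholders):

    import rpy2.robjects as robjects
    from rpy2.robjects.packages import importr

    # Evaluate an R expression and pull the result back into Python.
    result = robjects.r('mean(c(1, 2, 3, 4))')
    print(list(result))  # [2.5]

    # Load an R package and call its functions from Python; 'utils' stands in
    # for whichever package actually motivated the switch.
    utils = importr('utils')

The simpler alternative is to keep the two stages separate: write intermediate results to disk from Python and run the R part as a standalone script.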

1

Solved

I have been breaking my head over this one over the last few days. I searched all the SO archives and tried the suggested solutions but just can't seem to get this to work. I have sets of txt docum...
Mclain asked 9/11, 2014 at 23:30

3

Solved

I have a PDF file with valuable textual information. The problem is that I cannot extract the text, all I get is a bunch of garbled symbols. The same happens if I copy and paste the text fro...
Yeta asked 29/8, 2012 at 18:30

3

This is a Homework question. I have a huge document full of words. My challenge is to classify these words into different groups/clusters that adequately represent the words. My strategy to deal wi...
Litter asked 7/12, 2012 at 18:53

2

Solved

How do I use sklearn CountVectorizer with both 'word' and 'char' analyzers? http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html I could extract the...
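
A single CountVectorizer takes one analyzer, so the usual pattern is two vectorizers whose outputs are concatenated column-wise with a FeatureUnion; a sketch with placeholder documents and n-gram ranges:

    from sklearn.pipeline import FeatureUnion
    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["text analysis is fun", "character n-grams catch typos"]  # placeholders

    combined = FeatureUnion([
        ("word", CountVectorizer(analyzer="word", ngram_range=(1, 2))),
        ("char", CountVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
    ])
    X = combined.fit_transform(docs)
    print(X.shape)  # columns = word features followed by character features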
