lda Questions

4

Solved

I am using the removeSparseTerms method in R, and it requires a threshold value as input. I also read that the higher the value, the more terms are retained in the returned matr...
Tali asked 27/2, 2015 at 10:55
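
The threshold behaviour described in this question can be sketched in plain Python. This is a minimal illustration, not the `tm` implementation: the function name `remove_sparse_terms` and the toy documents are hypothetical. It assumes the rule that a term is kept when it appears in more than `n_docs * (1 - sparse)` documents, which is how the `tm` threshold is generally understood to work.

```python
def remove_sparse_terms(doc_term_counts, sparse):
    """Keep terms whose document frequency exceeds n_docs * (1 - sparse).

    doc_term_counts: list of {term: count} dicts, one per document.
    sparse: maximal allowed sparsity, strictly between 0 and 1 (assumed rule).
    """
    if not 0 < sparse < 1:
        raise ValueError("sparse must be strictly between 0 and 1")
    n_docs = len(doc_term_counts)
    # Count in how many documents each term occurs at least once.
    doc_freq = {}
    for counts in doc_term_counts:
        for term, count in counts.items():
            if count > 0:
                doc_freq[term] = doc_freq.get(term, 0) + 1
    min_docs = n_docs * (1 - sparse)
    return sorted(term for term, df in doc_freq.items() if df > min_docs)


# Hypothetical toy corpus: four documents as term-count dictionaries.
docs = [
    {"topic": 2, "model": 1},
    {"topic": 1, "corpus": 1},
    {"topic": 1, "model": 1, "rare": 1},
    {"topic": 3},
]
kept_strict = remove_sparse_terms(docs, 0.4)  # low threshold, few terms kept
kept_loose = remove_sparse_terms(docs, 0.9)   # high threshold, more terms kept
print(kept_strict)
print(kept_loose)
```

Under this rule, a higher threshold tolerates sparser terms, so more terms are retained.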

5

Solved

I am trying to generate a word cloud using the WordCloud module in Python; however, I see the following error whenever I call .generate: Traceback (most recent call last): File "/mnt/6db3226b-5...
Unrefined asked 28/4, 2023 at 12:16

1

I am currently using Gensim LDA for topic modeling. While tuning hyperparameters, I found that the model always gives a negative log-perplexity. Is it normal for the model to behave like this? (is it...
Minuend asked 22/7, 2020 at 2:30
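
A short sketch of why a per-word log-likelihood is negative may help here. It assumes that the value in question is a per-word log-likelihood bound (as Gensim's `log_perplexity` is generally documented to return) and that perplexity is recovered as `2 ** (-bound)`. The code uses a smoothed unigram model as a stand-in, not LDA, and the function name and token lists are hypothetical.

```python
import math
from collections import Counter


def per_word_log2_likelihood(train_tokens, test_tokens):
    """Average log2 probability per test token under an add-one unigram model."""
    counts = Counter(train_tokens)
    vocab = set(train_tokens) | set(test_tokens)
    total = len(train_tokens) + len(vocab)  # denominator with add-one smoothing
    log_sum = 0.0
    for tok in test_tokens:
        p = (counts[tok] + 1) / total  # always strictly between 0 and 1
        log_sum += math.log2(p)        # so every term is negative
    return log_sum / len(test_tokens)


# Hypothetical toy data.
train = "topic model topic word corpus word topic".split()
test = "topic word corpus".split()

bound = per_word_log2_likelihood(train, test)
perplexity = 2 ** (-bound)
print(bound, perplexity)
```

Because every word probability is below 1, each log term is negative, so the bound is negative while the perplexity derived from it is greater than 1. Values closer to zero correspond to lower perplexity.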

6

Solved

I built an LDA model using Gensim and I want to get the topic words only. How can I get only the words of the topics, with no probabilities and no IDs? I tried the print_topics() and show_topics() fu...
Stimulative asked 3/10, 2017 at 1:58
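
One way to approach this is to strip the IDs and probabilities from the unformatted topic output. The sketch below assumes the input has the shape that Gensim's `show_topics(formatted=False)` is understood to return, namely a list of `(topic_id, [(word, probability), ...])` pairs. The helper name `topic_words_only` and the sample data are hypothetical, so no Gensim installation is needed to run it.

```python
def topic_words_only(topics):
    """Drop topic IDs and probabilities, keeping one list of words per topic."""
    return [
        [word for word, _prob in word_probs]
        for _topic_id, word_probs in topics
    ]


# Hypothetical data in the assumed (topic_id, [(word, prob), ...]) shape.
sample_topics = [
    (0, [("network", 0.031), ("graph", 0.027), ("node", 0.019)]),
    (1, [("court", 0.042), ("law", 0.035), ("judge", 0.021)]),
]

words = topic_words_only(sample_topics)
for topic in words:
    print(" ".join(topic))
```

With a trained model, the same helper could be applied to the unformatted output instead of the sample list, provided the shape matches the assumption above.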

3

I read this question (Coherence score 0.4 is good or bad?) and found that the coherence score (u_mass) ranges from -14 to 14. But when I did my experiments, I got a score of -18 for u_mass and 0.67 for...
Auvergne asked 26/5, 2020 at 22:22

2

Here, best_model_lda is an sklearn-based LDA model, and we are trying to compute a coherence score for it. coherence_model_lda = CoherenceModel(model = best_lda_model,texts=data_vectorized, d...
Gotthard asked 10/3, 2020 at 8:3

4

Solved

I want to do topic modeling on short texts. I did some research on LDA and found that it doesn't work well with short texts. What methods would be better, and do they have Python implementations?
Scouring asked 3/6, 2020 at 14:32

2

Solved

When I try to run: def remove_stopwords(texts): return [[word for word in simple_preprocess(str(doc)) if word not in stop_words] for doc in texts] def make_bigrams(texts): return [bigram_mod1[d...
Cockburn asked 15/5, 2019 at 11:44

3

I am using LDAModel of pyspark to get topics from a corpus. My goal is to find the topics associated with each document. For that purpose I tried to set topicDistributionCol as per the docs. Since I am new t...

2

Solved

I have installed the LDA library (using pip). I have very simple test code (the next two lines): import lda print lda.datasets.load_reuters() But I keep getting the error AttributeError: 'module' ob...
Barfield asked 23/7, 2016 at 0:58

5

I am new to LDA and I want to use it in my work. However, I have run into some problems. In order to get the best performance, I want to estimate the optimal number of topics. After reading "Finding Scien...
Appointed asked 2/7, 2013 at 9:22

3

Not sure if this is the right forum but I was wondering if anyone understands how to interpret the width of the red vs. blue bars on the right-hand side of pyLDAvis plots when lambda = 0 (see http:...
Danit asked 6/6, 2018 at 17:56

2

import pyLDAvis.gensim # Visualize the topics pyLDAvis.enable_notebook() vis = pyLDAvis.gensim.prepare(lda_model, corpus, id2word) vis The above code displayed the visualization of the LDA model in go...
Faro asked 8/2, 2021 at 5:2

2

Solved

I created a Gensim LDA Model as shown in this tutorial: https://www.machinelearningplus.com/nlp/topic-modeling-gensim-python/ lda_model = gensim.models.LdaMulticore(data_df['bow_corpus'], num_topi...
Whap asked 16/2, 2020 at 8:3

5

Solved

I am using Anaconda Spyder and installed the pyLDAvis module using the command: conda install -c ehremo pyldavis Even after a successful installation, it shows the error ModuleNotFoundError: No module named ...
Turkic asked 20/6, 2018 at 10:27

2

Solved

I am trying to use the LDA Mallet model, but I am getting a "No module named 'gensim.models.wrappers'" error. I have gensim installed and 'gensim.models.LdaMulticore' works properly. Jav...
Magalymagan asked 31/3, 2021 at 8:41

1

Solved

I want to find the optimal number of topics for LDA. To do this, I used Gensim as follows: def compute_coherence_values(dictionary, corpus, texts, limit, start=2, step=3): coherence_values = ...
Ute asked 14/4, 2021 at 16:35

4

I tried generating topics using gensim for 300000 records. On trying to visualize the topics, I get a validation error. I can print the topics after model training, but it fails on using pyLDAvis ...
Gilletta asked 27/12, 2017 at 21:10

2

I am trying to learn about Latent Dirichlet Allocation (LDA). I have basic knowledge of machine learning and probability theory and based on this blog post http://goo.gl/ccPvE I was able to develop...
Ligula asked 16/5, 2012 at 18:48

3

Solved

I'm relatively new to the world of Latent Dirichlet Allocation. I am able to generate an LDA model following the Wikipedia tutorial, and I'm able to generate an LDA model with my own documents. My step ...
Perspective asked 26/7, 2017 at 3:54

2

Solved

I am using Mallet LDA with gensim's wrapper. Now I want to get the topic distribution of several unseen documents, store it in a nested list and then print it out. This is my c...
Uxmal asked 12/2, 2020 at 10:17

7

Solved

I am using the Gensim HDP module on a set of documents. >>> hdp = models.HdpModel(corpusB, id2word=dictionaryB) >>> topics = hdp.print_topics(topics=-1, topn=20) >>> le...
Lenssen asked 21/7, 2015 at 15:34

2

I am trying to obtain the optimal number of topics for an LDA-model within Gensim. One method I found is to calculate the log likelihood for each model and compare each against each other, e.g. at ...
Pianola asked 31/8, 2015 at 13:58

1

First, apologies for being long-winded. I'm not a mathematician, so I'm hoping there's a "dumbed down" solution to this. In short, I'm attempting to compare two bodies of text to generate...
Padova asked 7/7, 2020 at 1:30

3

I need to know whether a coherence score of 0.4 is good or bad. I use LDA as the topic modelling algorithm. What is the average coherence score in this context?
Fudge asked 19/2, 2019 at 9:23
