Are there any efficient Python libraries for Dynamic Topic Models, preferably extending Gensim?
Q

3

6

I'm trying to model Twitter stream data with topic models. Gensim is easy to use and impressive in its simplicity. It has a truly online implementation for LSI, but not for LDA. For a changing content stream like Twitter, Dynamic Topic Models are ideal. Is there any way, whether a hack, an existing implementation, or just a strategy, to use Gensim for this purpose?

Are there any other Python implementations, either derived from Gensim (preferably) or independent? I prefer Python because I want to get started as soon as possible, but if there is a better solution that takes some extra work, please mention it.

Thanks.

Quinine answered 18/3, 2014 at 2:52 Comment(0)
C
3

Gensim has a Python wrapper for the original C++ code (http://radimrehurek.com/gensim/models/dtmmodel.html).

Courtund answered 31/12, 2014 at 4:46 Comment(0)
T
3

The DTM wrapper in Gensim is working, but none of the documentation is particularly complete at this time. On the Gensim side, the most useful thing to look at is the DTM example buried in docs/notebooks. This shows you what all of the input variables need to look like. A couple of things to note:

  • the DTM model has been moved into gensim.models.wrappers.dtmmodel
  • initialize_lda=True must be set because of a bug in the DTM code (this will be the default in future -- PR #676)

You'll also need a working compiled version of DTM itself (you pass the path to that executable to the wrapper). You can try a precompiled executable from a GitHub repo, but if that doesn't work, you'll probably need to compile the original code by running the included makefile.
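To make the input shapes concrete, here is a rough sketch of how the pieces might fit together. It assumes a Gensim version that still ships gensim.models.wrappers (to my knowledge the wrappers were removed in Gensim 4.0) and a DTM binary you have compiled yourself. The helper build_time_slices, the function fit_dtm, the sample tweets, and the binary path in the usage note are all made up for illustration; only DtmModel and Dictionary come from Gensim. The key point is that the corpus must be sorted chronologically and time_slices lists how many documents fall into each period.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def build_time_slices(
    dated_docs: Iterable[Tuple[str, List[str]]],
) -> Tuple[List[List[str]], List[int]]:
    """Sort tokenized documents by period and count documents per period.

    `dated_docs` holds (period_label, tokens) pairs, e.g. ("2014-03", [...]).
    Labels are assumed to sort chronologically as plain strings.
    """
    ordered = sorted(dated_docs, key=lambda pair: pair[0])
    counts = Counter(label for label, _ in ordered)
    docs = [tokens for _, tokens in ordered]
    time_slices = [counts[label] for label in sorted(counts)]
    return docs, time_slices


def fit_dtm(dtm_path, docs, time_slices, num_topics=2):
    """Fit DTM through the Gensim wrapper.

    Assumes gensim < 4.0 and that `dtm_path` points at a compiled DTM binary.
    """
    # Imported lazily so the helper above works without Gensim installed.
    from gensim.corpora import Dictionary
    from gensim.models.wrappers.dtmmodel import DtmModel

    dictionary = Dictionary(docs)
    corpus = [dictionary.doc2bow(doc) for doc in docs]
    return DtmModel(
        dtm_path,
        corpus=corpus,
        time_slices=time_slices,
        num_topics=num_topics,
        id2word=dictionary,
        initialize_lda=True,  # works around the DTM bug mentioned above
    )


# Toy data: (month, tokens) pairs in arbitrary order.
tweets = [
    ("2014-04", ["election", "vote", "poll"]),
    ("2014-03", ["storm", "rain", "flood"]),
    ("2014-03", ["rain", "wind", "storm"]),
    ("2014-04", ["vote", "debate", "poll"]),
    ("2014-04", ["flood", "relief", "vote"]),
]
docs, time_slices = build_time_slices(tweets)
print(time_slices)
```

With a binary in place, something like model = fit_dtm("/path/to/dtm-binary", docs, time_slices) followed by model.show_topic(topicid=0, time=1) should then show how a topic's top words look in the second period. Treat the path as a placeholder and check the wrapper's docstrings for your Gensim version before relying on the exact arguments.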

Tini answered 2/5, 2016 at 16:28 Comment(0)
R
2

Having talked with David Blei and John Lafferty about exactly this, the answer right now is no, there aren't.

Sean Gerrish's DTM implementation has a documented memory leak, but it works on collections of manageable size.

Rheometer answered 23/4, 2014 at 17:20 Comment(1)
Thank you for the reply. This implementation is in C++ and is referenced directly from Blei's page; it is the only endorsed, mature implementation I have come across. I am still looking for an implementation in Python. (Quinine)
