Is there an alternative to the now-removed module 'nltk.model.NGramModel'?

I've spent two full days looking for an alternative and haven't found anything relevant. I'm trying to get a probabilistic score for a synthesized sentence (one produced by replacing some words in an original sentence taken from a corpus).

I tried collocations, but the scores I'm getting aren't very helpful. So I tried a language-model approach, only to find that the seemingly helpful 'model' module has been removed from NLTK because of some bugs.

It'd be really great if someone could either point me to another n-gram model implementation in Python or, better yet, suggest some other way to solve the problem of 'scoring' the sentence.

Whitening answered 18/10, 2014 at 18:24 Comment(1)
I manually downloaded version 3.0a1 (the last one containing model) and am using it, though not without pain. The "official recommendation" is to use the latest version of the model branch. However, I have no idea how to use that: all the information online refers to the old model package in version 3.0a1, so I decided to stick with it. I haven't used Python or NLTK much, but my impression had been that both were more mature and had stronger community support.
Witter

According to this open issue on the NLTK repo, NGramModel is currently not in master because of some bugs. The current workaround is to install the code from the model branch. That branch is about 8 months behind master, though, so you might miss out on other features and bug fixes.

pip install https://github.com/nltk/nltk/tarball/model

The relevant code is here in the model branch. You could copy it into your own project if you don't want to depend on the outdated branch. If you really care about using it, you could try to fix the outstanding bugs and submit a pull request.
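If all you need is a relative score for comparing an original sentence with its synthesized variants, a small hand-rolled model may be enough as a stopgap. Below is a minimal sketch of a bigram model with add-one (Laplace) smoothing, using only the standard library. It is not the NLTK implementation; the class name BigramModel, its methods, and the toy corpus are all made up for illustration, and it assumes the sentences are already tokenized.

```python
import math
from collections import Counter


class BigramModel:
    """Add-one smoothed bigram model (hypothetical stand-in, not NLTK's NGramModel)."""

    def __init__(self, sentences):
        self.context_counts = Counter()  # how often each token appears as a left context
        self.bigram_counts = Counter()   # how often each (previous, current) pair appears
        vocab = {"</s>"}                 # tokens that can be predicted, incl. end marker
        for tokens in sentences:
            tokens = list(tokens)
            vocab.update(tokens)
            padded = ["<s>"] + tokens + ["</s>"]
            self.context_counts.update(padded[:-1])
            self.bigram_counts.update(zip(padded, padded[1:]))
        # One extra slot so that unseen words still receive some probability mass.
        self.vocab_size = len(vocab) + 1

    def prob(self, prev, word):
        """P(word | prev) with add-one smoothing."""
        numerator = self.bigram_counts[(prev, word)] + 1
        denominator = self.context_counts[prev] + self.vocab_size
        return numerator / denominator

    def logprob(self, tokens):
        """Natural-log probability of a whole tokenized sentence."""
        padded = ["<s>"] + list(tokens) + ["</s>"]
        return sum(math.log(self.prob(p, w)) for p, w in zip(padded, padded[1:]))

    def score(self, tokens):
        """Average log-probability per predicted token, so lengths are comparable."""
        return self.logprob(tokens) / (len(tokens) + 1)


# Toy corpus, purely illustrative; replace with your own tokenized sentences.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]
model = BigramModel(corpus)

original = ["the", "cat", "sat"]
synthesized = ["the", "dog", "ran"]  # words swapped in from elsewhere in the corpus
scrambled = ["sat", "the", "cat"]    # same words as the original, implausible order

for sentence in (original, synthesized, scrambled):
    print(" ".join(sentence), round(model.score(sentence), 3))
```

Scores are negative, and values closer to zero mean the sentence looks more like the training data. Add-one smoothing is crude, so on a real corpus you would probably want a better smoothing scheme, but the ranking of candidate sentences is often what matters here.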

Evvie answered 18/10, 2014 at 18:32 Comment(0)
