I'm trying to use a forest (or tree) augmented Bayes classifier (Original introduction, Learning) in python (preferably python 3, but python 2 would also be acceptable), first learning it (both structure and parameter learning) and then using it for discrete classification, obtaining probabilities for the class and for features with missing data. (This is why plain discrete classification, and even good naive Bayes classifiers, are not very useful for me.)
The way my data comes in, I'd love to use incremental learning from incomplete data, but I haven't even found anything doing both of these in the literature, so anything that does structure and parameter learning and inference at all is a good answer.
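To make the structure-learning part concrete, here is a rough sketch (my own, not taken from any of the packages below) of the usual Chow-Liu-style approach to a TAN skeleton: conditional mutual information between every feature pair given the class, then a maximum spanning tree over the features. The column names, the pandas/scipy usage and the toy data are purely illustrative.

```python
import itertools
import numpy as np
import pandas as pd
from scipy.sparse.csgraph import minimum_spanning_tree

def conditional_mutual_info(df, xi, xj, c):
    """Empirical I(xi; xj | c) from a DataFrame of discrete columns."""
    n = len(df)
    cmi = 0.0
    for (cv, iv, jv), n_ijc in df.groupby([c, xi, xj]).size().items():
        p_ijc = n_ijc / n
        p_c = (df[c] == cv).mean()
        p_ic = ((df[xi] == iv) & (df[c] == cv)).mean()
        p_jc = ((df[xj] == jv) & (df[c] == cv)).mean()
        cmi += p_ijc * np.log(p_c * p_ijc / (p_ic * p_jc))
    return cmi

def tan_skeleton(df, class_col):
    """Undirected feature tree of a TAN: maximum spanning tree over pairwise
    conditional mutual information (orient it later by picking any root)."""
    feats = [col for col in df.columns if col != class_col]
    w = np.zeros((len(feats), len(feats)))
    for a, b in itertools.combinations(range(len(feats)), 2):
        # negate (with a small offset so zero weights are not dropped) to turn
        # scipy's minimum spanning tree into a maximum spanning tree
        w[a, b] = -(conditional_mutual_info(df, feats[a], feats[b], class_col) + 1e-9)
    mst = minimum_spanning_tree(w).toarray()
    return [(feats[a], feats[b]) for a, b in zip(*np.nonzero(mst))]

# toy usage with made-up column names
rng = np.random.default_rng(0)
df = pd.DataFrame({"C": rng.integers(0, 2, 500)})
df["A"] = (df["C"] + rng.integers(0, 2, 500)) % 2
df["B"] = (df["A"] + rng.integers(0, 2, 500)) % 2
df["D"] = rng.integers(0, 3, 500)
print(tan_skeleton(df, "C"))  # prints the undirected tree edges over the features
```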
There seem to be a few very separate and unmaintained python packages that go roughly in this direction, but I haven't seen anything moderately recent (for example, I would expect that using pandas for these calculations would be reasonable, but OpenBayes barely uses numpy), and augmented classifiers seem completely absent from anything I have seen.
So, where should I look to save me some work implementing a forest augmented Bayes classifier? Is there a good implementation of Pearl's message passing algorithm in a python class, or would that be inappropriate for an augmented Bayes classifier anyway? Is there a readable object-oriented implementation for learning and inference of TAN Bayes classifiers in some other language, which could be translated to python?
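For concreteness, the kind of inference I am after (class probabilities under partial evidence) can be written as brute-force marginalization over the unobserved features. This is not Pearl's message passing and will not scale to many missing features, but it shows the interface I would like a library to provide; the dictionary layout for the prior and the CPTs is made up.

```python
import itertools
from collections import defaultdict

def posterior(prior, cpts, parents, domains, evidence):
    """P(class | partial evidence) for a TAN, by summing out missing features.

    prior:    {c: P(C=c)}
    parents:  {feature: parent feature, or None for the tree root}
    cpts:     {feature: {(parent_value, c, value): P(X=value | parent, C=c)}}
              (parent_value is None for the root feature)
    domains:  {feature: list of possible values}
    evidence: {feature: observed value} for the observed subset only
    """
    feats = list(parents)
    missing = [f for f in feats if f not in evidence]
    scores = defaultdict(float)
    for c, p_c in prior.items():
        for combo in itertools.product(*(domains[f] for f in missing)):
            assign = dict(evidence, **dict(zip(missing, combo)))
            p = p_c
            for f in feats:
                pv = None if parents[f] is None else assign[parents[f]]
                p *= cpts[f][(pv, c, assign[f])]
            scores[c] += p
        # scores[c] is now the joint P(C=c, evidence)
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}
```

The same enumeration would also give distributions over the unspecified features themselves (normalize over a missing feature's values instead of the class), which is the other output I need.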
Existing packages I know of, but found inappropriate, are:

- milk, which does support classification, but not with Bayesian classifiers (and I definitely need probabilities for the classification and for unspecified features)
- pebl, which only does structure learning
- scikit-learn, which only learns naive Bayes classifiers
- OpenBayes, which has barely changed since somebody ported it from numarray to numpy, and whose documentation is negligible
- libpgm, which claims to support yet another set of features. According to the main documentation, it does inference, structure learning and parameter learning, except there do not seem to be any methods for exact inference.
- Reverend claims to be a “Bayesian Classifier”, has negligible documentation, and from looking at the source code I am led to the conclusion that it is mostly a spam classifier following Robinson's and similar methods, not a general Bayesian classifier.
- eBay's bayesian Belief Networks allows building generic Bayesian networks and implements inference on them (both exact and approximate), which means that it can be used to build a TAN, but there is no learning algorithm in it, and the way BNs are built from functions means that implementing parameter learning is more difficult than it might be for a hypothetical different implementation (a sketch of the kind of parameter learning I would have to add myself follows this list).
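For reference, here is roughly the parameter-learning step I would end up writing myself on top of pandas once a structure is fixed: Laplace-smoothed CPTs in the same dictionary layout as the inference sketch above. The function and argument names are made up.

```python
import pandas as pd

def fit_tan_parameters(df, class_col, parents, alpha=1.0):
    """Laplace-smoothed prior P(C) and CPTs P(X_i | parent(X_i), C) from data.

    parents maps each feature column to its tree parent (None for the root).
    Returns (prior, cpts) in the layout used by the inference sketch above.
    """
    classes = df[class_col].unique()
    counts = df[class_col].value_counts()
    prior = {c: (counts.get(c, 0) + alpha) / (len(df) + alpha * len(classes))
             for c in classes}
    cpts = {}
    for feat, par in parents.items():
        values = df[feat].unique()
        cpt = {}
        grouped = df.groupby(class_col) if par is None else df.groupby([par, class_col])
        # note: (parent value, class) pairs never seen in the data get no entries here
        for key, sub in grouped:
            pv, c = (None, key) if par is None else key
            vc = sub[feat].value_counts()
            for v in values:
                cpt[(pv, c, v)] = (vc.get(v, 0) + alpha) / (len(sub) + alpha * len(values))
        cpts[feat] = cpt
    return prior, cpts
```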
Comments:

- What about milk? – Situate
- milk does not seem to support Bayesian classifiers at all (or at least I don't see how – if any of milk's classifiers gives me probabilities, please tell me), and is therefore out of scope of this question. – Gull
- I will probably use bayesian and libpgm and add my own stuff on top to get what I want. Unfortunately this is only a minor side project, so that may take some time. – Gull