I am working on a document that lays out the key differences between using Naive Bayes (generative) and Logistic Regression (discriminative) models for text classification.
During my research, I ran into this definition of the Naive Bayes model: https://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html
"The probability of a document d being in class c is computed as

p(c|d) ∝ p(c) ∏_{1 ≤ k ≤ n_d} p(t_k|c)

where p(t_k|c) is the conditional probability of term t_k occurring in a document of class c ..."
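To make sure I'm reading that formula right, here is a minimal sketch of the scoring rule on made-up numbers (the two classes, the priors, and the term probabilities below are all invented for illustration):

```python
from math import log

# Toy multinomial Naive Bayes parameters (invented for illustration):
# priors[c] is p(c); term_probs[c][t] is p(t|c).
priors = {"spam": 0.4, "ham": 0.6}
term_probs = {
    "spam": {"cheap": 0.30, "pills": 0.25, "meeting": 0.05, "report": 0.40},
    "ham":  {"cheap": 0.05, "pills": 0.05, "meeting": 0.45, "report": 0.45},
}

def nb_score(tokens, c):
    """log p(c) + sum_k log p(t_k|c): an unnormalized log-posterior for class c."""
    return log(priors[c]) + sum(log(term_probs[c][t]) for t in tokens)

doc = ["cheap", "pills", "cheap"]
scores = {c: nb_score(doc, c) for c in priors}
print(max(scores, key=scores.get))  # -> "spam" for this toy document
```

Working in log space just avoids underflow when multiplying many small probabilities; the argmax over classes is unchanged.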
When I got to the part comparing generative and discriminative models, I found this accepted StackOverflow answer to the question "What is the difference between a Generative and Discriminative Algorithm?":
"A generative model learns the joint probability distribution p(x,y) and a discriminative model learns the conditional probability distribution p(y|x) - which you should read as 'the probability of y given x'."
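For reference, the standard identities connecting the two quantities (same x and y as in the quote) are:

```latex
% The joint factorizes in two ways (product rule):
\[
  p(x, y) = p(y \mid x)\, p(x) = p(x \mid y)\, p(y)
\]
% so a model that learns p(x \mid y) and p(y) can recover the
% conditional via Bayes' rule:
\[
  p(y \mid x) = \frac{p(x \mid y)\, p(y)}{\sum_{y'} p(x \mid y')\, p(y')}
\]
```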
At this point I got confused: Naive Bayes is a generative model, yet its formula is written entirely in terms of conditional probabilities, while it is the discriminative models that were described as learning conditional probabilities as opposed to the joint probabilities of the generative models.
Can someone shed some light on this please?
Thank you!
Comments:

I understand that a generative model uses p(x,y) to compute p(y|x), but I'm still not seeing where p(x,y) is being used within the Naive Bayes link I shared; I only see conditional probabilities. – Opulent

Do you see p(x,y) in any formulas for Naive Bayes that you can link me to? – Opulent

Take a look at "2. Preliminaries"; I think the explanation there is quite clear. There's also this page (machinelearningmastery.com/…) where the algorithm is shown step-by-step in Python. Even if you don't know Python, it's close enough to English that I think it might help quite a bit. – Acie

p(x,y) being p(x|y) p(y) clarifies a lot about how the joint was implicitly there. Thanks a lot! – Opulent
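To spell out that last comment in the IR book's notation (taking x = d and y = c), this is the step that made it click for me:

```latex
% Naive Bayes defines the joint over (document, class) through its factors:
\[
  p(d, c) = p(c) \prod_{1 \le k \le n_d} p(t_k \mid c)
\]
% and the classification rule p(c|d) \propto p(c) \prod_k p(t_k|c)
% is exactly this joint, renormalized over the classes:
\[
  p(c \mid d) = \frac{p(d, c)}{\sum_{c'} p(d, c')}
\]
% So the conditional term probabilities p(t_k|c) are the building
% blocks of the joint p(d, c): the joint is there implicitly.
```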