I am trying to understand how sklearn's MLPClassifier computes the results of its predict_proba function.
The website simply lists:
Probability estimates
While many others, such as logistic regression, have more detailed descriptions:
Probability estimates.
The returned estimates for all classes are ordered by the label of classes.
For a multi_class problem, if multi_class is set to be “multinomial” the softmax function is used to find the predicted probability of each class. Else use a one-vs-rest approach, i.e calculate the probability of each class assuming it to be positive using the logistic function. and normalize these values across all the classes.
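To check that I am reading the logistic regression description correctly, here is a minimal pure-Python sketch of the two schemes it names. This is not scikit-learn's code; the `scores` values and the helper names `softmax` and `ovr_normalized` are made up for illustration, and I am assuming the per-class decision scores are already available.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def ovr_normalized(scores):
    # Apply the logistic function to each class score independently,
    # then rescale so the values sum to 1 across classes.
    sig = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    total = sum(sig)
    return [p / total for p in sig]

scores = [2.0, 0.5, -1.0]  # hypothetical decision scores for three classes
p_soft = softmax(scores)
p_ovr = ovr_normalized(scores)
print(p_soft)
print(p_ovr)
```

Both outputs sum to 1 and rank the classes the same way, but the actual probabilities differ, which is why I would like to know which kind of computation the MLP uses.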
Other model types, too, have more detail. Take for example a support vector machine classifier:
Compute probabilities of possible outcomes for samples in X.
The model need to have probability information computed at training time: fit with attribute probability set to True.
And there is also this very nice Stack Overflow post which explains it in depth.
Other Examples
Predict class probabilities for X.
The predicted class probabilities of an input sample are computed as the mean predicted class probabilities of the trees in the forest. The class probability of a single tree is the fraction of samples of the same class in a leaf.
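As I understand that description, the computation amounts to something like the sketch below. It is only an illustration under my own assumptions: the `leaves` class counts are invented, and `leaf_proba` and `forest_proba` are hypothetical helpers, not library functions.

```python
def leaf_proba(class_counts):
    # Class probability of a single tree: the fraction of training
    # samples of each class in the leaf the input sample falls into.
    total = sum(class_counts)
    return [c / total for c in class_counts]

def forest_proba(leaves):
    # Forest probability: the mean of the per-tree class probabilities.
    per_tree = [leaf_proba(counts) for counts in leaves]
    n_trees = len(per_tree)
    n_classes = len(per_tree[0])
    return [sum(t[k] for t in per_tree) / n_trees for k in range(n_classes)]

# Hypothetical leaf class counts [class 0, class 1] reached in three trees.
leaves = [[8, 2], [5, 5], [9, 1]]
proba = forest_proba(leaves)
print(proba)
```

This level of detail, where I can reproduce the number by hand, is what I am missing for the MLP.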
I am looking to understand the same thing as the above post, but for the MLPClassifier. How does the MLPClassifier work internally?