I am trying to use the scikit-learn module to compute AUC and plot ROC curves for the output of three different classifiers, so that I can compare their performance. I am very new to this topic, and I am struggling to understand how the data I have should be passed to the roc_curve and auc functions.
For each item within the testing set, I have the true value and the output of each of the three classifiers. The classes are ['N', 'L', 'W', 'T']. In addition, I have a confidence score for each value output from the classifiers. How do I pass this information to the roc_curve function?
Do I need to label_binarize my input data? How do I convert a list of [class, confidence] pairs output by the classifiers into the y_score expected by roc_curve?
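To make the question concrete, here is a minimal sketch of one way the conversion might look. It is not a confirmed answer. It assumes each classifier reports only its top class, so that class's column receives the confidence and every other column receives 0, and it treats the problem as one-vs-rest, with one ROC curve per class. The helper name pairs_to_scores and the sample data are made up for illustration.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

CLASSES = ["N", "L", "W", "T"]


def pairs_to_scores(pairs, classes=CLASSES):
    """Hypothetical helper: [class, confidence] pairs -> (n_samples, n_classes) array.

    Assumption: only the predicted class carries a score; all other
    classes get 0 for that sample.
    """
    index = {c: i for i, c in enumerate(classes)}
    scores = np.zeros((len(pairs), len(classes)))
    for row, (label, confidence) in enumerate(pairs):
        scores[row, index[label]] = confidence
    return scores


# Made-up data for illustration only.
y_true = ["N", "L", "W", "T", "L", "N", "W", "T"]
predicted = [["N", 0.9], ["L", 0.8], ["W", 0.7], ["T", 0.6],
             ["N", 0.4], ["N", 0.7], ["L", 0.5], ["T", 0.9]]

# One indicator column per class, in the order given by CLASSES.
y_true_bin = label_binarize(y_true, classes=CLASSES)
y_score = pairs_to_scores(predicted)

# One-vs-rest: a separate ROC curve and AUC for each class.
roc_auc = {}
for i, name in enumerate(CLASSES):
    fpr, tpr, _ = roc_curve(y_true_bin[:, i], y_score[:, i])
    roc_auc[name] = auc(fpr, tpr)
```

Repeating this for each of the three classifiers would give per-class AUC values to compare, and each fpr/tpr pair could be handed to a plotting library to draw the curve. If the classifiers can expose a score for every class rather than only the top one, using those scores directly would probably be better than filling in zeros.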
Thank you for any help! Good resources about ROC curves would also be helpful.
[class, confidence score] pairs and convert them into an appropriate y_score array? edit: assume that a higher confidence score always indicates the 'I' class; a 0 confidence result will always be an 'N' (None). – Bartko