Binary vectors as y_score argument of roc_curve


Asked 17/2, 2014 at 12:28 Answered 18/2, 2014 at 11:4

The sklearn roc_curve docstring states:

"y_score : array, shape = [n_samples] Target scores, can either be probability estimates of the positive class, confidence values, or binary decisions."

In what situation would it make sense to set y_score to a binary vector ("binary decisions")? Wouldn't that result in a ROC curve with only one point on it, which kind of defeats the point?
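To make the concern concrete, here is a minimal sketch that passes hard 0/1 decisions as y_score. The labels and predictions are made-up toy data, not taken from any real model. As far as I can tell, roc_curve then returns only the two trivial corners plus a single interior operating point:

```python
from sklearn.metrics import auc, roc_curve

# Toy data (assumed for illustration): 3 negatives followed by 3 positives.
y_true = [0, 0, 0, 1, 1, 1]
# Hard 0/1 decisions instead of scores: 1 false positive, 1 false negative.
y_pred = [0, 0, 1, 1, 1, 0]

fpr, tpr, thresholds = roc_curve(y_true, y_pred)

# Only two distinct "scores" exist, so the curve has a single interior
# point between the (0, 0) and (1, 1) corners.
print(list(zip(fpr, tpr)))

# The area under this three-point polyline reduces to (TPR + TNR) / 2,
# i.e. balanced accuracy, rather than a measure of ranking quality.
area = auc(fpr, tpr)
print(area)
```

With these toy values the interior point sits at FPR = 1/3, TPR = 2/3, and the reported area is 2/3.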

Gschu answered 17/2, 2014 at 12:28 Comment(2)

Yes. You shouldn't do that. Maybe open a PR changing the docstring and saying that that is not very advisable. – Amylolysis 18/2, 2014 at 19:2

Done: github.com/scikit-learn/scikit-learn/pull/2874 :) – Gschu 19/2, 2014 at 12:5

If you are using a classifier that does not output probability scores (e.g. svm.SVC without an explicit probability=True) and all you have are its hard predictions, there isn't a way to compute a full ROC curve. As an API designer, you then have two choices: raise an exception and provide the user no useful information, or plot a degenerate curve with one data point. I would argue the latter is more useful.

Reunite answered 18/2, 2014 at 11:4 Comment(1)

We actually had a student who generated those degenerate ROC curves, calculated the AUC and thought everything was all right. I would lean towards raising an exception. – Gschu 19/2, 2014 at 12:5
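For anyone who prefers the "raise an exception" behaviour in their own code, a rough sketch of a guard that could run before calling roc_curve is below. The name require_graded_scores is hypothetical (it is not part of scikit-learn), and the "two or fewer distinct values" rule is only a heuristic I am assuming here; it would also reject a genuine score vector that happens to contain just two values.

```python
def require_graded_scores(y_score):
    """Hypothetical guard, not a scikit-learn function.

    Raises ValueError when y_score looks like hard binary decisions,
    i.e. when it contains two or fewer distinct values.
    """
    distinct = set(y_score)
    if len(distinct) <= 2:
        raise ValueError(
            "y_score has only %d distinct value(s); a ROC curve built from "
            "hard decisions has a single operating point" % len(distinct)
        )
    return y_score


# Graded scores pass through unchanged.
scores = require_graded_scores([0.1, 0.4, 0.35, 0.8])

# Hard decisions are rejected before any curve is computed.
try:
    require_graded_scores([0, 1, 1, 0])
    rejected = False
except ValueError:
    rejected = True
```

Whether to raise or merely warn is the same design trade-off discussed in the answer; this sketch takes the stricter option.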
