Binary vectors as y_score argument of roc_curve

The sklearn roc_curve docstring states:

"y_score : array, shape = [n_samples] Target scores, can either be probability estimates of the positive class, confidence values, or binary decisions."

In what situation would it make sense to set y_score to a binary vector ("binary decisions")? Wouldn't that result in a ROC curve with a single point on it, which rather defeats the purpose?

Gschu answered 17/2, 2014 at 12:28 Comment(2)
Yes. You shouldn't do that. Maybe open a PR changing the docstring to say that this is not advisable. – Amylolysis
Done: github.com/scikit-learn/scikit-learn/pull/2874 :) – Gschu

If your classifier gives you only hard class predictions rather than probability scores (e.g. svm.SVC without an explicit probability=True), there is no way to compute a full ROC curve from that output. As an API designer, you then have two choices: raise an exception, which gives the user no useful information, or plot a degenerate curve with a single data point. I would argue the latter is more useful.
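
To make the "degenerate curve" concrete, here is a minimal pure-Python sketch. It is a hand-rolled illustration, not scikit-learn's implementation: the helpers roc_points and trapezoid_auc are hypothetical names, the labels and scores are made up, and labels are assumed to be 0/1 with at least one example of each class.

```python
def roc_points(y_true, y_score):
    """Return (fpr, tpr) pairs: the (0, 0) start plus one point per distinct score."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    # Sweep the threshold from the highest distinct score down to the lowest.
    for thr in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the given points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up data for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.3, 0.6, 0.7, 0.9]
decisions = [1 if s >= 0.5 else 0 for s in scores]  # hard 0/1 predictions

curve_scores = roc_points(y_true, scores)      # one point per distinct score
curve_binary = roc_points(y_true, decisions)   # only two distinct "scores"
auc_binary = trapezoid_auc(curve_binary)

print(len(curve_scores), len(curve_binary))
print(curve_binary, auc_binary)
```

With binary decisions there are only two distinct values to threshold on, so the curve collapses to the two corners plus a single interior operating point, and any AUC computed from it is just the area under two straight segments. If the model exposes continuous confidence values (svm.SVC, for instance, has a decision_function method), passing those as y_score should give a proper curve without needing probability=True.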

Reunite answered 18/2, 2014 at 11:4 Comment(1)
We actually had a student who generated those degenerate ROC curves, calculated the AUC, and thought everything was all right. I would lean towards raising an exception. – Gschu
