ValueError: continuous format is not supported

I have written a simple function where I am using the average_precision_score from scikit-learn to compute average precision.

My Code:

import numpy as np
from sklearn.metrics import average_precision_score

def compute_average_precision(predictions, gold):
    gold_predictions = np.zeros(predictions.size, dtype=np.int)
    for idx in range(gold):
        gold_predictions[idx] = 1
    return average_precision_score(predictions, gold_predictions)

When the function is executed, it produces the following error.

Traceback (most recent call last):
  File "test.py", line 91, in <module>
    total_avg_precision += compute_average_precision(np.asarray(probs), len(gold_candidates))
  File "test.py", line 29, in compute_average_precision
    return average_precision_score(predictions, gold_predictions)
  File "/if5/wua4nw/anaconda3/lib/python3.5/site-packages/sklearn/metrics/ranking.py", line 184, in average_precision_score
    average, sample_weight=sample_weight)
  File "/if5/wua4nw/anaconda3/lib/python3.5/site-packages/sklearn/metrics/base.py", line 81, in _average_binary_score
    raise ValueError("{0} format is not supported".format(y_type))
ValueError: continuous format is not supported

If I print the two numpy arrays predictions and gold_predictions for one example, they look alright. [One example is provided below.]

[ 0.40865014  0.26047812  0.07588802  0.26604077  0.10586583  0.17118802
  0.26797949  0.34618672  0.33659923  0.22075308  0.42288553  0.24908153
  0.26506338  0.28224747  0.32942101  0.19986877  0.39831917  0.23635269
  0.34715138  0.39831917  0.23635269  0.35822859  0.12110706]
[1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

What am I doing wrong here? What does the error mean?

Mawson answered 10/6, 2017 at 0:4 Comment(1)
What do these predictions represent? Are they the outputs of the predict() method of some estimator, probabilities of the positive class, or perhaps the output of predict_proba()? In any case, y_true (your gold_predictions) needs to be the first argument and predictions the second. – Guzel

Just taking a look at the sklearn docs for average_precision_score:

Parameters:

y_true : array, shape = [n_samples] or [n_samples, n_classes] True binary labels in binary label indicators.

y_score : array, shape = [n_samples] or [n_samples, n_classes] Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by “decision_function” on some classifiers).

So your first argument has to be an array of binary labels, but you are passing a float array of scores as the first argument. I believe you need to reverse the order of the arguments you are passing.
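
A minimal sketch of the corrected call, using the first few values of the two arrays printed in the question (truncated here for brevity):

import numpy as np
from sklearn.metrics import average_precision_score

# truncated versions of the arrays shown in the question
predictions = np.array([0.40865014, 0.26047812, 0.07588802, 0.26604077])   # scores
gold_predictions = np.array([1, 0, 0, 0])                                  # binary labels

average_precision_score(gold_predictions, predictions)     # OK: y_true first, y_score second
# average_precision_score(predictions, gold_predictions)   # ValueError: continuous format is not supported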

Matti answered 10/6, 2017 at 0:22 Comment(0)

Many of the metrics in scikit-learn work only on specific types of target data. Scikit-learn uses the utility function sklearn.utils.multiclass.type_of_target to check the type of the target data. The following are the possible types:

  • continuous, e.g. np.random.rand(100)
  • continuous-multioutput, e.g. np.random.rand(100,2)
  • binary, e.g. np.random.choice([0, 1], size=100)
  • multiclass, e.g. np.random.choice([0, 1, 2], size=100)
  • multiclass-multioutput, e.g. np.random.choice([0, 1, 2], size=(100,2))
  • multilabel-indicator, e.g. np.random.choice([0, 1], size=(100,2))
  • unknown, e.g. np.random.rand(100).astype(object)

The first argument passed to a metric function determines the target type. So, in the OP's example, the internal type check is performed on the predictions variable, as follows.

from sklearn.utils.multiclass import type_of_target
type_of_target(predictions)   # 'continuous'
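
As a quick diagnostic (a small sketch reusing truncated values from the question's arrays), checking both inputs shows which one is a valid y_true:

import numpy as np
from sklearn.utils.multiclass import type_of_target

predictions = np.array([0.40865014, 0.26047812, 0.07588802])    # model scores
gold_predictions = np.array([1, 0, 0])                          # 0/1 labels

type_of_target(predictions)        # 'continuous'  -> rejected when passed as y_true
type_of_target(gold_predictions)   # 'binary'      -> a valid y_true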

The most common way this error occurs is by passing an unsupported target type to a metric that assesses performance on a classification task given scores (for example, by ordering the arguments incorrectly, as in the OP). The following summarizes the target types supported by such metrics.

  • average precision score: binary, multilabel-indicator
  • coverage error: multilabel-indicator
  • dcg score: continuous-multioutput, multiclass-multioutput, multilabel-indicator
  • det curve: binary
  • label ranking average precision score: multilabel-indicator
  • label ranking loss: multilabel-indicator
  • ndcg score: continuous-multioutput, multiclass-multioutput, multilabel-indicator
  • precision recall curve: binary
  • roc auc score: binary, multiclass (multi_class= must be passed), multilabel-indicator
  • roc curve: binary
  • top-k accuracy score: binary, multiclass
There are also metrics that assess performance on a classification task given class predictions. They generally work on binary, multiclass or multilabel-indicator target types. If a "wrong" target type is fed to them, related errors such as ValueError: Unknown label type or ValueError: y should be a 1d array are thrown.

Yet another adjacent error is ValueError: Classification metrics can't handle a mix of target types. This occurs when the type of the predictions does not match the type of the true values, i.e. when type_of_target(y_true) != type_of_target(y_pred). Make sure they are the same.
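
A small illustration with accuracy_score and made-up arrays (any metric that compares labels directly behaves the same way):

import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([0, 1, 1, 0])            # binary labels
y_pred = np.array([0.1, 0.9, 0.8, 0.3])    # continuous scores, not labels

# accuracy_score(y_true, y_pred)           # ValueError: Classification metrics can't handle a mix of binary and continuous targets
accuracy_score(y_true, (y_pred > 0.5).astype(int))   # 1.0 after thresholding the scores into labels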


Yet another way this error occurs is if you create a custom scorer using sklearn.metrics.make_scorer with needs_threshold=True. In that case, only binary or multilabel-indicator target types are accepted, even if the underlying metric passed to the scorer works on another target type. For example, sklearn.metrics.top_k_accuracy_score works on multiclass target types, but if it is made into a scorer via metrics.make_scorer with needs_threshold=True, it no longer works:

import numpy as np
from sklearn import linear_model, metrics, datasets

X, y = datasets.make_classification(n_informative=3, n_classes=3)   # multiclass
lr = linear_model.LogisticRegression()
lr.fit(X, y)

metrics.top_k_accuracy_score(y, lr.decision_function(X))   # <--- OK

scorer = metrics.make_scorer(metrics.top_k_accuracy_score, needs_threshold=True)
scorer(lr, X, y)                                           # <--- ValueError: multiclass format is not supported
Unfetter answered 24/5, 2023 at 20:40 Comment(0)
I
0

I appreciate the two previous detailed answers, but the solution is quite simple: you just need to swap the positions of the two inputs.

That is, average_precision_score(gold_predictions, predictions)


Indo answered 19/12, 2023 at 3:34 Comment(0)
