I have a classification problem: I have the pixel values of an 8x8 image and the number the image represents, and my task is to predict the number (the 'Number' attribute) from the pixel values using RandomForestClassifier. The number can be any value from 0 to 9.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
forest_model = RandomForestClassifier(n_estimators=100, random_state=42)
forest_model.fit(train_df[input_var], train_df[target])
test_df['forest_pred'] = forest_model.predict_proba(test_df[input_var])[:,1]
roc_auc_score(test_df['Number'], test_df['forest_pred'], average = 'macro', multi_class="ovr")
Here it throws an AxisError.
Traceback (most recent call last):
  File "dap_hazi_4.py", line 44, in <module>
    roc_auc_score(test_df['Number'], test_df['forest_pred'], average = 'macro', multi_class="ovo")
  File "/home/balint/.local/lib/python3.6/site-packages/sklearn/metrics/_ranking.py", line 383, in roc_auc_score
    multi_class, average, sample_weight)
  File "/home/balint/.local/lib/python3.6/site-packages/sklearn/metrics/_ranking.py", line 440, in _multiclass_roc_auc_score
    if not np.allclose(1, y_score.sum(axis=1)):
  File "/home/balint/.local/lib/python3.6/site-packages/numpy/core/_methods.py", line 38, in _sum
    return umr_sum(a, axis, dtype, out, keepdims, initial, where)
AxisError: axis 1 is out of bounds for array of dimension 1
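The failing frame is `y_score.sum(axis=1)`, which suggests that in the multiclass case `roc_auc_score` expects a 2-D score array with one column per class, while `predict_proba(...)[:,1]` keeps only a single column. Below is a minimal sketch of one likely fix, passing the full probability matrix. It assumes `load_digits` (8x8 digit images, labels 0-9) as a stand-in for the original data; `X_train`, `X_test`, `y_train` and `y_test` are hypothetical names that replace `train_df` and `test_df`.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Assumption: load_digits stands in for the train_df / test_df of the question.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

forest_model = RandomForestClassifier(n_estimators=100, random_state=42)
forest_model.fit(X_train, y_train)

# Full probability matrix: one row per sample, one column per class.
proba = forest_model.predict_proba(X_test)

# What the original code stored: only the column for the second class (1-D).
one_column = proba[:, 1]

# Multiclass AUC computed from the full 2-D matrix instead of one column.
auc = roc_auc_score(y_test, proba, average="macro", multi_class="ovr")
print(proba.shape, one_column.shape, auc)
```

With a DataFrame, this means keeping the probability matrix in its own variable rather than assigning one column of it to `test_df['forest_pred']`.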
If you are using sklearn.model_selection.cross_validate or similar and this error appears, you just need to set needs_proba=True in make_scorer(roc_auc_score, multi_class='ovo', needs_proba=True)
– Viipuri
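A rough sketch of how the scorer from this comment could be used with `cross_validate`. Again `load_digits` is an assumed stand-in for the original data. Newer scikit-learn releases replace `needs_proba` with `response_method="predict_proba"`, so the sketch checks which parameter the installed `make_scorer` accepts.

```python
import inspect

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, roc_auc_score
from sklearn.model_selection import cross_validate

# Assumption: load_digits stands in for the data of the question.
X, y = load_digits(return_X_y=True)

# Build a scorer that feeds predict_proba output to roc_auc_score.
if "response_method" in inspect.signature(make_scorer).parameters:
    # Newer scikit-learn: needs_proba is replaced by response_method.
    scorer = make_scorer(
        roc_auc_score, multi_class="ovo", response_method="predict_proba"
    )
else:
    # Older scikit-learn: the form given in the comment.
    scorer = make_scorer(roc_auc_score, multi_class="ovo", needs_proba=True)

model = RandomForestClassifier(n_estimators=100, random_state=42)
cv_results = cross_validate(model, X, y, cv=3, scoring=scorer)
mean_auc = cv_results["test_score"].mean()
print(cv_results["test_score"], mean_auc)
```

The built-in scoring string `scoring="roc_auc_ovo"` should give the same kind of scorer without calling `make_scorer` at all.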