I have a dataset with target LULUS, and it is imbalanced. I'm trying to print the ROC AUC score for each fold of my data, but every fold raises ValueError: y should be a 1d array, got an array of shape (15, 2) instead.
I'm confused about which part I did wrong, because I did it exactly as in the documentation. I understand that in some folds the score can't be printed when only one label is present, but then I get the second kind of error, the one about the 1d array.
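import pandas as pd
from sklearn import tree
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

merged_df = pd.read_csv(r'C:\...\merged.csv')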
num_columns = merged_df.select_dtypes(include=['float64']).columns
cat_columns = merged_df.select_dtypes(include=['object']).drop(['TARGET','NAMA'], axis=1).columns
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='mean')),
    ('scaler', StandardScaler())])
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('label', OneHotEncoder(handle_unknown='ignore'))])
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, num_columns),
        ('cat', categorical_transformer, cat_columns)])
X = merged_df.drop(['TARGET', 'Unnamed: 0'], axis=1)
y = merged_df['TARGET']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
X_train = X_train.drop(['NIM', 'NAMA'], axis=1)
X_test = X_test.drop(['NIM', 'NAMA'], axis=1)
rf = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('classifier', tree.DecisionTreeClassifier(class_weight='balanced', criterion='entropy'))])
rf.fit(X_train, y_train)
pred = rf.predict(X_test)
y_proba = rf.predict_proba(X_test)
from sklearn.model_selection import KFold
kf = KFold(n_splits=10)
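for train, test in kf.split(X):
    # train/test are integer positions; .loc works here only because X has the default RangeIndex
    X_train, X_test = X.loc[train], X.loc[test]
    y_train, y_test = y.loc[train], y.loc[test]
    model = rf.fit(X_train, y_train)
    y_proba = model.predict_proba(X_test)  # one column per class
    try:
        print(roc_auc_score(y_test, y_proba, average='weighted', multi_class='ovr'))
    except ValueError:
        # the error is silently swallowed, so nothing is printed for a failing fold
        pass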
See my data in spreadsheet
roc_auc_score(y_test, y_proba)
but it still works. Why is that? Sorry, I'm not familiar with the ROC AUC score. – Antwanantwerp
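For reference, here is a minimal sketch of the binary case. It runs on synthetic data rather than the spreadsheet above, so X_demo, y_demo and the fold settings are illustrative assumptions, not the original setup. It assumes the target has exactly two classes: predict_proba then returns an array with two columns, while roc_auc_score expects a 1-D array of scores for a binary target, so only the column for the class in classes_[1] is passed. StratifiedKFold is swapped in for KFold so that each test fold is likely to contain both labels.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced binary data standing in for the real dataset (assumption).
X_demo, y_demo = make_classification(
    n_samples=150, n_features=8, weights=[0.85, 0.15], random_state=0
)

# Stratified splitting preserves the class ratio in every fold, so both
# labels appear in each test fold as long as each class has >= n_splits samples.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = DecisionTreeClassifier(
    class_weight="balanced", criterion="entropy", random_state=0
)

fold_scores = []
for train_idx, test_idx in skf.split(X_demo, y_demo):
    clf.fit(X_demo[train_idx], y_demo[train_idx])
    proba = clf.predict_proba(X_demo[test_idx])  # shape (n_test, 2)
    # Binary target: pass only the probability of the class in clf.classes_[1].
    fold_scores.append(roc_auc_score(y_demo[test_idx], proba[:, 1]))

print(fold_scores)
```

With a pandas DataFrame instead of NumPy arrays, the same idea would use .iloc with the fold indices, since the splitter yields integer positions rather than index labels.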