The loss function and evaluation metric of XGBoost

Asked 29/11, 2018 at 0:38 Answered 29/11, 2018 at 9:30

python machine-learning xgboost xgbclassifier

I am confused about the loss functions used in XGBoost. Here is what confuses me:

1. We have objective, which is the loss function to be minimized, and eval_metric, the metric used to report the learning result. These two are totally unrelated (setting aside restrictions such as only logloss and mlogloss being usable as eval_metric for classification). Is this correct? If it is, then for a classification problem, how can you use rmse as a performance metric?
2. Take two options for objective as an example: reg:logistic and binary:logistic. For 0/1 classification, binary logistic loss (cross entropy) is usually the loss function, right? So which of the two options corresponds to this loss function, and what is the purpose of the other one? Say binary:logistic represents the cross entropy loss function; then what does reg:logistic do?
3. What's the difference between multi:softmax and multi:softprob? Do they use the same loss function and differ only in the output format? If so, the same should hold for reg:logistic and binary:logistic as well, right?

Supplement for the 2nd question:

Say the loss function for a 0/1 classification problem should be L = -sum(y_i*log(P_i) + (1-y_i)*log(1-P_i)). Do I need to choose binary:logistic or reg:logistic to make the XGBoost classifier use the loss function L? If it is binary:logistic, then what loss function does reg:logistic use?

Earthshaker asked 29/11, 2018 at 0:38 Comment(4)

#48281373 – Hyaluronidase 29/11, 2018 at 0:45

@JoshuaCook, it explains the first question, with Keras. – Earthshaker 29/11, 2018 at 0:54

Yes, but your first question is conceptual in nature and not specific to a library. – Hyaluronidase 29/11, 2018 at 1:12

reg:logistic usually calculates the cost function as (y - y_pred)^2, averaged over the sample dimension. – Hedger 29/11, 2018 at 9:26

'binary:logistic' uses -(y*log(y_pred) + (1-y)*(log(1-y_pred)))

'reg:logistic' uses (y - y_pred)^2

To get an overall error estimate, we sum the per-sample errors and divide by the number of samples.

You can find this in the basics, by comparing linear regression with logistic regression.

Linear regression uses (y - y_pred)^2 as the cost function

Logistic regression uses -(y*log(y_pred) + (1-y)*(log(1-y_pred))) as the cost function
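A minimal NumPy sketch of the two cost functions above, averaged over samples. The function names squared_error and log_loss and the sample values are made up for illustration; this is not XGBoost's internal code.

```python
import numpy as np

def squared_error(y, y_pred):
    """Mean of (y - y_pred)^2 over all samples."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y - y_pred) ** 2))

def log_loss(y, y_pred, eps=1e-15):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    y = np.asarray(y, dtype=float)
    # Clip probabilities so log(0) never occurs.
    p = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

# Illustrative labels and predicted probabilities (assumed values).
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.1, 0.6, 0.4]

print("squared error:", squared_error(y_true, y_prob))
print("log loss:     ", log_loss(y_true, y_prob))
```

Both functions take the same inputs, so the same set of predictions can be scored either way; the two numbers are just on different scales.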

Evaluation metrics are a completely different thing. They are designed to evaluate your model. They can be confusing because it is logical to use an evaluation metric that is the same as the loss function, like MSE in regression problems. However, in binary problems it is not always wise to look at the logloss. My experience has taught me (in classification problems) to generally look at ROC AUC.
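To illustrate why a metric like ROC AUC can tell a different story from logloss, here is a small NumPy sketch. The helper names roc_auc and log_loss and the two score vectors are assumptions made up for this example. AUC is computed as the fraction of (positive, negative) pairs that are ranked correctly, so it depends only on the ordering of the scores, while logloss also depends on how confident the probabilities are.

```python
import numpy as np

def log_loss(y, p, eps=1e-15):
    """Mean binary cross entropy."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

def roc_auc(y, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    y = np.asarray(y)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y == 1]
    neg = scores[y == 0]
    # Pairwise differences between every positive and every negative score.
    diff = pos[:, None] - neg[None, :]
    correct = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(correct / (len(pos) * len(neg)))

y_true = [1, 0, 1, 0]
confident = [0.90, 0.10, 0.60, 0.40]  # same ranking, well-separated scores
timid = [0.55, 0.45, 0.52, 0.48]      # same ranking, scores close to 0.5

for name, scores in [("confident", confident), ("timid", timid)]:
    print(name, "AUC:", roc_auc(y_true, scores), "logloss:", log_loss(y_true, scores))
```

Both score vectors rank every positive above every negative, so their AUC is identical, but their logloss values differ.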

EDIT

According to the XGBoost documentation:

reg:linear: linear regression

reg:logistic: logistic regression

binary:logistic: logistic regression for binary classification, output probability

So I'm guessing (correcting what I wrote above for reg:logistic):

reg:linear is, as we said, (y - y_pred)^2

reg:logistic is -(y*log(y_pred) + (1-y)*(log(1-y_pred))), with predictions rounded at a 0.5 threshold

binary:logistic is plain -(y*log(y_pred) + (1-y)*(log(1-y_pred))) (returns the probability)

You can test it out and see whether it behaves as I've described in the edit. If so, I will update the answer; otherwise, I'll just delete it :<

Hedger answered 29/11, 2018 at 9:30 Comment(5)

Thanks for this reply. It sounds like reg:logistic uses rmse as the loss (cost) function, which is more intuitive for reg:linear. I don't get why rmse would still be used in logistic regression, where y - y_pred equals 1, 0, or -1. – Earthshaker 29/11, 2018 at 15:3

@BsHe Mate, I think I'm mistaken. I will edit; you can check it, and if it is so, I'll fix the answer – Hedger 29/11, 2018 at 15:39

I think we should keep the edit; the package author's answer seems to verify this: github.com/dmlc/xgboost/issues/521#issuecomment-144453618. But I will wait to see if others give more solid answers. – Earthshaker 29/11, 2018 at 16:44

@EranMoshe can you please confirm if the second half of logloss is y-1 ? I think it should be 1-y – First 4/1, 2021 at 12:52

@First Sorry mate. It should be 1-y (because we know y is in [0, 1]). – Hedger 24/1, 2021 at 8:46

1. Yes, a loss function and evaluation metric serve two different purposes. The loss function is used by the model to learn the relationship between input and output. The evaluation metric is used to assess how good the learned relationship is. Here is a link to a discussion of model evaluation: https://scikit-learn.org/stable/modules/model_evaluation.html
2. I'm not sure exactly what you are asking here. Can you clarify this question?
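The distinction in the first point can be sketched with a tiny logistic regression in NumPy: the model is fitted by gradient descent on the log loss, and is then assessed with a separate metric (accuracy). The names fit_logistic and accuracy, the learning rate, and the toy data are all assumptions for illustration, not anything XGBoost-specific.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=500):
    """Learn weights by gradient descent on the mean log loss (the loss function)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        grad = p - y  # derivative of the log loss with respect to the raw score
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def accuracy(y, p, threshold=0.5):
    """Share of samples whose thresholded prediction matches the label (the metric)."""
    return float(np.mean((p >= threshold) == (y == 1)))

# Toy, linearly separable data (assumed values).
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w, b = fit_logistic(X, y)        # training only ever sees the log loss
p = sigmoid(X @ w + b)
print("accuracy:", accuracy(y, p))  # evaluation uses a different quantity
```

The training loop never computes accuracy, and the evaluation step never computes the log loss, which is the sense in which the two serve different purposes.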

Hyaluronidase answered 29/11, 2018 at 0:43 Comment(3)

I added some supplement for the 2nd question, thanks. – Earthshaker 29/11, 2018 at 0:54

Still not following your question. What context are you asking these in? – Hyaluronidase 29/11, 2018 at 1:13

Let me make it simple: what loss function does objective:'binary:logistic' use, and what does objective:'reg:logistic' use? – Earthshaker 29/11, 2018 at 1:25

