I am confused about the loss functions used in XGBoost. Here is what confuses me:
- We have `objective`, which is the loss function to be minimized, and `eval_metric`, the metric used to report the learning result. These two are essentially unrelated (apart from restrictions such as `logloss` and `mlogloss` only being usable as `eval_metric` for classification). Is this correct? If so, for a classification problem, how can you use `rmse` as a performance metric?
- Take two options for `objective` as an example: `reg:logistic` and `binary:logistic`. For 0/1 classification, the binary logistic loss, i.e. cross entropy, should be the loss function, right? So which of the two options corresponds to this loss, and what is the meaning of the other one? Say, if `binary:logistic` represents the cross-entropy loss function, then what does `reg:logistic` do?
- What's the difference between `multi:softmax` and `multi:softprob`? Do they use the same loss function and just differ in the output format? If so, the same should hold for `reg:logistic` and `binary:logistic` as well, right?
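To make my third question concrete, here is how I currently understand the softmax/softprob distinction, as a pure-Python sketch (this is my own illustration, not xgboost code; the helper names are mine):

```python
import math

def softprob(scores):
    """Softmax over raw class scores: one probability per class.
    This is the kind of output I'd expect from multi:softprob."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_label(scores):
    """Only the index of the most probable class.
    This is the kind of output I'd expect from multi:softmax."""
    probs = softprob(scores)
    return probs.index(max(probs))

raw = [0.2, 1.5, -0.3]      # made-up raw scores for 3 classes
print(softprob(raw))        # full probability vector, sums to 1
print(softmax_label(raw))   # -> 1, the most likely class index
```

If that picture is right, the two objectives would share the same underlying softmax loss and differ only in what they emit at prediction time.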
Supplement for the 2nd problem:

Say the loss function for a 0/1 classification problem should be

`L = -sum( y_i*log(P_i) + (1 - y_i)*log(1 - P_i) )`

Do I need to choose `binary:logistic` here, or `reg:logistic`, to make the xgboost classifier use this loss `L`? If it is `binary:logistic`, then what loss function does `reg:logistic` use?
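For reference, this is the loss `L` I mean, written out as a small sketch (my own helper name, with `P_i` the predicted probability that example `i` is positive):

```python
import math

def binary_cross_entropy(y, p, eps=1e-12):
    """L = -sum( y_i*log(p_i) + (1-y_i)*log(1-p_i) ) for 0/1 labels y."""
    total = 0.0
    for y_i, p_i in zip(y, p):
        p_i = min(max(p_i, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y_i * math.log(p_i) + (1.0 - y_i) * math.log(1.0 - p_i)
    return -total

# Two examples: a positive predicted at 0.9 and a negative predicted at 0.1
print(binary_cross_entropy([1, 0], [0.9, 0.1]))  # small, since both are confident and correct
```

Confident correct predictions give a loss near 0, and the loss grows as predicted probabilities move toward the wrong label, which is the behavior I'd expect the chosen `objective` to optimize.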