What is the difference between XGBoost, ExtraTreesClassifier, and RandomForestClassifier?
I am new to all these methods and am trying to get a simple answer, or perhaps a pointer to a high-level explanation somewhere on the web. My googling only turned up Kaggle sample code.

Are extra-trees and random forest essentially the same? And XGBoost uses boosting when it chooses the features for any particular tree, i.e. it samples the features. But then how do the other two algorithms select their features?

Thanks!

Caylacaylor asked 6/2, 2016 at 3:34

Extra-trees (ET), a.k.a. extremely randomized trees, is quite similar to random forest (RF). Both are bagging methods that aggregate a number of fully grown decision trees. At each split, RF considers only a random subset of the features (e.g. a third of them), but evaluates every possible cut point within those features and picks the best one. ET also restricts itself to a random feature subset, but evaluates only a few randomly drawn cut points and picks the best of those. ET can either bootstrap the samples for each tree or use the full sample; RF must use bootstrapping to work well.
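
As a minimal sketch of this difference in scikit-learn (the dataset, tree counts, and cross-validation setup here are just illustrative assumptions, not anything canonical):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.model_selection import cross_val_score

# Illustrative toy dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# RF: random feature subset per split, best cut point within it, bootstrapped samples.
rf = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                            bootstrap=True, random_state=0)

# ET: random feature subset AND randomly drawn cut points; by default each
# tree sees the full sample rather than a bootstrap.
et = ExtraTreesClassifier(n_estimators=200, max_features="sqrt",
                          bootstrap=False, random_state=0)

for name, model in [("RF", rf), ("ET", et)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(name, scores.mean())
```

Note that scikit-learn's ExtraTreesClassifier defaults to bootstrap=False (full sample per tree), while RandomForestClassifier defaults to bootstrap=True, which mirrors the distinction above.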

XGBoost is an implementation of gradient boosting and can work with decision trees, typically small ones. Each tree is trained to correct the residuals of the previously trained trees. Gradient boosting can be more difficult to tune, but can achieve a lower model bias than RF. For noisy data, bagging is likely to be the more promising approach; for low-noise data with complex structure, boosting is likely to be the more promising approach.
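
A minimal sketch of the boosting side, assuming the xgboost package is installed (again, the dataset and hyperparameter values are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Small trees (shallow max_depth); each new tree is fit to the residual
# errors of the ensemble so far, and learning_rate shrinks its contribution.
model = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```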

Aim answered 7/2, 2016 at 0:07