num_leaves selection in LightGBM?
S

1

7

Is there a rule of thumb for initializing the num_leaves parameter in LightGBM? For example, for a dataset with 1000 features, we know that a tree depth of 10 can cover the entire dataset, so we can choose the depth accordingly, and the search space for tuning is limited as well.

But in LightGBM, how can we roughly guess this parameter? Otherwise its search space will be pretty large when using grid search.

Any intuition on selecting this parameter would be helpful.

Step answered 8/3, 2019 at 10:18 Comment(0)
I
9

The best recommendation that I have come across is this awesome summary by Laurae on the LightGBM GitHub. As always, this very much depends on your data.

My personal rule of thumb, based on limited Kaggle experience, is to start by trying values in the range [10, 100]. But if you have a solid heuristic for choosing tree depth, you can always use it and set num_leaves to 2^tree_depth - 1.
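As a rough sketch of that rule of thumb (not something taken from the LightGBM docs): the helper names `perfect_tree_counts` and `num_leaves_candidates` and the chosen fractions are made up for illustration, and the counting assumes that "depth" means the number of split levels, so a single split is depth 1. Under that assumption a perfect binary tree has 2^depth terminal nodes and 2^depth - 1 split nodes, and 2^depth - 1 can serve as an upper end for a small num_leaves grid.

```python
def perfect_tree_counts(depth):
    """Counts for a perfect binary tree with `depth` levels of splits.

    Assumption: depth counts split levels, so depth=1 is one split.
    """
    leaves = 2 ** depth        # terminal nodes
    splits = 2 ** depth - 1    # non-terminal (split) nodes
    return leaves, splits


def num_leaves_candidates(max_depth, fractions=(0.25, 0.5, 0.75, 1.0)):
    """Hypothetical helper: a short num_leaves grid capped at 2^max_depth - 1."""
    cap = 2 ** max_depth - 1
    # A tree needs at least 2 leaves; the set removes duplicate values.
    return sorted({max(2, int(cap * f)) for f in fractions})


max_depth = 7
grid = num_leaves_candidates(max_depth)

# One parameter dict per candidate; "max_depth" and "num_leaves" are the
# LightGBM parameter names, everything else here is illustrative.
param_grid = [{"max_depth": max_depth, "num_leaves": n} for n in grid]

print(perfect_tree_counts(max_depth))
print(grid)
```

Each dict in `param_grid` could then be tried in a grid search, which keeps the num_leaves search space to a handful of values per depth.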

Istic answered 9/3, 2019 at 11:30 Comment(7)
Why -1? Why not just 2^tree_depth? - Valor
A tree of depth one has 1 leaf/node, depth two has (1+2) leaves, depth three has (1+2+4). The rest you get by induction. - Istic
I think leaf should mean terminal node, so a tree of depth n could have 2^n leaves/terminal nodes and 2^n - 1 non-terminal nodes. lightgbm.readthedocs.io/en/latest/Parameters-Tuning.html confirms my understanding. But I'm not sure of the intuition behind setting num_leaves to 2^n - 1. - Valor
I fail to see where the page confirms the assumption that leaves are terminal nodes. The number of leaves in layer n of a tree is 2^(n-1), but this does not relate to num_leaves. - Istic
For "the assumption that leaves are terminal nodes", I'm looking at the line "Theoretically, we can set num_leaves = 2^(max_depth) to obtain the same number of leaves as depth-wise tree." I think a tree of depth one has one split (non-terminal node) and two leaves (terminal nodes). Saying that a tree of depth one has 1 leaf/node is NOT correct. - Valor
Ah, I think they have not been explicit in that statement. You can see at the end of that paragraph that they actually quote the value 2**7-1 for the example that they give. Throughout the docs you will see that the maximum value is odd and not even. One other example is the parameter docs by Laurae: sites.google.com/view/lauraepp/parameters -> Maximum leaves: "On LightGBM, the maximum leaves must be tuned with the maximum depth together. To get xgboost behavior, set the maximum leaves to 2^depth - 1." You can verify your hypothesis by building a tree of depth 1 and plotting it. - Istic
31 is random; see this issue. - Valor
