How to extract the splitting rules for the terminal nodes of ctree()


Asked 2/5, 2015 at 7:27 Answered 2/5, 2015 at 8:17

I have a data set with 6 categorical variables whose number of levels ranges from 5 to 28. I fitted a tree with ctree() (party package) and obtained 17 terminal nodes. To get the splitting rules, I followed the answer by @Galled in "ctree() - How to get the list of splitting conditions for each terminal node?".

However, I get the following error after running the code:

Error in data.frame(ResulTable, Means, Counts) : 
  arguments imply differing number of rows: 17, 2

I have tried adding these extra lines:

ResulTable <- rbind(ResulTable, cbind(Node = Node, Path = Path2))

ResulTable$Node <- rownames(ResulTable)

melt(ResulTable)

but no success so far. Any pointers on where it is going wrong?

Zielinski answered 2/5, 2015 at 7:27 Comment(0)

I would recommend using the new partykit implementation of ctree() rather than the old party package. You can then use the function .list.rules.party(). It is not officially exported yet, but it can be accessed via ::: to extract the desired information.

library("partykit")
airq <- subset(airquality, !is.na(Ozone))
ct <- ctree(Ozone ~ ., data = airq)
partykit:::.list.rules.party(ct)
##                                      3                                      5 
##             "Temp <= 82 & Wind <= 6.9" "Temp <= 82 & Wind > 6.9 & Temp <= 77" 
##                                      6                                      8 
##  "Temp <= 82 & Wind > 6.9 & Temp > 77"             "Temp > 82 & Wind <= 10.3" 
##                                      9 
##              "Temp > 82 & Wind > 10.3"
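Since the question was about combining the rules with per-node means and counts, here is a minimal sketch of how that table could be built from the same airquality tree. The object names (rules, node, result) are only illustrative, and it assumes the names of the vector returned by .list.rules.party() are the terminal node ids, as the output above suggests.

```r
library("partykit")

airq <- subset(airquality, !is.na(Ozone))
ct <- ctree(Ozone ~ ., data = airq)

# Named character vector: names are terminal node ids, values are the rules.
# Note: .list.rules.party() is unexported, so it may change between versions.
rules <- partykit:::.list.rules.party(ct)

# Terminal node id for every row of the training data
node <- predict(ct, type = "node")

# Mean response and number of observations per terminal node
means  <- tapply(airq$Ozone, node, mean)
counts <- table(node)

# Index by node id (as character) so all columns line up row by row
result <- data.frame(
  Node  = names(rules),
  Path  = unname(rules),
  Mean  = as.numeric(means[names(rules)]),
  Count = as.integer(counts[names(rules)]),
  stringsAsFactors = FALSE
)
print(result)
```

Because every column is indexed by the same node ids, the "differing number of rows" problem from data.frame() should not arise here.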

Cand answered 2/5, 2015 at 8:17 Comment(4)

Thank you for your prompt reply. With the above code, I'm getting this error: Error in UseMethod("nodeids") : no applicable method for 'nodeids' applied to an object of class "c('BinaryTree', 'BinaryTreePartition')" – Zielinski 2/5, 2015 at 8:36

Then you have fitted your tree with party::ctree, not with partykit::ctree. Make sure that you do not load both packages simultaneously. This is bound to lead to confusion... – Cand 2/5, 2015 at 8:42
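One way to rule out this kind of mix-up is to call the function with an explicit namespace and check the class of the result. This is only a sketch using the airquality example; the inherits() check assumes that trees fitted by partykit carry the class "party", whereas the error above shows a "BinaryTree" object from the old package.

```r
# Call ctree() from partykit explicitly, regardless of what is attached
airq <- subset(airquality, !is.na(Ozone))
ct <- partykit::ctree(Ozone ~ ., data = airq)

# A partykit tree should report "party" among its classes
print(class(ct))
is_partykit_tree <- inherits(ct, "party")
print(is_partykit_tree)
```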

Running ctree with partykit package (with the default control parameters) is taking an indefinite time as compared to running ctree with party package which was much faster. I have a dataset with 100K rows and 6 columns. I'm running R version 3.1.3 on a 32-bit 64 GB machine. Any inputs on this? – Zielinski 4/5, 2015 at 5:21

The old party implementation could run into numerical problems when comparing p-values from datasets with hundreds of thousands of observations. The new partykit implementation uses log-p-values instead, which is numerically more stable. For your data this appears to lead to differences in the splitting, with partykit continuing to split for longer. I would recommend not relying on the default values alone but restricting mincriterion, minbucket, or maxdepth to values that are better suited to your data. – Cand 4/5, 2015 at 17:3
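As a rough illustration of restricting these parameters, the control settings can be passed through ctree_control(). The specific values below are arbitrary placeholders for the small airquality example, not recommendations for a 100K-row data set; suitable values would have to be chosen for the data at hand.

```r
library("partykit")

airq <- subset(airquality, !is.na(Ozone))

# Illustrative values only: stricter significance threshold,
# larger terminal nodes, and a cap on the tree depth
ctrl <- ctree_control(
  mincriterion = 0.99,  # 1 - p-value that must be exceeded to split
  minbucket    = 20,    # minimum number of observations in a terminal node
  maxdepth     = 3      # maximum depth of the tree
)

ct_small <- ctree(Ozone ~ ., data = airq, control = ctrl)

# Inspect the size of the resulting tree
print(depth(ct_small))
print(width(ct_small))
```

A shallower tree with larger terminal nodes should also need fewer split searches, which may help with the long run time on large data.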
