Is it common to use a different regression technique (for instance, linear regression) at the leaves of a regression tree? I’ve been searching for the past hour, but all I find are implementations that fit a constant value at the trees’ leaves. Is there a reason why this is or is not common?
Answer
There has been quite a lot of research on this topic over the last decades, starting with the pioneering efforts of Ciampi, followed by Loh’s GUIDE, and then also Gama’s functional trees and the model-based recursive partitioning approach by us. A nice overview is given in @Momo’s answer to this question: Advantage of GLMs in terminal nodes of a regression tree?
Corresponding software is less widely used than simple constant-fit trees, as you observe. Part of the reason is presumably that it is more difficult to write, but also more difficult to use: it simply requires more specifications than a simple CART model. But software is available (as previously pointed out here by @marqram or @Momo at: Regression tree algorithm with linear regression models in each leaf). Prominent software packages include:

In the Weka suite there are M5P (M5′) for continuous responses, LMT (logistic model trees) for binary responses, and FT (functional trees) for categorical responses. See http://www.cs.waikato.ac.nz/~ml/weka/ for more details. The former two functions are also easily interfaced through the R package RWeka.
Loh’s GUIDE implementation is available in binary form at no cost (but without source code) from http://www.stat.wisc.edu/~loh/guide.html. It lets you modify the details of the method through a wide range of control options.

Our MOB (MOdel-Based recursive partitioning) algorithm is available in the R package partykit (successor to the party implementation). The mob() function gives you a general framework, allowing you to specify new models that can be easily fitted in the nodes/leaves of the tree. Convenience interfaces lmtree() and glmtree() that combine mob() with lm() and glm() are directly available and illustrated in vignette("mob", package = "partykit"). But other plugins can also be defined. For example, in https://stackoverflow.com/questions/37037445/usingmobtreespartykitpackagewithnlsmodel mob() is combined with nls(). But there are also “mobsters” for various psychometric models (in psychotree) and for beta regression (in betareg).
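To make the idea of a tree with a pluggable leaf model concrete, here is a minimal Python sketch using only the standard library. It is not the API or the algorithm of any of the packages above: all names (fit_mean, fit_line, grow, predict, sse) and the toy data are made up for illustration, and the split is chosen by a plain exhaustive search over the summed squared error, whereas MOB selects splits via parameter instability tests. The sketch keeps the regressor x (used inside the leaf model) separate from the partitioning variables z1 and z2 (used only for splitting), and swaps the leaf model by passing a different fit function.

```python
from statistics import mean

def fit_mean(rows):
    """Constant leaf (CART-style): returns (model, training SSE)."""
    m = mean([r["y"] for r in rows])
    return (lambda r: m), sum((r["y"] - m) ** 2 for r in rows)

def fit_line(rows):
    """Linear leaf: least-squares line y = a + b*x; returns (model, training SSE)."""
    xs = [r["x"] for r in rows]
    ys = [r["y"] for r in rows]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    a = my - b * mx
    model = lambda r: a + b * r["x"]
    return model, sum((r["y"] - model(r)) ** 2 for r in rows)

def grow(rows, fit, split_vars, min_leaf=4, depth=2):
    """Recursively partition on split_vars, fitting the plugged-in `fit` in each node."""
    model, loss = fit(rows)
    best = None
    if depth > 0:
        for var in split_vars:
            values = sorted({r[var] for r in rows})
            for lo, hi in zip(values, values[1:]):
                cut = (lo + hi) / 2  # midpoint between neighbouring observed values
                left = [r for r in rows if r[var] <= cut]
                right = [r for r in rows if r[var] > cut]
                if min(len(left), len(right)) < min_leaf:
                    continue
                total = fit(left)[1] + fit(right)[1]
                # Accept a split only if it improves on the unsplit node.
                if total < loss - 1e-12 and (best is None or total < best[0]):
                    best = (total, var, cut, left, right)
    if best is None:
        return {"leaf": model, "n": len(rows)}
    _, var, cut, left, right = best
    return {"var": var, "cut": cut,
            "left": grow(left, fit, split_vars, min_leaf, depth - 1),
            "right": grow(right, fit, split_vars, min_leaf, depth - 1)}

def predict(tree, row):
    """Walk down to a leaf and apply that leaf's model."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["var"]] <= tree["cut"] else tree["right"]
    return tree["leaf"](row)

def sse(tree, rows):
    return sum((r["y"] - predict(tree, r)) ** 2 for r in rows)

# Toy data (assumption): the x-y relationship changes with z1; z2 is irrelevant.
rows = [{"x": x, "z1": z1, "z2": (7 * x + 3 * z1) % 5,
         "y": 1 + 2 * x if z1 < 3 else 10 - x}
        for z1 in range(6) for x in range(5)]

linear_tree = grow(rows, fit_line, ["z1", "z2"])
constant_tree = grow(rows, fit_mean, ["z1", "z2"])
print(linear_tree["var"], linear_tree["cut"])
print(sse(linear_tree, rows), sse(constant_tree, rows))
```

On this toy data a single split on z1 lets the linear leaves fit each regime essentially exactly, while the constant-leaf tree of the same depth cannot capture the dependence on x. The price is the extra specification mentioned above: you have to decide which variables enter the leaf model and which are used for partitioning.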
Attribution
Source: Link, Question Author: marqram, Answer Author: Community