Can a Random Forest be trained to appropriately predict count data?
How would this work? I have quite an extensive range of values, so classification doesn’t really make sense. If I used regression, would I simply truncate the results?
I’m quite lost here. Any ideas?
There is an R package called mobForest that can fit a random forest for count data. It is based on mob() (model-based recursive partitioning) in the party package, and it performs Poisson regression if the family argument is specified as poisson(). The package is no longer in the CRAN repository, but previously released versions can be obtained from the archive.
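To give a feel for the building block, here is a minimal sketch of a single model-based tree fitted with mob() from party, with a Poisson GLM in each node. This is only one tree, not the mobForest ensemble, and the data frame and variable names (df, x, z1, z2, y) are simulated for illustration. It assumes party is installed and that glinearModel (from modeltools) is available once party is loaded.

```r
library(party)  # mob(); glinearModel comes from modeltools, assumed to load with party

set.seed(1)
n <- 500

# Simulated data (hypothetical): the effect of x on the count y depends on z1
df <- data.frame(x = runif(n), z1 = runif(n), z2 = runif(n))
df$y <- rpois(n, lambda = exp(0.5 + ifelse(df$z1 > 0.5, 2, 0) * df$x))

# Left of "|": the Poisson regression fitted in each node.
# Right of "|": candidate variables for partitioning.
fit <- mob(y ~ x | z1 + z2, data = df,
           model = glinearModel, family = poisson())

print(fit)   # tree structure with the node-wise models
coef(fit)    # Poisson coefficients per terminal node (log scale)
```

A forest along these lines would repeat such a fit on bootstrap samples with random subsets of the partitioning variables and average the predictions.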
If you are not restricted to random forests or bagging, a boosting approach is also available for count data: gbm (generalized boosted regression models), which can also fit a Poisson model.
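As a rough sketch of the boosting route, the following fits a Poisson model with gbm on simulated counts. The data and the tuning values (n.trees, interaction.depth, shrinkage) are arbitrary assumptions for illustration, not recommendations. Because the Poisson model works on the log scale, predictions requested on the response scale are positive, so nothing has to be truncated afterwards.

```r
library(gbm)

set.seed(1)
n <- 1000

# Simulated count data (hypothetical): log-mean is linear in x1 and x2
df <- data.frame(x1 = runif(n), x2 = runif(n))
df$y <- rpois(n, lambda = exp(0.3 + 1.5 * df$x1 - 1.0 * df$x2))

# Boosted trees with a Poisson loss; tuning values are illustrative only
fit <- gbm(y ~ x1 + x2, data = df, distribution = "poisson",
           n.trees = 500, interaction.depth = 2, shrinkage = 0.05)

# type = "response" returns predicted mean counts rather than log-means
pred <- predict(fit, newdata = df, n.trees = 500, type = "response")
summary(pred)
```

The predictions are expected counts and will generally not be integers; round them only if a downstream step really needs whole numbers.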