I’ve read in some literature that random forests can’t overfit. While this sounds great, it seems too good to be true. Is it possible for random forests to overfit?
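Random forests can overfit; I am sure of this. What is usually meant is that adding more trees does not make the forest overfit: the test error levels off as the number of trees grows instead of getting worse. The individual trees can still fit the noise in the training data.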
Try, for example, to estimate the model y = log(x) + ϵ with a random forest. You will get a training error close to zero but a much worse prediction error on new data.
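Here is a minimal sketch of that experiment, in case you want to try it yourself. It assumes scikit-learn and NumPy are installed. The sample size, the range of x, the noise standard deviation (0.5) and the tree counts are my own arbitrary choices for illustration, not anything from the literature.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

def make_data(n):
    # y = log(x) + eps, with eps ~ N(0, 0.5^2); all constants are assumed for the demo
    x = rng.uniform(1.0, 10.0, size=n)
    y = np.log(x) + rng.normal(0.0, 0.5, size=n)
    return x.reshape(-1, 1), y

X_train, y_train = make_data(300)
X_test, y_test = make_data(300)  # fresh draw from the same model

results = {}
for n_trees in (10, 500):
    # Default settings: trees are grown until the leaves are (nearly) pure
    rf = RandomForestRegressor(n_estimators=n_trees, random_state=0)
    rf.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, rf.predict(X_train))
    test_mse = mean_squared_error(y_test, rf.predict(X_test))
    results[n_trees] = (train_mse, test_mse)
    print(f"{n_trees:4d} trees: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

The noise variance in this setup is 0.25, so no model can do better than that on new data on average. A training MSE well below 0.25 means the forest has fitted part of the noise. I would expect the gap between training and test error to remain with 500 trees: more trees stabilise the test error, they do not remove the overfitting. Limiting tree growth (for example with `min_samples_leaf`) is the usual way to narrow the gap.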