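From what I understand, the `cp` argument to the `rpart` function helps pre-prune the tree in the same way as the `minsplit` or `minbucket` arguments. What I don’t understand is how the CP values are computed. For example:

```r
library(rpart)

# Note: method = "class" below becomes an extra column of df, not an rpart argument;
# rpart still fits a classification tree because y is a factor.
df <- data.frame(x = c(1, 2, 3, 3, 3, 4),
                 y = as.factor(c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)),
                 method = "class")
mytree <- rpart(y ~ x, data = df, minbucket = 1, minsplit = 1)
```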

Resulting tree…

```
mytree
n= 6

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 6 3 FALSE (0.5000000 0.5000000)
  2) x>=2.5 4 1 FALSE (0.7500000 0.2500000) *
  3) x< 2.5 2 0 TRUE (0.0000000 1.0000000) *
```

Summary…

```
summary(mytree)
Call:
rpart(formula = y ~ x, data = df, minbucket = 1, minsplit = 1)
  n= 6

         CP nsplit rel error    xerror      xstd
1 0.6666667      0 1.0000000 2.0000000 0.0000000
2 0.0100000      1 0.3333333 0.6666667 0.3849002
```

Where are the 0.666 and the 0.01 coming from?

**Answer**

I had been looking into the same question for several days, and what I found is that the CP values are computed by the package itself.

If you do not specify a `cp` value, rpart uses the default of 0.01. That is where the 0.01 in the last row of the table comes from.
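A quick way to check the default threshold, and to see a user-chosen threshold show up in the CP table, is sketched below. It reuses the data from the question; the name `tree_small_cp` and the value 0.001 are illustrative choices of mine, while `rpart.control()` and the `cptable` component are part of rpart. I would expect the last CP entry to match whatever `cp` was passed in.

```r
library(rpart)

# Same data as in the question (without the stray "method" column)
df <- data.frame(
  x = c(1, 2, 3, 3, 3, 4),
  y = as.factor(c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE))
)

# The default complexity threshold stored in the control list
default_cp <- rpart.control()$cp
default_cp

# Refit with an explicit, smaller threshold (0.001 is an arbitrary example value)
tree_small_cp <- rpart(
  y ~ x, data = df, method = "class",
  control = rpart.control(minsplit = 1, minbucket = 1, cp = 0.001)
)

# The CP table is stored on the fitted object; pull out the CP of its last row
tree_small_cp$cptable
last_cp <- tree_small_cp$cptable[nrow(tree_small_cp$cptable), "CP"]
last_cp
```

`printcp(tree_small_cp)` shows the same table in printed form.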

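The CP (complexity parameter) value is the cost of adding another split to the tree: a split is kept only if it lowers the relative error by at least `cp`.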
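As far as I can tell, the 0.6666667 in the first row can be reproduced by hand from the counts in the printed tree: it is the drop in relative error per split added. The sketch below uses only base R; the variable names are my own and the counts are read off the output in the question.

```r
# Counts read off the printed tree in the question
root_loss   <- 3        # misclassified observations at the root (no splits)
leaf_losses <- c(1, 0)  # misclassified observations in the two leaves after one split

# "rel error" is the misclassification count relative to the root's
rel_error_0 <- root_loss / root_loss          # tree with 0 splits
rel_error_1 <- sum(leaf_losses) / root_loss   # tree with 1 split

# CP of the first row: decrease in rel error divided by the number of splits added
cp_first_split <- (rel_error_0 - rel_error_1) / (1 - 0)
cp_first_split
```

The values of `rel_error_0` and `rel_error_1` correspond to the `rel error` column of the summary table, and `cp_first_split` to the first entry of its `CP` column.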

**Attribution**

*Source: Link, Question Author: Ben, Answer Author: Nikhil*