How can I calculate the R-squared (r2) statistic in R for the output of the `loess` and/or `predict` functions?

For example, for this data: `cars.lo <- loess(dist ~ speed, cars); cars.lp <- predict(cars.lo, data.frame(speed = seq(5, 30, 1)), se = TRUE)`

`cars.lp` has two arrays: `fit` for the model and `se.fit` for the standard error.

**Answer**

My first thought was to compute a pseudo R2 measure as follows:

```
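# Total sum of squares of the response around its mean
ss.dist <- sum(scale(cars$dist, scale=FALSE)^2)
# Residual sum of squares from the loess fit
ss.resid <- sum(resid(cars.lo)^2)
# Pseudo R2: one minus the ratio of residual to total variation
1-ss.resid/ss.dist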
```

Here, we get a value of 0.6814984 (≈ `cor(cars$dist, predict(cars.lo))^2`), close to what would be obtained from a GAM:

```
library(mgcv)
# s(speed) requests a smooth term; without s() the model reduces to a linear fit
summary(gam(dist ~ s(speed), data=cars))
```
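The approximate equality with the squared correlation mentioned above can be checked directly. The following is a minimal sketch; `pseudo_r2` is a hypothetical helper name introduced here for illustration, not part of base R or of the `loess` API. Unlike ordinary least squares, a loess fit does not guarantee that the two quantities coincide exactly, so they are printed side by side rather than asserted to be equal.

```r
# Refit the model from the question so that this block is self-contained
cars.lo <- loess(dist ~ speed, data = cars)

# pseudo_r2() is a hypothetical helper (an assumption, not an existing R function)
pseudo_r2 <- function(observed, fitted) {
  ss_total <- sum((observed - mean(observed))^2)  # variation around the mean
  ss_resid <- sum((observed - fitted)^2)          # variation left after the fit
  1 - ss_resid / ss_total
}

# Sum-of-squares version and squared-correlation version of the measure
r2_ss  <- pseudo_r2(cars$dist, fitted(cars.lo))
r2_cor <- cor(cars$dist, fitted(cars.lo))^2

print(c(sum_of_squares = r2_ss, squared_correlation = r2_cor))
```

Both versions use the fitted values at the observed speeds. Predictions on a new grid, such as the `cars.lp$fit` values from the question, cannot be compared with `cars$dist` in this way because they do not line up with the observations.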

This also seems to agree with what the S `loess` function would return as `Multiple R-squared` (I don’t have S, so I can’t check this myself). For example, using the `airquality` R dataset, which resembles the `air` data Chambers and Hastie used in the ‘white book’ (the one referenced in the online help for `loess`, although it is not exactly the same dataset), I got an R2 of 0.8101377 using the above formula. That agrees fairly well with what Chambers and Hastie reported.
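The same formula can be applied to `airquality` along the following lines. This is only a sketch: the answer does not say which model was fitted, so the choice of predictors and the cube-root transformation of `Ozone` are assumptions made here for illustration, and the result need not match the value quoted above.

```r
# Keep complete cases only, so the response lines up with the loess residuals
aq <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

# Assumption: cube-root response; the original model specification is not given
aq$ozone_cr <- aq$Ozone^(1/3)

# Assumption: three predictors with the default span and degree
aq.lo <- loess(ozone_cr ~ Solar.R + Wind + Temp, data = aq)

# Same pseudo R2 formula as for the cars data
ss.total <- sum(scale(aq$ozone_cr, scale = FALSE)^2)
ss.resid <- sum(resid(aq.lo)^2)
r2_air <- 1 - ss.resid / ss.total
print(r2_air)
```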

I should note that I didn’t find any paper dealing specifically with this (admittedly, that was just a quick Google search), and William Cleveland doesn’t discuss any R2-like measure in his paper.

However, I wonder whether the freedom you have in choosing the degree of smoothing (the window `span`) precludes any use of an R2-based measure.
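To see how much the measure depends on that choice, one can recompute it over a few values of `span`. The sketch below uses three arbitrary span values picked for illustration. A smaller span lets the curve follow the data more closely, so one would generally expect the pseudo R2 to rise as the span shrinks, which is the reason for caution when treating it as a measure of fit quality.

```r
# Arbitrary span values chosen for illustration (0.75 is the loess default)
spans <- c(0.5, 0.75, 1)

# Total sum of squares does not depend on the span
ss.dist <- sum((cars$dist - mean(cars$dist))^2)

# Refit with each span and recompute the pseudo R2
r2_by_span <- sapply(spans, function(s) {
  fit <- loess(dist ~ speed, data = cars, span = s)
  1 - sum(resid(fit)^2) / ss.dist
})

print(data.frame(span = spans, pseudo_r2 = r2_by_span))
```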

**Attribution**

*Source: Link, Question Author: Yuriy Petrovskiy, Answer Author: chl*