# How to get an R-squared for a loess fit?

How can I calculate the R-squared ($R^2$) statistic in R for the output of loess and/or predict?
For example, for this data:

cars.lo <- loess(dist ~ speed, cars)
cars.lp <- predict(cars.lo, data.frame(speed = seq(5, 30, 1)), se = TRUE)


cars.lp contains two arrays: fit, the predicted values from the model, and se.fit, their standard errors.
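To make the structure of that object concrete, here is a small sketch that inspects it. One point worth noting: the prediction grid `seq(5, 30, 1)` is not the observed data, so an $R^2$-type measure would need predictions at the observed speeds instead. The name `fit.obs` is introduced here only for illustration.

```r
cars.lo <- loess(dist ~ speed, cars)
cars.lp <- predict(cars.lo, data.frame(speed = seq(5, 30, 1)), se = TRUE)

# Components of the returned list (fit and se.fit are among them)
names(cars.lp)

# The grid has 26 points, while the data have 50 observations
length(cars.lp$fit)
nrow(cars)

# With the default surface = "interpolate", loess does not extrapolate,
# so grid points beyond the observed speed range come back as NA
range(cars$speed)
anyNA(cars.lp$fit)

# Predictions at the observed speeds, which is what a residual-based
# measure such as R^2 needs
fit.obs <- predict(cars.lo)
length(fit.obs)
```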

My first thought was to compute a pseudo $R^2$ measure as follows:

ss.dist <- sum(scale(cars$dist, scale=FALSE)^2)
ss.resid <- sum(resid(cars.lo)^2)
1-ss.resid/ss.dist

Here, we get a value of 0.6814984 ($\approx$ `cor(cars$dist, predict(cars.lo))^2`), close to what would be obtained from a GAM:

library(mgcv)
summary(gam(dist ~ s(speed), data=cars))
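As a rough sketch of that comparison, the pseudo-$R^2$ can be wrapped in a small helper and applied to both fits. The helper `pseudo.r2` and the variable names are hypothetical, and the smooth term `s(speed)` is an assumption about the intended GAM. Note that `summary()` on a `gam` fit reports an adjusted $R^2$ (component `r.sq`), so it is not computed in exactly the same way as the unadjusted value.

```r
library(mgcv)

# Pseudo R^2: 1 - residual sum of squares / total sum of squares
pseudo.r2 <- function(y, fitted) {
  1 - sum((y - fitted)^2) / sum((y - mean(y))^2)
}

cars.lo  <- loess(dist ~ speed, cars)
cars.gam <- gam(dist ~ s(speed), data = cars)  # assumed smooth term

r2.loess   <- pseudo.r2(cars$dist, fitted(cars.lo))
r2.cor     <- cor(cars$dist, fitted(cars.lo))^2   # squared correlation, for comparison
r2.gam     <- pseudo.r2(cars$dist, fitted(cars.gam))
r2.gam.adj <- summary(cars.gam)$r.sq              # adjusted R^2 reported by mgcv

round(c(loess = r2.loess, loess.cor = r2.cor,
        gam = r2.gam, gam.adj = r2.gam.adj), 4)
```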


The pseudo-$R^2$ value also seems to agree with what the S loess function would report as Multiple R-squared (I don’t have S, so I can’t check this myself). For example, using the airquality dataset in R, which resembles, but is not identical to, the air data Chambers and Hastie used in the ‘white book’ (the one referenced in the online help for loess), I got an $R^2$ of 0.8101377 using the above formula. That agrees fairly well with what Chambers and Hastie reported.

I should note that I didn’t find any paper dealing specifically with this (admittedly, after only a quick search), and William Cleveland does not discuss any $R^2$-like measure in his paper.

However, I wonder whether the freedom to choose the degree of smoothing (the window span) precludes any meaningful use of an $R^2$-based measure, since a smaller span generally gives a more flexible fit and hence a larger in-sample value.
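A quick way to see this concern is to recompute the pseudo-$R^2$ for a few spans. This is only a sketch: the helper `span.r2` and the chosen span values are arbitrary illustrations, and the measure is in-sample, so it says nothing about how well each fit would predict new data.

```r
# In-sample pseudo R^2 of a loess fit to the cars data for a given span
span.r2 <- function(span) {
  fit <- loess(dist ~ speed, data = cars, span = span)
  1 - sum(resid(fit)^2) / sum((cars$dist - mean(cars$dist))^2)
}

spans <- c(0.5, 0.75, 1)   # 0.75 is the loess default
r2.by.span <- sapply(spans, span.r2)

data.frame(span = spans, pseudo.r2 = round(r2.by.span, 4))
```

If the values change noticeably with the span, the pseudo-$R^2$ is better read as a description of one particular fit than as a criterion for choosing the amount of smoothing.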