How to get an R-squared for a loess fit?

How can I calculate an R-squared (r2) statistic in R for the output of loess and/or predict?
For example, for this data:

cars.lo <- loess(dist ~ speed, cars)
cars.lp <- predict(cars.lo, data.frame(speed = seq(5, 30, 1)), se = TRUE)

cars.lp contains two arrays: fit, the fitted values from the model, and se.fit, their standard errors.
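To see what predict() returns here, the sketch below inspects cars.lp. One detail that matters for the question: the speeds in cars only run from 4 to 25, and loess does not extrapolate with its default settings, so the grid points above 25 should come back as NA. The name grid is introduced here only for illustration.

```r
cars.lo <- loess(dist ~ speed, cars)
grid <- data.frame(speed = seq(5, 30, 1))   # 'grid' is an illustrative name
cars.lp <- predict(cars.lo, grid, se = TRUE)

names(cars.lp)       # components of the returned list
range(cars$speed)    # observed range of the predictor

# Fitted values and standard errors side by side; rows beyond the
# observed speed range are NA because loess does not extrapolate
# with its default settings.
data.frame(speed = grid$speed, fit = cars.lp$fit, se = cars.lp$se.fit)
```

Because these predictions are made on a grid rather than at the observed speeds, they cannot be compared with cars$dist directly; an R2-type measure needs the fitted values or residuals at the original data points.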

Answer

My first thought was to compute a pseudo R2 measure as follows:

ss.dist <- sum(scale(cars$dist, scale=FALSE)^2)
ss.resid <- sum(resid(cars.lo)^2)
1-ss.resid/ss.dist
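The same calculation can be wrapped in a small function, as a minimal sketch. The helper name pseudo_r2 is hypothetical (it is not part of base R), and the response is rebuilt as fitted values plus residuals so that the total sum of squares uses exactly the observations that entered the fit. The squared correlation between observed and fitted values is computed alongside for comparison.

```r
# Hypothetical helper (not part of base R): pseudo R-squared for a loess fit,
# defined as 1 - SS(residuals) / SS(total).
pseudo_r2 <- function(fit) {
  res <- residuals(fit)
  y   <- fitted(fit) + res   # response values actually used in the fit
  1 - sum(res^2) / sum((y - mean(y))^2)
}

cars.lo <- loess(dist ~ speed, cars)
r2   <- pseudo_r2(cars.lo)
cor2 <- cor(cars$dist, predict(cars.lo))^2   # squared correlation, for comparison
c(pseudo_r2 = r2, squared_cor = cor2)
```

For a smoother like loess the two quantities need not coincide exactly, as they do for ordinary least squares with an intercept, but they should be close.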

Here, we get a value of 0.6814984 (approximately equal to cor(cars$dist, predict(cars.lo))^2), close to what would be obtained from a GAM:

library(mgcv)
# s(speed) requests a smooth term; a bare `speed` would fit a straight line
summary(gam(dist ~ s(speed), data=cars))
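If you want the GAM figures as numbers rather than reading them off the printed summary, they can be extracted from the summary object. This is a sketch that assumes a smooth term s(speed), so that the GAM is comparable to the loess smoother; cars.gam and sm are illustrative names.

```r
library(mgcv)

# Assumption: a smooth term s(speed) is used so that the GAM is
# comparable to the loess smoother.
cars.gam <- gam(dist ~ s(speed), data = cars)
sm <- summary(cars.gam)

sm$r.sq       # adjusted R-squared reported by summary.gam
sm$dev.expl   # proportion of deviance explained
```

Note that r.sq is an adjusted R-squared, so it is not defined in quite the same way as the unadjusted pseudo R2 computed for the loess fit.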

This also seems to agree with what the S loess function would report as Multiple R-squared (I don’t have S, so I can’t check this myself). For example, using the R airquality dataset, which resembles the air data Chambers and Hastie used in the ‘white book’ (the one referenced in the online help for loess, although it is not exactly the same dataset), I got an R2 of 0.8101377 using the above formula. That is in fairly good agreement with what Chambers and Hastie reported.
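The same formula can be applied to airquality along the following lines. This is only a sketch: the model is not stated above, so the choice of Ozone as response and Solar.R, Wind and Temp as predictors is an assumption, and the result will not necessarily reproduce the 0.8101377 quoted. Since airquality contains missing values, incomplete rows are dropped first so that both sums of squares use the same observations.

```r
# Assumed model: the predictors used for the figure quoted above are not stated.
vars <- c("Ozone", "Solar.R", "Wind", "Temp")
aq <- airquality[complete.cases(airquality[, vars]), vars]  # drop rows with NAs

aq.lo <- loess(Ozone ~ Solar.R + Wind + Temp, data = aq)

ss.total <- sum((aq$Ozone - mean(aq$Ozone))^2)   # total sum of squares
ss.resid <- sum(resid(aq.lo)^2)                  # residual sum of squares
aq.r2 <- 1 - ss.resid / ss.total
aq.r2
```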

I should note that I didn’t find any paper dealing specifically with this (admittedly after only a quick Google search), and William Cleveland does not discuss any R2-like measure in his paper.

However, I wonder whether the freedom with which you can choose the degree of smoothing (or window span) precludes any use of an R2-based measure.
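To make this concern concrete, the sketch below refits the cars model at a few spans and tabulates the pseudo R2 next to enp, the equivalent number of parameters stored in the loess object. The span values and the names spans, fits and span.tab are arbitrary choices for illustration.

```r
ss.dist <- sum(scale(cars$dist, scale = FALSE)^2)   # total sum of squares

spans <- c(0.5, 0.75, 1)   # illustrative values; 0.75 is the loess default
fits  <- lapply(spans, function(s) loess(dist ~ speed, data = cars, span = s))

span.tab <- data.frame(
  span      = spans,
  enp       = sapply(fits, function(f) f$enp),   # equivalent number of parameters
  pseudo_r2 = sapply(fits, function(f) 1 - sum(resid(f)^2) / ss.dist)
)
span.tab
```

A smaller span typically lets the curve follow the data more closely, which tends to raise both enp and the pseudo R2, so the measure can largely be tuned by the analyst unless the span is fixed in advance or chosen by some other criterion.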

Attribution
Source: Link, Question Author: Yuriy Petrovskiy, Answer Author: chl
