I am trying to figure out how R computes lag-k autocorrelation (apparently, it is the same formula used by Minitab and SAS), so that I can compare it to using Excel’s CORREL function applied to the series and its k-lagged version. R and Excel (using CORREL) give slightly different autocorrelation values.

I’d also be interested to find out whether one computation is more correct than the other.

**Answer**

The exact equation is given in: Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer-Verlag. I’ll give you an example:

```
### simulate some data with AR(1) where rho = .75
xi <- 1:50
yi <- arima.sim(model=list(ar=.75), n=50)
### get residuals
res <- resid(lm(yi ~ xi))
### acf for lags 1 and 2
cor(res[1:49], res[2:50]) ### not quite how this is calculated by R
cor(res[1:48], res[3:50]) ### not quite how this is calculated by R
### how R calculates these
acf(res, lag.max=2, plot=F)
### how this is calculated by R
### note: mean(res) = 0 for this example, so technically not needed here
c0 <- 1/50 * sum( (res[1:50] - mean(res)) * (res[1:50] - mean(res)) )
c1 <- 1/50 * sum( (res[1:49] - mean(res)) * (res[2:50] - mean(res)) )
c2 <- 1/50 * sum( (res[1:48] - mean(res)) * (res[3:50] - mean(res)) )
c1/c0
c2/c0
```

And so on (e.g., `res[1:47]`

and `res[4:50]`

for lag 3).

**Attribution***Source : Link , Question Author : Galit Shmueli , Answer Author : Wolfgang*