Understanding the k lag in R’s augmented Dickey-Fuller test

I have been experimenting with unit root testing in R and I am not entirely sure what to make of the k lag parameter. I used the augmented Dickey-Fuller test and the Phillips-Perron test from the tseries package. The default k parameter (for adf.test) depends only on the length of the series. If I choose different k values I get quite different results with respect to rejecting the null:

Dickey-Fuller = -3.9828, Lag order = 4, p-value = 0.01272
alternative hypothesis: stationary 
# default: k = trunc((length(x)-1)^(1/3)) = 4 for n = 103


Dickey-Fuller = -2.7776, Lag order = 0, p-value = 0.2543
alternative hypothesis: stationary
# k=0

Dickey-Fuller = -2.5365, Lag order = 6, p-value = 0.3542
alternative hypothesis: stationary
# k=6

plus the PP test result:

Dickey-Fuller Z(alpha) = -18.1799, Truncation lag parameter = 4, p-value = 0.08954
alternative hypothesis: stationary 

Looking at the data, I would guess the underlying series is non-stationary, but I still do not consider these results strong evidence, in particular since I do not understand the role of the k parameter. If I look at decompose/stl, the trend dominates, with only small contributions from the remainder and the seasonal component. My series has quarterly frequency.
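For reference, the results above come from calls along these lines (sketch with a made-up random walk standing in for my actual series, which I cannot share):

```r
# Sketch: how adf.test's default k is computed and how different
# choices of k change the test result. The series here is a toy
# random walk, not the real data.
library(tseries)

set.seed(1)
x <- cumsum(rnorm(103))              # non-stationary stand-in series, n = 103

trunc((length(x) - 1)^(1/3))         # adf.test's default lag order: 4 here

adf.test(x)                          # default k
adf.test(x, k = 0)                   # plain (non-augmented) Dickey-Fuller
adf.test(x, k = 6)                   # over-specified lag order
pp.test(x)                           # Phillips-Perron for comparison
```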

Any hints?

Answer

It’s been a while since I looked at ADF tests, but I do remember that there are at least two R implementations of the test.

http://www.stat.ucl.ac.be/ISdidactique/Rhelp/library/tseries/html/adf.test.html

http://cran.r-project.org/web/packages/fUnitRoots/

The fUnitRoots package has a function called adfTest(). I think the deterministic “trend” term is handled differently in those two packages.
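Concretely, adfTest() exposes the deterministic part of the test regression through its type argument, while tseries::adf.test always fits a constant and a linear trend. A sketch, assuming fUnitRoots is installed and using a toy random walk in place of the real series:

```r
# Sketch, assuming the fUnitRoots package is installed. The type
# argument selects the deterministic terms in the test regression.
library(fUnitRoots)

set.seed(1)
x <- cumsum(rnorm(103))             # stand-in series

adfTest(x, lags = 4, type = "nc")   # no constant, no trend
adfTest(x, lags = 4, type = "c")    # constant only
adfTest(x, lags = 4, type = "ct")   # constant and linear trend
```

The three variants can give noticeably different p-values on trending data, which may explain part of the confusion between packages.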

Edit: From page 14 of the following link, there were 4 versions (the uroot version is discontinued) of the ADF test:

http://math.uncc.edu/~zcai/FinTS.pdf

One more link. Read section 6.3 of the following page; it does a far better job than I could of explaining the lag term:

http://www.yats.com/doc/cointegration-en.html

Also, I would be careful with any seasonal model. Unless you’re sure some seasonality is present, I would avoid seasonal terms. Why? Any series can be decomposed into seasonal terms, even when no real seasonality exists. Here are two examples:

#First example: White noise
x <- rnorm(200)

#Use stl() to separate the trend and seasonal term
x.ts <- ts(x, freq=4) 
x.stl <- stl(x.ts, s.window = "periodic")
plot(x.stl)

#Use decompose() to separate the trend and seasonal term
x.dec <- decompose(x.ts)
plot(x.dec)

#===========================================

#Second example: random walk (cumulative sum of the white noise)
x1 <- cumsum(x)

#Use stl() to separate the trend and seasonal term
x1.ts <- ts(x1, freq=4)
x1.stl <- stl(x1.ts, s.window = "periodic")
plot(x1.stl)

#Use decompose() to separate the trend and seasonal term
x1.dec <- decompose(x1.ts)
plot(x1.dec)

The graph below is from the above plot(x.stl) statement. stl() found a small seasonal term in pure white noise. You might say that term is so small that it’s really not an issue. The problem is that, in real data, you don’t know whether that term is a problem or not. In the example below, notice that the trend series has segments where it looks like a filtered version of the raw data, and other segments where it might be considered significantly different from the raw data.
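One way to see just how small that spurious seasonal term is: compare its variance to the variance of the raw noise (a sketch reusing the first example above):

```r
# Sketch: measure the size of the "seasonal" component stl() extracts
# from pure white noise, relative to the noise itself.
set.seed(1)
x <- rnorm(200)
x.stl <- stl(ts(x, freq = 4), s.window = "periodic")
seasonal <- x.stl$time.series[, "seasonal"]
var(seasonal) / var(x)   # small ratio: the seasonal term is an artifact
```

A ratio near zero says the extracted seasonality explains almost none of the variation, but stl() reports it anyway; with real data you have no such baseline to compare against.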

[Figure: output of plot(x.stl), the stl decomposition of the white-noise series, showing a small but nonzero seasonal component]

Attribution
Source: Link, Question Author: hans0l0, Answer Author: bill_080
