I am thinking about using recurrent neural networks for time series forecasting.
They basically implement a sort of generalized non-linear auto-regression, compared to ARMA and ARIMA models which use linear auto-regression.
If we are performing non-linear auto-regression, is it still necessary for the time series to be stationary and would we need to perform differencing the way we do in ARIMA models?
Or does the non-linear character of the model give it the ability to handle non-stationary time series?
To put the question another way: Is the stationarity requirement (in the mean and variance) for ARMA and ARIMA models due to the fact that these models are linear, or is it because of something else?
If the purpose of your model is prediction and forecasting, then the short answer is YES, but the stationarity doesn’t need to be on levels.
I’ll explain. If you boil down forecasting to its most basic form, it’s going to be extraction of the invariant. Consider this: you cannot forecast what’s changing. If I tell you tomorrow is going to be different than today in every imaginable aspect, you will not be able to produce any kind of forecast.
It’s only when you’re able to extend something from today to tomorrow that you can produce any kind of prediction. I’ll give you a few examples.
- You know that the distribution of tomorrow’s average temperature is going to be about the same as today’s. In this case, you can take today’s temperature as your prediction for tomorrow, the naive forecast $\hat{x}_{t+1} = x_t$
- You observe a car at mile 10 on a road driving at a speed of $v = 60$ mph. In a minute it’s probably going to be around mile 11 or 9. If you know that it’s driving toward mile 11, then it’s going to be around mile 11, given that its speed and direction are constant. Note that the location is not stationary here, only the speed is. In this regard it’s analogous to a difference model like ARIMA(p,1,q) or a constant trend model like $x_t \sim vt$
- Your neighbor is drunk every Friday. Is he going to be drunk next Friday? Yes, as long as he doesn’t change his behavior
- and so on
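The car example can be sketched in a few lines of R. This is only an illustration under assumed numbers: the seed, the noise level and the unit speed are arbitrary choices, not part of the original argument. The position keeps drifting, so it is not stationary, but its average step is stable and is the thing that gets extended into the future.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
n <- 200
v <- 1                            # assumed constant speed: one mile per time step
speed <- v + rnorm(n, sd = 0.1)   # observed speed fluctuates around v
position <- 10 + cumsum(speed)    # the location drifts, so it is not stationary

# The invariant is the average step (the speed), not the level
v_hat <- mean(diff(position))

# Extend the invariant: last observed position plus h average steps
h <- 1:5
pos_forecast <- position[n] + h * v_hat
print(pos_forecast)
```

Forecasting the level directly from its historical mean would be useless here, while forecasting from the differenced series is straightforward, which is the idea behind the differencing step in ARIMA(p,1,q).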
In every case of a reasonable forecast, we first extract something that is constant from the process, and extend it into the future. Hence, my answer: yes, the time series needs to be stationary if variance and mean are the invariants that you are going to extend into the future from history. Moreover, you want the relationships to the regressors to be stable too.
Simply identify what the invariant in your model is, whether it’s a mean level, a rate of change or something else. These things need to stay the same in the future if you want your model to have any forecasting power.
Holt Winters Example
The Holt Winters filter was mentioned in the comments. It’s a popular choice for smoothing and forecasting certain kinds of seasonal series, and it can deal with nonstationary series. In particular, it can handle series where the mean level grows linearly with time, in other words, where the slope is stable. In my terminology the slope is one of the invariants that this approach extracts from the series. Let’s see how it fails when the slope is unstable.
In this plot I’m showing a deterministic series with exponential growth and additive seasonality. In other words, the slope keeps getting steeper with time:
You can see how the filter seems to fit the data very well. The fitted line is red. However, if you attempt to predict with this filter, it fails miserably. On the next plot the true line is black, and the forecast is red with blue confidence bounds:
The reason why it fails is easy to see by examining the Holt Winters model equations. The filter extracts the slope from the past and extends it into the future. This works very well when the slope is stable, but when the slope is consistently growing the filter can’t keep up: it’s always one step behind, and the effect accumulates into an increasing forecast error.
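For reference, here is the additive form of the equations as given in the documentation of R’s `HoltWinters`, rewritten with $x_t$ for the series. The symbols $a_t$ (level), $b_t$ (slope), $s_t$ (seasonal component), $p$ (period length) and the smoothing parameters $\alpha, \beta, \gamma$ follow that documentation and are not notation used elsewhere in this answer.

```latex
\begin{aligned}
\hat{x}_{t+h} &= a_t + h\, b_t + s_{t - p + 1 + (h-1) \bmod p} \\
a_t &= \alpha\,(x_t - s_{t-p}) + (1-\alpha)\,(a_{t-1} + b_{t-1}) \\
b_t &= \beta\,(a_t - a_{t-1}) + (1-\beta)\, b_{t-1} \\
s_t &= \gamma\,(x_t - a_t) + (1-\gamma)\, s_{t-p}
\end{aligned}
```

The forecast is linear in the horizon $h$: the last estimated slope $b_t$ is held fixed for every future step. With exponential growth the true slope keeps increasing after the last observation, so the gap widens as $h$ grows.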
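    # deterministic series: exponential growth plus a seasonal-like term
    t = 1:150
    a = 0.04
    x = ts(exp(a*t) + sin(t/5)*sin(t/2), deltat = 1/12, start = 0)

    # fit on the first 100 observations
    xt = window(x, 0, 99/12)
    plot(xt)
    (m <- HoltWinters(xt))
    plot(m)
    plot(fitted(m))

    # forecast the remaining 50 observations and overlay the true values
    xp = window(x, 8.33)
    p <- predict(m, 50, prediction.interval = TRUE)
    plot(m, p)
    lines(xp, col = "black")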
In this example you may be able to improve the filter’s performance by simply taking the log of the series. When you take the logarithm of an exponentially growing series, you make its slope stable again, and give this filter a chance. Here’s an example:
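    # same kind of series, with a faster growth rate
    t = 1:150
    a = 0.1
    x = ts(exp(a*t) + sin(t/5)*sin(t/2), deltat = 1/12, start = 0)

    # fit on the log of the first 100 observations
    xt = window(log(x), 0, 99/12)
    plot(xt)
    (m <- HoltWinters(xt))
    plot(m)
    plot(fitted(m))

    # forecast on the log scale, then back-transform the forecast with exp()
    p <- predict(m, 50, prediction.interval = TRUE)
    plot(m, exp(p))
    xp = window(x, 8.33)
    lines(xp, col = "black")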
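Why the log helps can be checked on the trend component alone. The sketch below deliberately leaves out the seasonal term and reuses `a = 0.1` from the example above; it is a simplification for illustration, not a reproduction of the full series.

```r
t <- 1:150
a <- 0.1
trend <- exp(a * t)               # exponential trend only, seasonal term omitted

slope_raw <- diff(trend)          # step-to-step slope on the original scale
slope_log <- diff(log(trend))     # step-to-step slope after taking logs

print(range(slope_raw))           # the raw slope keeps growing
print(range(slope_log))           # the log slope stays at a (up to rounding)
```

On the original scale the slope is never the same from one step to the next, so there is nothing stable to extend. After the log transform the slope is constant, which is exactly the kind of invariant a linear-trend filter can extrapolate.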