I’m experiencing an issue in which it seems forecast::auto.arima() isn’t returning a model with a differencing parameter when it should. Read through my reproducible example to arrive at the question.

I have the following data:

`library(magrittr); library(dplyr); mydata <- c(305, 348, 337, 350, 368, 345, 370, 291, 337, 323, 328, 307, 299, 323, 292, 273, 282, 333, 325, 322, 298, 306, 339, 320, 348, 349, 381, 331, 373, 349, 307, 321, 347, 304, 314, 273, 309, 300, 266, 280, 318, 346, 399, 360, 394, 447, 420, 417, 341, 320, 292, 264, 264, 276, 292, 284, 219, 252)`

Which I then convert to a univariate time series object (and clean):

`global_ts <- ts(data=mydata, start=1960, end=2017, frequency=1) %>% forecast::tsclean(.)`

The data look like this:

`ggplot2::autoplot(global_ts) + ggplot2::theme_bw() + ggplot2::geom_line(size=0.6) + ggplot2::geom_point(shape = 21, colour = "black", fill = "dodgerblue", size = 3, stroke = 1) + ggplot2::labs(x="\nTime [years]") + ggplot2::theme(axis.text.x = ggplot2::element_text(size=12)) + ggplot2::theme(axis.text.y = ggplot2::element_text(size=12)) + ggplot2::theme(axis.title.x = ggplot2::element_text(size=18)) + ggplot2::theme(axis.title.y = ggplot2::element_text(size=18))`

These data are not stationary:

`tseries::adf.test(global_ts)`

The data show autocorrelation:

`acf(global_ts, lag.max = 20)`

The data show partial autocorrelation:

`pacf(global_ts, lag.max = 20)`

To stationarize the data, I decided to calculate the first difference:

`global_ts_difference_lag_1 = diff(global_ts, differences = 1)`

The first differences look like this:

`ggplot2::autoplot(global_ts_difference_lag_1) + ggplot2::theme_bw() + ggplot2::geom_line(size=0.6) + ggplot2::geom_point(shape = 21, colour = "black", fill = "dodgerblue", size = 3, stroke = 1) + ggplot2::labs(x="\nTime [years]") + ggplot2::theme(axis.text.x = ggplot2::element_text(size=12)) + ggplot2::theme(axis.text.y = ggplot2::element_text(size=12)) + ggplot2::theme(axis.title.x = ggplot2::element_text(size=18)) + ggplot2::theme(axis.title.y = ggplot2::element_text(size=18))`

The first order differenced data are stationary:

`tseries::adf.test(global_ts_difference_lag_1)`

The first order differenced data show no autocorrelation:

`acf(global_ts_difference_lag_1, lag.max = 20)`

The first order differenced data show no partial autocorrelation (note: with 20 lags plotted, roughly one spike, i.e. 5% of 20, is expected to cross the 95% confidence bounds by chance alone, so a single excursion is acceptable):

`pacf(global_ts_difference_lag_1, lag.max = 20)`
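The "one spike in 20" reasoning can also be checked numerically rather than by eye. The following is a minimal sketch in base R: it assumes the raw series (the `tsclean()` step is skipped so that no extra packages are needed), and the names `pacf_vals`, `bound` and `exceed` are only illustrative.

```r
# Raw series from the question. Assumption: tsclean() is skipped here so the
# sketch runs with base R only; results may differ slightly from the cleaned series.
mydata <- c(305, 348, 337, 350, 368, 345, 370, 291, 337, 323, 328, 307, 299,
            323, 292, 273, 282, 333, 325, 322, 298, 306, 339, 320, 348, 349,
            381, 331, 373, 349, 307, 321, 347, 304, 314, 273, 309, 300, 266,
            280, 318, 346, 399, 360, 394, 447, 420, 417, 341, 320, 292, 264,
            264, 276, 292, 284, 219, 252)
global_ts <- ts(mydata, start = 1960, frequency = 1)
d1 <- diff(global_ts, differences = 1)

# Sample PACF at lags 1..20, computed without drawing the plot.
pacf_vals <- as.numeric(pacf(d1, lag.max = 20, plot = FALSE)$acf)

# Approximate 95% bounds, as drawn on the ACF/PACF plots: +/- qnorm(0.975) / sqrt(n).
bound <- qnorm(0.975) / sqrt(length(d1))

# Lags whose spike falls outside the bounds, versus the number expected by chance.
exceed <- which(abs(pacf_vals) > bound)
cat("95% bound:", round(bound, 3), "\n")
cat("lags outside the bounds:", exceed, "\n")
cat("expected by chance out of 20 lags:", 20 * 0.05, "\n")
```

Replacing `pacf` with `acf` (and dropping lag 0 from the result) gives the same check for the autocorrelation plot.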

I performed an ARIMA using forecast::auto.arima():

`forecast::auto.arima(global_ts, ic="aic", trace=TRUE, stepwise = FALSE)`

The forecast::auto.arima() function returned a non-differenced ARIMA, even though the data are clearly non-stationary without differencing…

If I forecast using ARIMA(1,0,0), I get the following:

`global_arima <- arima(global_ts, order=c(1,0,0), include.mean = TRUE); global_arima`

`plot(forecast::forecast(global_arima, h=11, level=95))`

Now, if I specify first order differencing as an argument in the forecast::auto.arima() function, it returns a different model:

`forecast::auto.arima(global_ts, ic="aic", d=1, trace=TRUE, stepwise = FALSE)`

If I forecast using ARIMA(1,1,0), I get the following:

`global_arima <- arima(global_ts, order=c(1,1,0), include.mean = TRUE); global_arima`

`plot(forecast::forecast(global_arima, h=11, level=95))`

MY QUESTION IS THE FOLLOWING:

Why isn’t forecast::auto.arima() correctly performing a check for differencing?

The documentation for forecast::auto.arima() says the ‘d’ argument is the “order of first-differencing. If missing, will choose a value based on KPSS test.”

Is forecast::auto.arima() actually choosing a value for differencing (d) based on the KPSS test? It seems to not actually be doing this…

To cover my bases, I performed a manual KPSS test, which resulted in clear non-stationarity for the original time series:

`tseries::kpss.test(global_ts)`

What gives? Am I missing something? Which forecast should I trust?

Oh, and I should also mention that I get strange results when using forecast::ndiffs(), which is supposed to tell the user the number of differences required to achieve stationarity. The test performed seems to dictate the outcome…

`forecast::ndiffs(global_ts, test="kpss")`

`forecast::ndiffs(global_ts, test="adf")`

`forecast::ndiffs(global_ts, test="pp")`

Why would these tests give such wildly different results? Further, why would tseries::kpss.test() give different results than forecast::ndiffs()??

**Answer**

auto.arima is a brute-force, list-based procedure: it tries a fixed set of models and selects the one with the best AIC, calculated from the estimated parameters. The AIC should be calculated from the residuals of models that account for any needed intervention effects. Otherwise the intervention effects are taken to be Gaussian noise, which underestimates the actual model’s autoregressive effect and so miscalculates the model parameters, leading directly to an incorrect error sum of squares and ultimately an incorrect AIC. More importantly, the only remedy it has for non-stationarity (besides power transforms) is to suggest differencing.
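To make the list-based idea concrete, here is a minimal sketch in base R. It is not auto.arima's actual algorithm (the real function also chooses d with a unit-root test, and by default uses AICc and a stepwise search); it only loops over a small, arbitrary grid of (p, q) with d held fixed and ranks the fits by AIC. The grid limits, the raw (uncleaned) series and the names `grid` and `best` are assumptions of the sketch.

```r
# Raw series from the question (assumption: tsclean() is skipped, base R only).
mydata <- c(305, 348, 337, 350, 368, 345, 370, 291, 337, 323, 328, 307, 299,
            323, 292, 273, 282, 333, 325, 322, 298, 306, 339, 320, 348, 349,
            381, 331, 373, 349, 307, 321, 347, 304, 314, 273, 309, 300, 266,
            280, 318, 346, 399, 360, 394, 447, 420, 417, 341, 320, 292, 264,
            264, 276, 292, 284, 219, 252)
global_ts <- ts(mydata, start = 1960, frequency = 1)

# d is held fixed: AIC values are not comparable across different orders of
# differencing, so only p and q vary over the (arbitrary) grid.
d <- 0
grid <- expand.grid(p = 0:2, q = 0:2)
grid$aic <- NA_real_

for (i in seq_len(nrow(grid))) {
  # Fit each candidate; a model that fails to estimate keeps aic = NA.
  fit <- tryCatch(
    arima(global_ts, order = c(grid$p[i], d, grid$q[i]), method = "ML"),
    error = function(e) NULL
  )
  if (!is.null(fit)) grid$aic[i] <- AIC(fit)
}

# Rank the candidates by AIC (NA values sort last) and keep the best one.
grid <- grid[order(grid$aic), ]
print(grid)
best <- grid[1, ]
```

Note that nothing in this loop can detect an intervention: every candidate treats a level shift as ordinary noise, which is the weakness described above.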

I took the 58 values and used AUTOBOX (my tool of choice, since I helped to develop it!) to automatically identify a possible ARIMA model with any needed intervention effects included. The ARIMA portion of the identified model is remarkably similar to the one from auto.arima, with the exception that the forecast asymptotes to a much lower level. This is due to the detection and incorporation of a level shift (N.B. a level shift reflects that the non-stationarity comes from a change in the mean, so the series needs de-meaning, NOT differencing).

The equation is here, with a level shift at period 48.

The residuals from the model are here, with the acf here. The plot of the forecasts is here.
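For readers without AUTOBOX, the effect of a level shift can be approximated by hand in base R by adding a step dummy as a regressor to an AR(1). This is only a sketch under stated assumptions: the shift period (48) is taken from the statement above rather than detected, the raw (uncleaned) series is used, and the names `shift_at`, `level_shift` and `fit_ls` are illustrative. It does not reproduce the AUTOBOX model or its estimates.

```r
# Raw series from the question (assumption: tsclean() is skipped, base R only).
mydata <- c(305, 348, 337, 350, 368, 345, 370, 291, 337, 323, 328, 307, 299,
            323, 292, 273, 282, 333, 325, 322, 298, 306, 339, 320, 348, 349,
            381, 331, 373, 349, 307, 321, 347, 304, 314, 273, 309, 300, 266,
            280, 318, 346, 399, 360, 394, 447, 420, 417, 341, 320, 292, 264,
            264, 276, 292, 284, 219, 252)
global_ts <- ts(mydata, start = 1960, frequency = 1)
n <- length(global_ts)

# Step dummy: 0 before the shift, 1 from period 48 onwards.
# Assumption: the period is supplied by hand, not detected automatically.
shift_at <- 48
level_shift <- as.numeric(seq_len(n) >= shift_at)

# AR(1) with a mean plus the level-shift regressor; no differencing.
fit_ls <- arima(global_ts, order = c(1, 0, 0),
                xreg = cbind(level_shift = level_shift), method = "ML")
print(fit_ls)

# Forecast 11 years ahead; the shift is assumed to persist (dummy stays at 1).
h <- 11
fc <- predict(fit_ls, n.ahead = h,
              newxreg = cbind(level_shift = rep(1, h)))
print(fc$pred)
```

Because the shift is modelled as a change in the mean, the forecasts revert to the post-shift level instead of the overall mean, which is the point made about de-meaning versus differencing.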

Finally, the reflection by @jbowman is well intended, as it highlights that the AR(1) is deficient due to the untreated anomalous structure in the residuals, i.e. the level shift at period 48. Note the permanent drop-off (level shift) at period 48. The need for a level shift in the model can sometimes be detected manually/visually in this manner, by identifying structure in a tentative set of residuals.
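A rough way to look for that kind of structure yourself is to fit the tentative AR(1) and compare its residuals on either side of the suspected break. The sketch below uses base R only; the break period is the one stated above, the raw (uncleaned) series is an assumption, and the names `res`, `before` and `after` are illustrative.

```r
# Raw series from the question (assumption: tsclean() is skipped, base R only).
mydata <- c(305, 348, 337, 350, 368, 345, 370, 291, 337, 323, 328, 307, 299,
            323, 292, 273, 282, 333, 325, 322, 298, 306, 339, 320, 348, 349,
            381, 331, 373, 349, 307, 321, 347, 304, 314, 273, 309, 300, 266,
            280, 318, 346, 399, 360, 394, 447, 420, 417, 341, 320, 292, 264,
            264, 276, 292, 284, 219, 252)
global_ts <- ts(mydata, start = 1960, frequency = 1)

# Tentative model: plain AR(1) with a mean and no intervention terms.
fit_ar1 <- arima(global_ts, order = c(1, 0, 0), method = "ML")
res <- residuals(fit_ar1)

# Split the residuals at the suspected level shift (period 48, as stated above).
shift_at <- 48
before <- res[seq_len(shift_at - 1)]
after <- res[shift_at:length(res)]

# A persistent difference in the residual mean hints at an unmodelled shift.
cat("mean residual before period", shift_at, ":", round(mean(before), 2), "\n")
cat("mean residual from period", shift_at, "on:", round(mean(after), 2), "\n")
```

Plotting the residuals, for example with `plot(res, type = "h")`, gives the visual version of the same check.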

In closing, the ADF tests you cited have assumptions. One of them is that no pulses, level shifts, local time trends or seasonal pulses are needed. That is possibly why you are experiencing a conundrum. Test assumptions are very important.

In answer to your question/comment: the series is non-stationary, BUT differencing is not needed.

**Attribution**

*Source: Link, Question Author: awags1, Answer Author: IrishStat*