MLE convergence errors with statespace SARIMAX

I am trying to fit a year-long half-hourly dataset with Python statsmodels’ SARIMAX. The correlogram indicates AR and MA terms well above 5 each, and daily seasonal terms of around order 2. But going above order 2 in the AR or MA terms, either seasonal or nonseasonal, gives an MLE convergence error.

C:\Users\<user>\AppData\Local\Continuum\miniconda3\lib\site-packages\statsmodels\base\ ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)

I get that it’s a nonlinear model and that it can fail to converge, but I am at a loss as to how to proceed. mle_retvals doesn’t appear to be an attribute of my SARIMAXResults object. It takes more than an hour to fit the entire dataset, which is why I’m forced to iterate on only a month or two of data. My intuition is that this reduces the order of terms the SARIMAX fit can accommodate without convergence issues, though I don’t have an in-depth understanding of the concepts here.

Other possibly important details are that many values are missing throughout the dataset, and I’m relying on the statespace form to impute these missing values.

How can I solve the convergence errors I’m getting with SARIMAX? What aspects should I check that could be causing the issues? Are these issues usually fixable?

Update: I don’t know what happened now, but I am getting an mle_retvals object

{'Hinv': array([[1, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [0, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 1]]),
 'converged': False,
 'fcalls': 44,
 'fopt': 0.22864169058350794,
 'gcalls': 33,
 'gopt': array([ 0.33676031, -0.35390488, -0.01763243, -0.09141768, -0.12037386,
         ...]),
 'warnflag': 2}

This was for order=(2,0,2) and seasonal_order=(1,0,0,48). I also now notice that there was a warning before the ConvergenceWarning:

Warning: Desired error not necessarily achieved due to precision loss.

So the Hessian is singular; doesn’t this sound like a data problem? Perhaps it means the data is of lower order than the model? But that isn’t reflected in the ACF and PACF of the interpolated residual series of the (1,0,2),(1,0,0,48) model, where there are plenty of peaks above the significance level (see PACF below). I used the BFGS algorithm for this fit.

[Figure: PACF showing significant peaks in residuals]

Interpolating the data to remove the missing values makes no difference!

Changing the algorithm: Nelder-Mead required a lot of iterations, but finally converged for (2,0,2),(1,0,0,48); see PACF below.

[Figure: PACF showing significant peaks in residuals]


First, mle_retvals should be an attribute of SARIMAXResults if it is constructed using a fit call, so you should be able to check it. What do you get when you try print(res.mle_retvals)?

Second, do the estimated parameters seem “in the ballpark”, or are they nonsense? Or are they NaN?

Without knowing more: you might try increasing the maximum number of iterations, e.g.

mod = sm.tsa.SARIMAX(endog, order=(p, d, q))
res =

You also might try a different optimization routine (e.g. nm for Nelder-Mead):

res =

You could also try computing better starting parameters. The default starting-parameter routine may not work well for half-hourly data, and may not work well with lots of missing data, e.g.:

params = [...]  # starting values, one per model parameter
res = mod.fit(start_params=params)

Finally, if nothing else works, to see if it is the missing data per se that is the problem, you could try imputing the data (e.g. with Pandas interpolate or something) and see if the model converges then (the estimated parameters wouldn’t apply to the original dataset, but at least you could tell if that was the problem).
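A minimal sketch of that imputation check with pandas, on hypothetical half-hourly data with a few artificial gaps:

```python
import numpy as np
import pandas as pd

# Hypothetical half-hourly series with some missing values
idx = pd.date_range("2017-01-01", periods=200, freq="30min")
y = pd.Series(np.sin(np.arange(200) / 10.0), index=idx)
y.iloc[[5, 6, 50, 120]] = np.nan   # simulate missing half-hours

# Linear interpolation fills the gaps; refit on y_filled to see
# whether the missing data itself is what breaks the optimization
y_filled = y.interpolate()
print(y.isna().sum(), y_filled.isna().sum())
```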


I guess the likelihood function may be quite flat, and/or have many local minima. That could also explain why a derivative-free method (like powell or Nelder-Mead) helps.

I think it is likely that the basic SARIMAX model is too “blunt” for such high-frequency data. For example, there are possibilities of very long seasonal patterns (e.g. weekly) that are computationally burdensome under the basic model, and there can be calendar effects that must be taken into account. I think these features would make the autocorrelations difficult to interpret, and could make it appear that large numbers of lags are required to fit the model well.

Unfortunately, the best way to proceed can probably only be determined by looking at and thinking about the data. You may want to start with simple, low-order SARIMAX models, and take a look at the residuals and associated diagnostics (e.g. res.plot_diagnostics() for a start).

You could also try using a filter (e.g. seasonal_decompose, or the Baxter-King filter sm.tsa.filters.bkfilter) to remove cyclic effects at various frequencies prior to fitting the SARIMAX model. You could also try using the Census Bureau’s X-13ARIMA-SEATS tool.

Source: Link, Question Author: Milind R, Answer Author: cfulton
