# Intuitive explanation/motivation of stationary distribution of a process

Often, in the literature, authors have been interested in finding the stationary distribution of a time-series process. For example, consider the following simple AR($1$) process $\{X_t\}$:

$$X_t = \alpha X_{t-1} + e_t,$$

where $e_t \overset{\text{iid}}{\sim} f$.

What could possibly be the motivation(s) for finding the stationary distribution of any stochastic process?

What other (theoretical and practical) analyses would one possibly do using the resulting stationary distribution?

What is (are) the problem(s) if the stationary distribution does not exist? Will the process become useless?

What if the stationary distribution exists but does not have a closed form? What are the disadvantages of not having a closed-form expression for it?

There are various motivations for interest in stationary distributions in this context, but probably the most important aspect is that they are closely related to limiting distributions. For most time-series processes, there is a close connection between the stationary distribution and limiting distribution of the process. Under very broad conditions, time-series processes based on IID error terms have a stationary distribution, and they converge to this stationary distribution as a limiting distribution for any starting distribution you specify. That means that if you let the process run for a long time, its distribution will be close to the stationary distribution regardless of how it started off. Thus, if you have reason to believe that the process has been running for a long time, you can reasonably assume it follows its stationary distribution.

In your question you use the example of an AR($1$) time-series process with IID error terms with an arbitrary marginal distribution. If $|\alpha|<1$ then this model is a recurrent time-homogeneous Markov chain and its stationary distribution can be found by inverting it to an MA($\infty$) process:

$$X_t = \sum_{k=0}^\infty \alpha^k e_{t-k}.$$
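As a quick numerical check (my own sketch, not part of the original answer), iterating the AR($1$) recursion from $X_0 = 0$ reproduces exactly the truncated MA form $X_t = \sum_{k=0}^{t-1} \alpha^k e_{t-k}$, since the weights $\alpha^k$ decay geometrically:

```python
import numpy as np

# Check (illustrative sketch): the AR(1) recursion and the truncated
# MA(infinity) sum give the same value for X_{n-1} when both start
# from X_0 = 0, since the recursion unrolls into the weighted sum.
rng = np.random.default_rng(0)
alpha = 0.8
n = 500
e = rng.normal(size=n)

# Build X_t by the recursion, starting from X_0 = 0 (so e[0] is unused).
x = np.zeros(n)
for t in range(1, n):
    x[t] = alpha * x[t - 1] + e[t]

# Reconstruct X_{n-1} directly from the MA form: weights alpha^k
# against the lagged errors e_{t-k}, k = 0, ..., n-2.
weights = alpha ** np.arange(n - 1)
x_ma = np.sum(weights * e[n - 1:0:-1])

print(abs(x[-1] - x_ma))  # effectively zero: the two representations agree
```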

We can see that the process is a weighted sum of an infinite chain of IID error terms, where the weightings are exponentially decaying. The limiting distribution can be obtained from the error distribution $f$ by an appropriate convolution for this weighted sum. In general, this will depend on the form of $f$ and it may be a complicated distribution. However, it is worth noting that if the error distribution is not heavy-tailed, and if $\alpha \approx 1$ so that the decay is slow, then the limiting distribution will be close to a normal distribution, owing to approximation by the central limit theorem.
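The CLT point can be illustrated by simulation (my own sketch, using centered exponential errors as an assumed light-tailed but skewed choice of $f$): with $\alpha$ close to one, the stationary draws are a weighted sum of many IID terms, so their skewness is far smaller than that of the errors themselves.

```python
import numpy as np

# Sketch (illustrative assumption: centered exponential errors, which
# are light-tailed but skewed). With alpha near 1, the stationary
# distribution is a slowly-decaying weighted sum of many IID terms,
# so it is much closer to normal than the error distribution.
rng = np.random.default_rng(1)
alpha = 0.95
n_paths, n_steps = 10_000, 1_000

# Run many independent chains long enough to reach (near) stationarity.
x = np.zeros(n_paths)
for _ in range(n_steps):
    x = alpha * x + rng.exponential(size=n_paths) - 1.0  # mean-zero errors

# Sample skewness of the stationary draws; the errors have skewness 2.
skew = np.mean((x - x.mean()) ** 3) / x.std() ** 3
print(skew)  # markedly smaller than the error skewness of 2
```

With heavier decay (larger $\alpha$) the skewness shrinks further, consistent with the CLT approximation described above.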

Practical applications: In most applications of the AR($1$) time-series process we assume a normal error distribution $e_t \sim \text{IID N}(0, \sigma^2)$, which means that the stationary distribution of the process is:

$$X_t \sim \text{N} \bigg( 0, \frac{\sigma^2}{1-\alpha^2} \bigg).$$

Regardless of the starting distribution for the process, this stationary distribution is the limiting distribution of the process. If we have reason to believe that the process has been running for a reasonable amount of time then we know that the process will have converged close to this limiting distribution, so it makes sense to assume that the process follows this distribution. Of course, as with any application of statistical modelling, we look at diagnostic plots/tests to see if the data falsify our assumed model form. Nevertheless, this form fits a broad class of cases where the AR($1$) model is used.
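This convergence is easy to see in simulation (a sketch of my own, with arbitrarily chosen $\alpha$ and $\sigma$): starting the chain far from stationarity and letting it run, the empirical variance settles at the theoretical stationary variance $\sigma^2/(1-\alpha^2)$.

```python
import numpy as np

# Sketch: run a normal-error AR(1) chain for a long time from a
# deliberately extreme starting value, then compare the empirical
# variance of the later draws with sigma^2 / (1 - alpha^2).
rng = np.random.default_rng(2)
alpha, sigma = 0.7, 2.0
n_burn, n_keep = 1_000, 200_000

x = 50.0  # far from the stationary mean of zero
samples = np.empty(n_keep)
for t in range(n_burn + n_keep):
    x = alpha * x + rng.normal(scale=sigma)
    if t >= n_burn:
        samples[t - n_burn] = x

theoretical_var = sigma**2 / (1 - alpha**2)
print(samples.var(), theoretical_var)  # the two should be close
```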

What if a stationary distribution does not exist: There are certain time-series processes where the stationary distribution does not exist. This is most common when there is some fixed periodic aspect to the series, or some absorbing state (or other non-communicating classes of states). In this case there may not be a limiting distribution, or the limiting distribution might be a marginal distribution that is aggregated across multiple non-communicating classes, which is not all that useful. This is not inherently a problem - it just means you need a different kind of model that correctly represents the non-stationary nature of the process. This is more complicated, but statistical theory has ways and means of dealing with this.
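A simple illustration (my own, not from the answer above) of nonexistence in the AR($1$) setting itself: when $\alpha = 1$ the recursion becomes a random walk, for which $\text{Var}(X_t) = t\sigma^2$ grows without bound, so no stationary distribution exists.

```python
import numpy as np

# Sketch: with alpha = 1 the AR(1) recursion is a random walk.
# The cross-sectional variance across many independent paths grows
# roughly linearly in t, so the marginals never settle to a limit.
rng = np.random.default_rng(3)
n_paths, n_steps = 5_000, 400
paths = rng.normal(size=(n_paths, n_steps)).cumsum(axis=1)

# Variance at t = 100, 200, 400: approximately 100, 200, 400.
print(paths[:, 99].var(), paths[:, 199].var(), paths[:, 399].var())
```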