I was wrestling with stationarity in my head for a while… Is this how you think about it? Any comments or further thoughts will be appreciated.
A stationary process is one that generates time-series values such that the distribution's mean and variance are kept constant over time. Strictly speaking, this is known as the weak form of stationarity: the time series has a constant mean and variance throughout time.

To put it simply, practitioners say that a stationary time series is one with no trend: it fluctuates around a constant mean and has constant variance. The covariance between different lags is also constant; it doesn't depend on the absolute location in the time series. For example, the covariance between t and t-1 (the first-order lag) should always be the same (the same for the period 1960-1970 as for the period 1965-1975 or any other period).
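As a rough illustration of that idea, here is a minimal NumPy sketch (the AR(1) coefficient 0.5 and the series length are arbitrary choices): it simulates a weakly stationary process and checks that the mean, variance, and lag-1 covariance look the same in two different sub-periods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a stationary AR(1) process: x_t = 0.5 * x_{t-1} + e_t
n = 100_000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()

# Under weak stationarity, mean, variance and lag-1 covariance should be
# (approximately) the same in any sub-period.
first, second = x[: n // 2], x[n // 2 :]
print(first.mean(), second.mean())  # both close to 0
print(first.var(), second.var())    # both close to 1 / (1 - 0.5**2) ≈ 1.33
print(np.cov(first[:-1], first[1:])[0, 1],
      np.cov(second[:-1], second[1:])[0, 1])  # both close to 0.5 * 1.33 ≈ 0.67
```

The two halves play the role of the 1960-1970 and 1965-1975 windows above: for a stationary process, it should not matter which window you pick.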
In non-stationary processes there is no long-run mean to which the series reverts, so we say that non-stationary time series do not mean-revert. In that case, the variance depends on the absolute position in the time series, and the variance goes to infinity as time goes on. Technically speaking, autocorrelations do not decay with time, but in small samples they do disappear, although slowly.
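The growing-variance point can be demonstrated with the textbook non-stationary example, a random walk (the path count and step count below are arbitrary): the variance across realisations at time t is t·σ², so it keeps growing.

```python
import numpy as np

rng = np.random.default_rng(1)

# A random walk x_t = x_{t-1} + e_t is the classic non-stationary process.
# Simulate many independent realisations and compare the cross-sectional
# variance at two different times: Var(x_t) = t * sigma^2 grows with t,
# so there is no constant variance.
n_paths, n_steps = 5_000, 400
steps = rng.normal(size=(n_paths, n_steps))
paths = steps.cumsum(axis=1)

print(paths[:, 99].var())   # ≈ 100 (after 100 steps)
print(paths[:, 399].var())  # ≈ 400 (after 400 steps)
```

This is exactly the "variance depends on the absolute position in the time series" behaviour described above.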
In stationary processes, shocks are temporary and dissipate (lose energy) over time. After a while, they no longer contribute to the new time-series values. For example, something that happened long enough ago, such as World War II, had an impact, but if the time series today is the same as if World War II had never happened, we would say that the shock lost its energy or dissipated. Stationarity is especially important because many classical econometric theories are derived under the assumption of stationarity.
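The shock-dissipation idea has a simple closed form for an AR(1) model (the coefficient 0.8 below is an arbitrary choice): the effect of a one-unit shock at t=0 on the value at t is φ^t, which decays to zero when |φ| < 1, whereas for a random walk (φ = 1) the shock persists forever.

```python
import numpy as np

phi = 0.8  # AR(1) coefficient; |phi| < 1 means the process is stationary

# Impulse response: the effect of a one-unit shock at t=0 on x_t.
# Stationary AR(1): phi**t decays geometrically -> the shock dissipates.
# Random walk (phi = 1): the effect stays at 1 forever -> permanent shock.
horizon = 50
irf_ar1 = phi ** np.arange(horizon)
irf_rw = np.ones(horizon)

print(irf_ar1[0], irf_ar1[10])  # 1.0, then ≈ 0.107 after 10 periods
print(irf_rw[-1])               # still 1.0 after 50 periods
```

In the World War II analogy above, the AR(1) column is the stationary world where the war's effect fades; the random-walk column is the non-stationary world where it never does.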
The strong form of stationarity is when the distribution of a time series is exactly the same through time. In other words, the distribution of the original time series is exactly the same as that of the lagged time series (by any number of lags), or even of sub-segments of the time series. For example, the strong form also requires that the distribution be the same even for sub-segments such as 1950-1960 and 1960-1970, or even overlapping periods such as 1950-1960 and 1950-1980. This form of stationarity is called strong because it doesn't assume any particular distribution; it only says that the probability distribution should be the same. In the case of weak stationarity, we defined the distribution by its mean and variance. We could make this simplification because implicitly we assumed a normal distribution, and a normal distribution is fully defined by its mean and variance (or standard deviation). This is nothing but saying that the probability measure of a sequence (within the time series) is the same as that of the lagged/shifted sequence of values within the same time series.
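A crude empirical way to look at the "whole distribution is the same" requirement (not just mean and variance) is to compare the quantiles of two sub-segments. The sketch below uses iid Gaussian noise, which is strictly stationary by construction, so the quantiles of the two halves should match quantile by quantile; the sample size and quantile grid are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)

# iid noise is strictly stationary: every sub-segment has the same
# distribution, not merely the same mean and variance.
x = rng.standard_normal(50_000)
q = [0.1, 0.25, 0.5, 0.75, 0.9]
first_half = np.quantile(x[:25_000], q)
second_half = np.quantile(x[25_000:], q)
print(first_half)
print(second_half)  # nearly identical, quantile by quantile
```

For a process that is only weakly stationary, the means and variances of the two halves would agree, but higher moments (and hence the quantiles) need not.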
First of all, it is important to note that stationarity is a property of a process, not of a time series. You consider the ensemble of all time series generated by a process. If the statistical properties¹ of this ensemble (mean, variance, …) are constant over time, the process is called stationary. Strictly speaking, it is impossible to say whether a given time series was generated by a stationary process (however, with some assumptions, we can take a good guess).
More intuitively, stationarity means that there are no distinguished points in time for your process (influencing the statistical properties of your observation). Whether this applies to a given process depends crucially on what you consider as fixed or variable for your process, i.e., what is contained in your ensemble.
A typical cause of non-stationarity is time-dependent parameters, which allow one to distinguish points in time by the values of the parameters. Another cause is fixed initial conditions.
Consider the following examples:
The noise reaching my house from a single car passing at a given time is not a stationary process. E.g., the average amplitude² is highest when the car is directly next to my house.
The noise reaching my house from street traffic in general is a stationary process, if we ignore the time dependency of the traffic intensity (e.g., less traffic at night or on weekends). There are no distinguished points in time anymore. While there may be strong fluctuations of individual time series, these vanish when I consider the ensemble of all realisations of the process.
If we include known impacts on traffic intensity, e.g., that there is less traffic at night, the process is non-stationary again: The average amplitude² varies with a daily rhythm. Every point in time is distinguished by the time of the day.
The position of a single peppercorn in a pot of boiling water is a stationary process (ignoring the loss of water due to evaporation). There are no distinguished points in time.
The position of a single peppercorn in a pot of boiling water dropped in the exact middle at t=0 is not a stationary process, as t=0 is a distinguished point in time. The average position of the peppercorn is always in the middle (assuming a symmetric pot without distinguished directions), but at t=ε (with ε small), we can be sure that the peppercorn is somewhere near the middle for every realisation of the process, while at a later time, it can also be closer to the border of the pot.
So, the distribution of positions changes over time. To give a specific example, the standard deviation grows. The distribution quickly converges to the respective distributions of the previous example and if we only take a look at this process for t>T with a sufficiently high T, we can neglect the non-stationarity and approximate it as a stationary process for all purposes – the impact of the initial condition has faded away.
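The effect of the fixed initial condition can be seen in a minimal simulation (ignoring the pot wall, so this only shows the growing spread, not the eventual convergence): drop many peppercorns at the centre at t=0 and watch the distribution of positions widen over time.

```python
import numpy as np

rng = np.random.default_rng(3)

# Many peppercorns, all dropped at the centre (position 0) at t=0, each
# performing an unbounded random walk (the pot wall is ignored here).
# The spread of positions across realisations grows with t, so the
# distribution of positions depends on t: the process is non-stationary.
n_corns, n_steps = 10_000, 900
pos = rng.normal(scale=0.1, size=(n_corns, n_steps)).cumsum(axis=1)

print(pos[:, 0].std())   # small: every peppercorn is still near the centre
print(pos[:, -1].std())  # larger: the spread has grown (≈ 0.1 * sqrt(900) = 3)
```

With a wall (a bounded pot), this spread would stop growing and the position distribution would settle down, which is the convergence to the stationary case described above.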
¹ For practical purposes, this is sometimes reduced to the mean and the variance (weak stationarity), but I do not consider this helpful for understanding the concept. Just ignore weak stationarity until you have understood stationarity.
² Which is the mean of the volume, but the standard deviation of the actual sound signal (do not worry too much about this here).