Let $(X_n)$ be a sequence of i.i.d. $N(0,1)$ random variables. Define $S_0=0$ and $S_n=\sum_{k=1}^n X_k$ for $n\ge 1$. Find the limiting distribution of $$\frac{1}{n}\sum_{k=1}^n |S_{k-1}|\left(X_k^2-1\right).$$

This problem is from a problem book on Probability Theory, in the chapter on the Central Limit Theorem.

Since $S_{k-1}$ and $X_k$ are independent, $E\left(|S_{k-1}|(X_k^2-1)\right)=0$ and $$V\left(|S_{k-1}|(X_k^2-1)\right)=E\left(S_{k-1}^2(X_k^2-1)^2\right)=E\left(S_{k-1}^2\right)E\left((X_k^2-1)^2\right)=2(k-1).$$
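As a quick sanity check of these moments (my own Monte Carlo sketch, not part of the original question), one can verify the mean and variance of a single summand, here for $k=10$, where the variance should be $2(k-1)=18$:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10**6  # number of Monte Carlo draws
k = 10

# S_{k-1} is a sum of k-1 = 9 standard normals; X_k is independent of it
S = rng.standard_normal((N, k - 1)).sum(axis=1)
X = rng.standard_normal(N)
Y = np.abs(S) * (X**2 - 1)

print(Y.mean())  # close to 0
print(Y.var())   # close to 2*(k-1) = 18
```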

Note that the $|S_{k-1}|(X_k^2-1)$ are clearly not independent. The problem is from Shiryaev's *Problems in Probability*, which is itself based on the textbook by the same author. The textbook does not seem to cover the CLT for correlated variables. I don't know if there's a stationary, mixing sequence hiding somewhere… I have run simulations to get a feel for the answer:

```python
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

n = 20000  # summation index
m = 2000   # number of samples

X = np.random.normal(size=(m, n))
sums = np.cumsum(X, axis=1)
sums = np.delete(sums, -1, 1)
prods = np.delete(X**2 - 1, 0, 1) * np.abs(sums)
samples = 1 / n * np.sum(prods, axis=1)

plt.hist(samples, bins=100, density=True)
x = np.linspace(-6, 6, 100)
plt.plot(x, stats.norm.pdf(x, 0, 1 / np.sqrt(2 * np.pi)))
plt.show()
```

Below is a histogram of 2000 samples ($n = 20000$). It looks fairly normally distributed…
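One observation worth adding (my own remark, not from the book): although the summands are dependent, they are uncorrelated, since they are martingale differences — for $j<k$, conditioning on $\mathcal{F}_{k-1}$ gives $E(X_k^2-1\mid\mathcal{F}_{k-1})=0$. The variance of the normalized sum is therefore exactly

```latex
V\!\left(\frac{1}{n}\sum_{k=1}^n |S_{k-1}|\left(X_k^2-1\right)\right)
  = \frac{1}{n^2}\sum_{k=1}^n 2(k-1)
  = \frac{n-1}{n} \;\longrightarrow\; 1,
```

so any limiting distribution should have unit variance.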

**Answer**

When I simulate the distribution, I get something that resembles a Laplace distribution. An even better fit seems to be a q-Gaussian (the exact parameters you would have to find using theory).

I guess that your book must contain some variation of the CLT that relates to that (a q-generalised central limit theorem; probably it is in Section 7.6, *The central limit theorem for sums of dependent variables*, but I can't look it up as I do not have the book available).
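For reference, since not everyone has the qGaussian package at hand, the density being compared against can be written down directly. Below is a minimal Python sketch of the q-Gaussian density for $1 < q < 3$ in the standard Tsallis parametrization (treat it as illustrative — the parametrization of `dqgauss` in the R package may differ):

```python
import numpy as np
from scipy.special import gamma
from scipy.integrate import quad

def dqgauss(x, q, beta):
    """q-Gaussian density for 1 < q < 3 (Tsallis parametrization)."""
    # normalization constant C_q, valid for 1 < q < 3
    Cq = (np.sqrt(np.pi) * gamma((3 - q) / (2 * (q - 1)))
          / (np.sqrt(q - 1) * gamma(1 / (q - 1))))
    # q-exponential e_q(-beta x^2) = [1 + (q-1) beta x^2]^(-1/(q-1))
    return np.sqrt(beta) / Cq * (1 + (q - 1) * beta * x**2) ** (1 / (1 - q))

# sanity check: the density integrates to 1;
# for q = 2, beta = 1 it reduces to a Cauchy density
total, _ = quad(lambda x: dqgauss(x, 2.0, 1.0), -np.inf, np.inf)
print(total)
```

For $q \to 1$ the density tends to a Gaussian, and larger $q$ gives heavier (power-law) tails, which is what makes it a candidate shape here.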

```r
library(qGaussian)

set.seed(1)
Qstore <- numeric(0)  # vector to store results
n <- 10^6             # columns X_i
m <- 10^2             # rows (repetitions per batch)

pb <- txtProgressBar(title = "progress bar", min = 0, max = 100, style = 3)
for (i in 1:100) {
  # doing this in batches because the matrix method takes a lot of memory;
  # with smaller numbers n*m it can be done at once
  X <- matrix(rnorm(n * m, 0, 1), m)
  S <- t(sapply(1:m, FUN = function(x) cumsum(X[x, ])))
  S <- cbind(rep(0, m), S[, -n])
  R <- abs(S) * (X^2 - 1)
  Q <- t(sapply(1:m, FUN = function(x) cumsum(R[x, ])))
  Qstore <- c(Qstore, t(Q[, n]))
  setTxtProgressBar(pb, i)
}
close(pb)

# compute histogram
x <- seq(floor(min(Qstore / n)), ceiling(max(Qstore / n)), 0.2)
h <- hist(Qstore / n, breaks = x)

# plot simulation
plot(h$mid, h$density, log = "y", xlim = c(-7, 7),
     ylab = "log density",
     xlab = expression(over(1, n) * sum(abs(S[k - 1]) * (X[k]^2 - 1), k == 1, n)))

# distributions for comparison
lines(x, dnorm(x, 0, 1), col = 1, lty = 3)                                  # normal
lines(x, dexp(abs(x), sqrt(2)) / 2, col = 1, lty = 2)                       # Laplace
lines(x, qGaussian::dqgauss(x, sqrt(2), 0, 1 / sqrt(2)), col = 1, lty = 1)  # q-Gaussian

# further plotting
title("10^4 repetitions with n = 10^6")
legend(-7, 0.6, c("Gaussian", "Laplace", "Q-Gaussian"),
       col = 1, lty = c(3, 2, 1), cex = 0.8)
```
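To back up the visual impression with a number (my own check, not from the original answer): the excess kurtosis of the simulated samples comes out clearly positive, which a Gaussian limit would not allow — a Laplace distribution, for comparison, has excess kurtosis 3. A smaller-scale version of the simulation in Python:

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(1)
n = 500    # summation index (smaller than above to keep this quick)
m = 20000  # number of samples

X = rng.standard_normal((m, n))
S = np.cumsum(X, axis=1)[:, :-1]  # S_1, ..., S_{n-1}; the k=1 term vanishes since S_0 = 0
samples = (np.abs(S) * (X[:, 1:]**2 - 1)).sum(axis=1) / n

print(samples.var())      # close to (n-1)/n, i.e. about 1
print(kurtosis(samples))  # excess kurtosis; clearly above 0
```

The sample variance near 1 matches the exact variance $(n-1)/n$ of the normalized sum, while the positive excess kurtosis rules out a plain Gaussian limit even though each summand has Gaussian-flavoured ingredients.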

**Attribution**
*Source: Link, Question Author: Gabriel Romon, Answer Author: Sextus Empiricus*