Assume that $X = X_1 + X_2 + \cdots + X_n$ where $X_i \sim N(0, \sigma^2)$ are independent.
My question is, what distribution does
$$Z = \frac{X^2}{X_1^2 + X_2^2 + \cdots + X_n^2}$$
follow? I know from here that the ratio of two chi-squared random variables expressed as $\frac{W}{W+Y}$ follows a Beta distribution. I think that this assumes independence between $W$ and $Y$. In my case though, the denominator of $Z$ contains the components of $X$ squared.
I think Z must also follow a variation of the Beta distribution but I am not sure. And if this assumption is correct, I don’t know how to prove it.
Answer
This post elaborates on the answers in the comments to the question.
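Let $\mathbf{X} = (X_1, X_2, \ldots, X_n)$. Fix any $\mathbf{e}_1 \in \mathbb{R}^n$ of unit length. Such a vector may always be completed to an orthonormal basis $(\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n)$ (by means of the Gram-Schmidt process, for instance). This change of basis (from the usual one) is orthogonal: it does not change lengths and, because the $X_i$ are iid Normal with mean zero, it does not change the distribution of $\mathbf{X}$ either. Thus the distribution of
$$\frac{(\mathbf{e}_1 \cdot \mathbf{X})^2}{\|\mathbf{X}\|^2} = \frac{(\mathbf{e}_1 \cdot \mathbf{X})^2}{X_1^2 + X_2^2 + \cdots + X_n^2}$$
does not depend on $\mathbf{e}_1$. Taking $\mathbf{e}_1 = (1, 0, 0, \ldots, 0)$ shows this has the same distribution as
$$\frac{X_1^2}{X_1^2 + X_2^2 + \cdots + X_n^2}. \tag{1}$$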
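The invariance claim can be checked numerically. The following is a minimal sketch, not part of the original argument: it completes an arbitrary unit vector to an orthonormal basis with `qr` (used here as a stand-in for Gram-Schmidt), confirms that the change of basis preserves squared lengths, and compares quantiles of the ratio for that direction with those for the first coordinate axis. The names `e1`, `Q`, `ratio_e1` and `ratio_axis`, the choice of direction, and the values $n = 5$, $\sigma = 1$ and 10,000 draws are assumptions made for illustration.

```r
set.seed(17)
n <- 5
n_sim <- 1e4

# An arbitrary direction, normalized to unit length.
e1 <- c(3, -1, 2, 0, 1)
e1 <- e1 / sqrt(sum(e1^2))

# Complete e1 to an orthonormal basis: the full Q factor of the QR
# decomposition of the n x 1 matrix e1 is orthogonal, with first column +/- e1.
Q <- qr.Q(qr(matrix(e1, ncol = 1)), complete = TRUE)
orthonormal <- max(abs(crossprod(Q) - diag(n))) < 1e-12
first_is_e1 <- abs(abs(sum(Q[, 1] * e1)) - 1) < 1e-12

# Each column of x is one draw of (X_1, ..., X_n) with sigma = 1.
x <- matrix(rnorm(n * n_sim), nrow = n)
sq_norm <- colSums(x^2)

# Coordinates in the new basis are t(Q) %*% x; squared lengths are unchanged.
lengths_preserved <- max(abs(colSums(crossprod(Q, x)^2) - sq_norm)) < 1e-8

ratio_e1   <- as.vector(crossprod(e1, x))^2 / sq_norm  # (e1 . X)^2 / ||X||^2
ratio_axis <- x[1, ]^2 / sq_norm                       # X_1^2 / ||X||^2

# Same distribution (not the same values): compare a few sample quantiles.
probs <- c(0.1, 0.25, 0.5, 0.75, 0.9)
print(rbind(e1 = quantile(ratio_e1, probs), axis = quantile(ratio_axis, probs)))
```

The two rows of quantiles should agree up to simulation error, whatever unit vector is substituted for `e1`.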
Since the $X_i$ are iid Normal, they may be written as $\sigma$ times iid standard Normal variables $Y_1, \ldots, Y_n$, and their squares are $\sigma^2$ times $\Gamma(1/2)$ distributions (Gamma distributions with shape parameter $1/2$ and a common scale). Since the sum of $n-1$ independent $\Gamma(1/2)$ distributions is $\Gamma((n-1)/2)$, we have determined that the distribution of $(1)$ is that of
$$\frac{\sigma^2 U}{\sigma^2 U + \sigma^2 V} = \frac{U}{U + V}$$
where $U = X_1^2/\sigma^2 \sim \Gamma(1/2)$ and $V = (X_2^2 + \cdots + X_n^2)/\sigma^2 \sim \Gamma((n-1)/2)$ are independent. It is well known that this ratio has a $\text{Beta}(1/2, (n-1)/2)$ distribution. (Also see the closely related thread at Distribution of $XY$ if $X \sim \text{Beta}(1, K-1)$ and $Y \sim$ chi-squared with $2K$ degrees.)
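The Gamma-to-Beta step can be illustrated with a short simulation. This is only a sketch under assumed settings ($n = 6$, 100,000 draws, and a scale of 2 so that the Gamma variables are chi-squared); the variable names are hypothetical. It draws independent Gamma variables with shapes $1/2$ and $(n-1)/2$, forms the ratio, and measures its Kolmogorov-Smirnov distance from the claimed Beta distribution.

```r
set.seed(1)
n <- 6
n_sim <- 1e5

# A squared standard Normal is chi-squared(1), i.e. Gamma(shape = 1/2, scale = 2).
u <- rgamma(n_sim, shape = 1/2, scale = 2)        # role of X_1^2 / sigma^2
v <- rgamma(n_sim, shape = (n - 1)/2, scale = 2)  # role of (X_2^2 + ... + X_n^2) / sigma^2
r <- u / (u + v)

# Kolmogorov-Smirnov distance between the sample and Beta(1/2, (n-1)/2).
ks <- ks.test(r, "pbeta", 1/2, (n - 1)/2)
print(ks$statistic)

# First two moments against the Beta values a/(a+b) and ab/((a+b)^2 (a+b+1)).
a <- 1/2
b <- (n - 1)/2
print(c(mean = mean(r), beta_mean = a / (a + b)))
print(c(var = var(r), beta_var = a * b / ((a + b)^2 * (a + b + 1))))
```

Any common scale would do in place of 2, because it cancels in the ratio just as $\sigma^2$ does.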
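Since
$$X_1 + \cdots + X_n = (1, 1, \ldots, 1) \cdot (X_1, X_2, \ldots, X_n) = \sqrt{n}\, \mathbf{e}_1 \cdot \mathbf{X}$$
for the unit vector $\mathbf{e}_1 = (1, 1, \ldots, 1)/\sqrt{n}$, we conclude that $Z$ is $(\sqrt{n})^2 = n$ times a $\text{Beta}(1/2, (n-1)/2)$ variate. For $n \ge 2$ it therefore has density function
$$f_Z(z) = \frac{n^{1 - n/2}}{B\left(\frac{1}{2}, \frac{n-1}{2}\right)} \sqrt{\frac{(n - z)^{n-3}}{z}}$$
on the interval $(0, n)$ (and otherwise is zero).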
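As a sanity check on the density formula, here is a small sketch (the function name `f_Z` and the evaluation grid are my own choices, not part of the derivation). It codes the formula directly, compares it with the density of $n$ times a Beta variate written as `dbeta(z/n, ...)/n`, and integrates it numerically over $(0, n)$.

```r
# Density of Z as given by the closed-form expression; zero outside (0, n).
f_Z <- function(z, n) {
  out <- numeric(length(z))
  inside <- z > 0 & z < n
  zi <- z[inside]
  out[inside] <- n^(1 - n/2) / beta(1/2, (n - 1)/2) * sqrt((n - zi)^(n - 3) / zi)
  out
}

ns <- c(2, 3, 10)
gaps <- totals <- numeric(length(ns))
for (i in seq_along(ns)) {
  n <- ns[i]
  z <- seq(0.05, n - 0.05, length.out = 200)
  # Largest discrepancy from the rescaled Beta density on the grid.
  gaps[i] <- max(abs(f_Z(z, n) - dbeta(z / n, 1/2, (n - 1)/2) / n))
  # Total probability; the integrand has integrable endpoint singularities.
  totals[i] <- integrate(f_Z, lower = 0, upper = n, n = n)$value
}
print(data.frame(n = ns, max_gap = gaps, integral = totals))
```

The discrepancies should be at the level of floating-point rounding and the integrals should be close to 1.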
As a check, I simulated 100,000 independent realizations of $Z$ for $\sigma = 1$ and $n = 2, 3, 10$, plotted their histograms, and superimposed the graph of the corresponding Beta density (in red). The agreement is excellent in every case.
Here is the `R` code. It carries out the simulation by means of the formula `sum(x)^2 / sum(x^2)` for $Z$, where `x` is a vector of length `n` generated by `rnorm`. The rest is just looping (`for`, `apply`) and plotting (`hist`, `curve`).
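for (n in c(2, 3, 10)) {
  # Each column is one sample of size n; apply() computes Z for every column.
  z <- apply(matrix(rnorm(n*1e5), nrow=n), 2, function(x) sum(x)^2 / sum(x^2))
  # Histogram on the density scale over the support (0, n).
  hist(z, freq=FALSE, breaks=seq(0, n, length.out=50), main=paste("n =", n), xlab="Z")
  # Density of n * Beta(1/2, (n-1)/2): rescale the argument and divide by n.
  curve(dbeta(x/n, 1/2, (n-1)/2)/n, add=TRUE, col="Red", lwd=2)
}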
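A non-graphical version of the same check is sketched below. It relies on the standard Beta moments, which give $E[Z] = 1$ and $\operatorname{Var}(Z) = 2(n-1)/(n+2)$ for $n$ times a $\text{Beta}(1/2, (n-1)/2)$ variate, and it uses $\sigma = 3$ to illustrate that $\sigma$ cancels. The helper `sim_z`, the seed, and the value of $\sigma$ are assumptions made for illustration.

```r
set.seed(2024)

# Simulate n_sim realizations of Z = (sum of X_i)^2 / (sum of X_i^2).
sim_z <- function(n, sigma, n_sim = 1e5) {
  x <- matrix(rnorm(n * n_sim, sd = sigma), nrow = n)
  colSums(x)^2 / colSums(x^2)
}

# One row per n: sample mean and variance next to the theoretical variance.
results <- t(sapply(c(2, 3, 10), function(n) {
  z <- sim_z(n, sigma = 3)
  c(n = n, mean = mean(z), var = var(z), var_theory = 2 * (n - 1) / (n + 2))
}))
print(results)
```

The sample means should be near 1 and the sample variances near the `var_theory` column, up to simulation error.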
Attribution
Source : Link , Question Author : x0dros , Answer Author : Community