Suppose X_1, X_2,…,X_n be i.i.d from N(\mu,\sigma^2) with unknown \mu \in \mathcal R and \sigma^2>0

Let Z=\frac{X_1-\bar{X}}{S}, where S is the standard deviation here.

It can be shown that Z has the Lebesgue p.d.f.

f(z)=\frac{\sqrt{n}\, \Gamma\left(\frac{n-1}{2}\right)}{\sqrt{\pi}(n-1)\Gamma\left(\frac{n-2}{2}\right)}\left[1-\frac{nz^2}{(n-1)^2}\right]^{n/2-2}I_{(0,(n-1)/\sqrt{n})}(|z|)

My question is then how to get this pdf?The question is from here in example 3.3.4 to find the UMVUE of P(X_1 \le c). I can understand the logic and procedures to find the UMVUE but don’t know how to get the pdf.

I think this question also relate to this one.

Thank you very much for help or point to any related references will be also appropriated.

**Answer**

What is so intriguing about this result is how much it looks like the distribution of a correlation coefficient. There’s a reason.

Suppose (X,Y) is bivariate normal with zero correlation and common variance \sigma^2 for both variables. Draw an iid sample (x_1,y_1), \ldots, (x_n,y_n). It is well known, and readily established geometrically (as Fisher did a century ago) that the distribution of the sample correlation coefficient

r = \frac{\sum_{i=1}^n(x_i – \bar x)(y_i – \bar y)}{(n-1) S_x S_y}

is

f(r) = \frac{1}{B\left(\frac{1}{2}, \frac{n}{2}-1\right)}\left(1-r^2\right)^{n/2-2},\ -1 \le r \le 1.

(Here, as usual, \bar x and \bar y are sample means and S_x and S_y are the square roots of the unbiased variance estimators.) B is the Beta function, for which

\frac{1}{B\left(\frac{1}{2}, \frac{n}{2}-1\right)} = \frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{n}{2}-1\right)} = \frac{\Gamma\left(\frac{n-1}{2}\right)}{\sqrt{\pi}\Gamma\left(\frac{n}{2}-1\right)} . \tag{1}

To compute r, we may exploit its invariance under rotations in \mathbb{R}^n around the line generated by (1,1,\ldots, 1), along with the invariance of the distribution of the sample under the same rotations, and *choose y_i/S_y to be any unit vector whose components sum to zero.* One such vector is proportional to v = (n-1, -1, \ldots, -1). Its standard deviation is

S_v = \sqrt{\frac{1}{n-1}\left((n-1)^2 + (-1)^2 + \cdots + (-1)^2\right)} = \sqrt{n}.

Consequently, r must have the same distribution as

\frac{\sum_{i=1}^n(x_i – \bar x)(v_i – \bar v)}{(n-1) S_x S_v}

= \frac{(n-1)x_1 – x_2-\cdots-x_n}{(n-1) S_x \sqrt{n}}

= \frac{n(x_1 – \bar x)}{(n-1) S_x \sqrt{n}} = \frac{\sqrt{n}}{n-1}Z.

Therefore all we need to do is rescale r to find the distribution of Z:

f_Z(z) = \bigg|\frac{\sqrt{n}}{n-1}\bigg| f\left(\frac{\sqrt{n}}{n-1}z\right) = \frac{1}{B\left(\frac{1}{2}, \frac{n}{2}-1\right)} \frac{\sqrt{n}}{n-1}\left(1- \frac{n}{(n-1)^2}z^2\right)^{n/2-2}

for |z| \le \frac{n-1}{\sqrt{n}}. Formula (1) shows this is identical to that of the question.

Not entirely convinced? Here is the result of simulating this situation 100,000 times (with n=4, where the distribution is uniform).

The first histogram plots the correlation coefficients of (x_i,y_i),i=1,\ldots,4 while the second histogram plots the correlation coefficients of (x_i,v_i),i=1,\ldots,4) *for a randomly chosen vector v_i that remains fixed for all iterations.* They are both uniform. The QQ-plot on the right confirms these distributions are essentially identical.

Here’s the `R`

code that produced the plot.

```
n <- 4
n.sim <- 1e5
set.seed(17)
par(mfrow=c(1,3))
#
# Simulate spherical bivariate normal samples of size n each.
#
x <- matrix(rnorm(n.sim*n), n)
y <- matrix(rnorm(n.sim*n), n)
#
# Look at the distribution of the correlation of `x` and `y`.
#
sim <- sapply(1:n.sim, function(i) cor(x[,i], y[,i]))
hist(sim)
#
# Specify *any* fixed vector in place of `y`.
#
v <- c(n-1, rep(-1, n-1)) # The case in question
v <- rnorm(n) # Can use anything you want
#
# Look at the distribution of the correlation of `x` with `v`.
#
sim2 <- sapply(1:n.sim, function(i) cor(x[,i], v))
hist(sim2)
#
# Compare the two distributions.
#
qqplot(sim, sim2, main="QQ Plot")
```

### Reference

R. A. Fisher, Frequency-distribution of the values of the correlation coefficient in samples from an indefinitely large population. *Biometrika*, **10**, 507. See Section 3. (Quoted in *Kendall’s Advanced Theory of Statistics*, 5th Ed., section 16.24.)

**Attribution***Source : Link , Question Author : Deep North , Answer Author : kjetil b halvorsen*