I want to test a sample correlation $r$ for significance, using p-values, that is

$H_0: \rho = 0, \; H_1: \rho \neq 0.$

I understand that I can use Fisher’s z-transform to do this by computing

$z_{obs}= \displaystyle\frac{\sqrt{n-3}}{2}\ln\left(\displaystyle\frac{1+r}{1-r}\right)$

and finding the p-value by

$p = 2P\left(Z>|z_{obs}|\right)$

using the standard normal distribution.
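To make the calculation concrete, here is a minimal sketch in Stata for one made-up case. The values `r_obs = 0.6` and `n_obs = 8` are purely illustrative assumptions, not data from my study. It uses the fact that $\frac{1}{2}\ln\left(\frac{1+r}{1-r}\right)$ is `atanh(r)`, and takes the absolute value of $z_{obs}$ so that the p-value is two-sided whatever the sign of $r$.

```stata
// Illustrative values only (assumed, not real data)
scalar r_obs = 0.6
scalar n_obs = 8

// Fisher z statistic: sqrt(n-3) * atanh(r)
scalar z_obs = sqrt(n_obs - 3) * atanh(r_obs)

// Two-sided p-value from the standard normal distribution
scalar p_val = 2 * normal(-abs(z_obs))

display "z_obs = " %6.4f z_obs "   p = " %6.4f p_val
```

With these assumed inputs the p-value comes out at roughly 0.12, so the correlation would not be significant at the 5% level.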

My question is: how large should $n$ be for this to be an appropriate transformation? Obviously, $n$ must be larger than 3. My textbook does not mention any restrictions, but slide 29 of this presentation says that $n$ must be larger than 10. For the data I will be considering, I will have something like $5 \leq n \leq 10$.

**Answer**

For questions like these I would just run a simulation and see if the $p$-values behave as I expect them to. The $p$-value is the probability of randomly drawing a sample that deviates at least as much from the null hypothesis as the data you observed, if the null hypothesis is true. So if we had many such samples, all drawn with the null hypothesis true, and one of them had a $p$-value of .04, then we would expect 4% of those samples to have a $p$-value less than .04. The same is true for all other possible $p$-values.
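As a quick numeric version of that idea, here is a minimal sketch that checks the .04 example for a single sample size. The program name `fisher_p`, the sample size of 5, the seed, and the 20,000 replications are all arbitrary choices made for illustration. If the $p$-values behave as they should, the reported share should be close to .04.

```stata
clear all
set seed 12345

// Hypothetical helper: draw one sample of 5 observations with
// independent x and y (so the null hypothesis is true) and return
// the two-sided Fisher z p-value.
program define fisher_p, rclass
    drop _all
    set obs 5
    gen x = rnormal()
    gen y = rnormal()
    quietly corr x y
    return scalar p = 2*normal(-abs(atanh(r(rho))*sqrt(r(N)-3)))
end

// Repeat many times and collect the p-values in a variable called p
simulate p = r(p), reps(20000) nodots: fisher_p

// Share of simulated p-values below the nominal value .04
quietly count if p < .04
display "share of p-values below .04: " %6.4f r(N)/_N
```

Changing `.04` to other nominal values, or `5` to other sample sizes, repeats the check at other points; doing this for every nominal value at once is what a graph of the $p$-values does.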

Below is a simulation in Stata. The graphs check whether the $p$-values measure what they are supposed to measure, that is, they show how much the proportion of samples with $p$-values less than the nominal $p$-value deviates from the nominal $p$-value. As you can see, the test is somewhat problematic with such a small number of observations. Whether or not it is too problematic for your research is your judgement call.

```
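clear all
set more off

// One replication: for each sample size, draw independent x and y
// (so rho = 0 holds) and store the two-sided Fisher z p-value.
program define sim, rclass
    tempname z se
    foreach i of numlist 5/10 20(10)50 {
        drop _all
        set obs `i'
        gen x = rnormal()
        gen y = rnormal()
        corr x y
        scalar `z' = atanh(r(rho))       // Fisher z-transform of r
        scalar `se' = 1/sqrt(r(N)-3)     // approximate standard error
        return scalar p`i' = 2*normal(-abs(`z'/`se'))
    }
end

// Collect the p-values from 200,000 replications
simulate p5 =r(p5) p6 =r(p6) p7 =r(p7) ///
         p8 =r(p8) p9 =r(p9) p10 =r(p10) ///
         p20=r(p20) p30=r(p30) p40 =r(p40) ///
         p50=r(p50), reps(200000) nodots: sim

// simpplot is a user-written command (ssc install simpplot)
simpplot p5 p6 p7 p8 p9 p10, name(small, replace) ///
    scheme(s2color) ylabel(,angle(horizontal))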
```

```
simpplot p20 p30 p40 p50 , name(less_small, replace) ///
scheme(s2color) ylabel(,angle(horizontal))
```

**Attribution** *Source: Link, Question Author: Gunnhild, Answer Author: Maarten Buis*