Prove the equivalence of the following two formulas for Spearman correlation

According to Wikipedia, Spearman’s rank correlation is calculated by converting the variables $X_i$ and $Y_i$ into ranked variables $x_i$ and $y_i$, and then calculating Pearson’s correlation between the ranked variables:

$ \rho = \frac{\sum_i(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_i (x_i-\bar{x})^2 \sum_i(y_i-\bar{y})^2}}$

However, the article goes on to state that if there are no ties amongst the variables $X_i$ and $Y_i$, the above formula is equivalent to

$ \rho = 1- {\frac {6 \sum d_i^2}{n(n^2 - 1)}}$

where $d_i = y_i - x_i$ is the difference in ranks.

Can someone give a proof of this please? I don’t have access to the textbooks referenced by the Wikipedia article.

Answer

$ \rho = \frac{\sum_i(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_i (x_i-\bar{x})^2 \sum_i(y_i-\bar{y})^2}}$

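Since there are no ties, the $x$’s and $y$’s are each a permutation of the integers from $1$ to $n$ inclusive. In particular, $\bar{x} = \bar{y} = \frac{n+1}{2}$ and $\sum_i (x_i-\bar{x})^2 = \sum_i (y_i-\bar{y})^2$.

Hence the square root in the denominator simplifies:

$ \rho = \frac{\sum_i(x_i-\bar{x})(y_i-\bar{y})}{\sum_i (x_i-\bar{x})^2}$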

But the denominator is just a function of $n$:

$\sum_i (x_i-\bar{x})^2 = \sum_i x_i^2 - n\bar{x}^2 \\
\quad= \frac{n(n + 1)(2n + 1)}{6} - n\left(\frac{n + 1}{2}\right)^2\\
\quad= n(n + 1)\left(\frac{2n + 1}{6} - \frac{n + 1}{4}\right)\\
\quad= n(n + 1)\left(\frac{8n + 4-6n-6}{24}\right)\\
\quad= n(n + 1)\left(\frac{n -1}{12}\right)\\
\quad= \frac{n(n^2 - 1)}{12}$
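As a sanity check on the algebra above, here is a minimal Python sketch that compares the directly computed sum of squared deviations of the ranks $1,\dots,n$ with the closed form $n(n^2-1)/12$. The function names `sum_sq_dev` and `closed_form` are made up for this illustration, and exact fractions are used so no rounding is involved.

```python
from fractions import Fraction


def sum_sq_dev(n):
    """Sum of squared deviations of the ranks 1..n from their mean (n + 1) / 2."""
    mean = Fraction(n + 1, 2)
    return sum((Fraction(r) - mean) ** 2 for r in range(1, n + 1))


def closed_form(n):
    """Closed form n(n^2 - 1) / 12 from the derivation above."""
    return Fraction(n * (n * n - 1), 12)


# Compare the two expressions for a range of sample sizes.
for n in range(1, 50):
    assert sum_sq_dev(n) == closed_form(n)
```

This only checks finitely many values of $n$, so it supports the derivation rather than replacing it.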

Now let’s look at the numerator, using $\bar{x} = \bar{y} = \frac{n+1}{2}$, $\sum_i x_i^2 = \frac{n(n+1)(2n+1)}{6}$, and $\sum_i x_i^2 = \sum_i y_i^2$:

$\sum_i(x_i-\bar{x})(y_i-\bar{y})\\
\quad=\sum_i x_i(y_i-\bar{y})-\sum_i\bar{x}(y_i-\bar{y}) \\
\quad=\sum_i x_i y_i-\bar{y}\sum_i x_i-\bar{x}\sum_iy_i+n\bar{x}\bar{y} \\
\quad=\sum_i x_i y_i-n\bar{x}\bar{y} \\
\quad= \sum_i x_i y_i-n\left(\frac{n+1}{2}\right)^2 \\
\quad= \sum_i x_i y_i- \frac{n(n+1)}{12}\cdot 3(n +1) \\
\quad= \frac{n(n+1)}{12}\cdot(-3(n +1))+\sum_i x_i y_i \\
\quad= \frac{n(n+1)}{12}\cdot[(n-1) - (4n+2)] + \sum_i x_i y_i \\
\quad= \frac{n(n+1)(n-1)}{12} - \frac{n(n+1)(2n+1)}{6} + \sum_i x_i y_i \\
\quad= \frac{n(n+1)(n-1)}{12} -\sum_i x_i^2+ \sum_i x_i y_i \\
\quad= \frac{n(n+1)(n-1)}{12} -\sum_i (x_i^2+ y_i^2)/2+ \sum_i x_i y_i \\
\quad= \frac{n(n+1)(n-1)}{12} - \sum_i (x_i^2 - 2x_i y_i + y_i^2) /2\\
\quad= \frac{n(n+1)(n-1)}{12} - \sum_i(x_i - y_i)^2/2\\
\quad= \frac{n(n^2-1)}{12} - \sum_i d_i^2/2$
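The numerator identity can be spot-checked the same way. The sketch below, with hypothetical helper names `centered_cross_product` and `numerator_closed_form`, draws random permutations of $1,\dots,n$ (so there are no ties) and confirms that the centered cross product equals $n(n^2-1)/12 - \sum_i d_i^2/2$ exactly.

```python
import random
from fractions import Fraction


def centered_cross_product(x, y):
    """Sum of (x_i - mean(x)) * (y_i - mean(y)), computed exactly."""
    n = len(x)
    mx = Fraction(sum(x), n)
    my = Fraction(sum(y), n)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))


def numerator_closed_form(x, y):
    """n(n^2 - 1)/12 - sum(d_i^2)/2, valid when x and y are permutations of 1..n."""
    n = len(x)
    d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return Fraction(n * (n * n - 1), 12) - Fraction(d2, 2)


rng = random.Random(0)  # fixed seed so the check is reproducible
for n in range(2, 12):
    x = list(range(1, n + 1))
    y = x[:]
    rng.shuffle(y)  # a random permutation of the ranks, hence no ties
    assert centered_cross_product(x, y) == numerator_closed_form(x, y)
```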

Numerator/Denominator

$= \frac{n(n+1)(n-1)/12 - \sum d_i^2/2}{n(n^2 - 1)/12}\\
\quad= {\frac {n(n^2 - 1)/12 -\sum d_i^2/2}{n(n^2 - 1)/12}}\\
\quad= 1- {\frac {6 \sum d_i^2}{n(n^2 - 1)}}\,$.

Hence

$ \rho = 1- {\frac {6 \sum d_i^2}{n(n^2 - 1)}}.$
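To see the result end to end, here is a small Python sketch that ranks two tie-free samples, computes Pearson’s correlation on the ranks, and compares it with the shortcut formula. The data in `X` and `Y` are made-up illustrative values, and `ranks`, `pearson`, and `spearman_shortcut` are hypothetical helper names, not functions from any library.

```python
import math


def ranks(values):
    """Rank distinct values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def spearman_shortcut(x, y):
    """1 - 6 * sum(d_i^2) / (n (n^2 - 1)), applied to ranks x and y."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Made-up sample data with no repeated values.
X = [10.5, 3.2, 7.7, 8.1, 1.0]
Y = [2.0, 9.4, 4.4, 6.3, 5.1]
x, y = ranks(X), ranks(Y)

assert math.isclose(pearson(x, y), spearman_shortcut(x, y))
```

For this particular data the ranks are `[5, 2, 3, 4, 1]` and `[1, 5, 2, 4, 3]`, giving $\sum d_i^2 = 30$ and $\rho = -0.5$ by either route. The `ranks` helper deliberately does not handle ties, since the shortcut formula is only claimed to be equivalent in the tie-free case.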

Attribution
Source: Link, Question Author: Alex, Answer Author: Glen_b
