When studying the means of two independent samples, we are told we are looking at the “difference of two means”. This means we take the mean from population 1 ($\bar{y}_1$) and subtract from it the mean from population 2 ($\bar{y}_2$). So, our “difference of two means” is $\bar{y}_1 - \bar{y}_2$.

When studying the means of paired samples, we are told we are looking at the “mean difference”, $\bar{d}$. This is calculated by taking the difference within each pair, and then taking the mean of all those differences.

My question is: do we get the same value for $\bar{y}_1 - \bar{y}_2$ and for $\bar{d}$ if we calculate both from the same two columns of data, treating the columns first as two independent samples and then as paired data? I have played around with two columns of data, and it seems that the values are the same! In that case, can it be said that the different names are used for purely non-quantitative reasons?

**Answer**

(I’m assuming you mean “sample” and not “population” in your first paragraph.)

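The equivalence is easy to show mathematically. Start with two samples of equal size, $\{x_1, \dots, x_n\}$ and $\{y_1, \dots, y_n\}$. Then define

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad \bar{d} = \frac{1}{n}\sum_{i=1}^{n} (x_i - y_i).$$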

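Then you have:

$$
\begin{aligned}
\bar{x} - \bar{y}
&= \left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) - \left(\frac{1}{n}\sum_{i=1}^{n} y_i\right) \\
&= \frac{1}{n}\left(\sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i\right) \\
&= \frac{1}{n}\bigl((x_1 + \dots + x_n) - (y_1 + \dots + y_n)\bigr) \\
&= \frac{1}{n}\bigl(x_1 + \dots + x_n - y_1 - \dots - y_n\bigr) \\
&= \frac{1}{n}\bigl(x_1 - y_1 + \dots + x_n - y_n\bigr) \\
&= \frac{1}{n}\bigl((x_1 - y_1) + \dots + (x_n - y_n)\bigr) \\
&= \frac{1}{n}\sum_{i=1}^{n} (x_i - y_i) \\
&= \bar{d}.
\end{aligned}
$$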
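For anyone who wants to check the identity numerically, here is a minimal Python sketch. The data values and the function names `difference_of_means` and `mean_difference` are made up for illustration and do not come from the question. The comparison uses `math.isclose` because the two routes round differently in floating point, so the results may differ in the last few digits even though they are algebraically equal.

```python
import math
from statistics import mean

# Hypothetical paired measurements (illustrative values only).
x = [5.1, 4.8, 6.0, 5.5, 4.9]
y = [4.7, 4.9, 5.2, 5.0, 4.4]


def difference_of_means(x, y):
    """Mean of x minus mean of y (the 'independent samples' quantity)."""
    return mean(x) - mean(y)


def mean_difference(x, y):
    """Mean of the pairwise differences x_i - y_i (the 'paired' quantity)."""
    if len(x) != len(y):
        raise ValueError("paired data requires samples of equal size")
    return mean([xi - yi for xi, yi in zip(x, y)])


diff_of_means = difference_of_means(x, y)
mean_of_diffs = mean_difference(x, y)

print(diff_of_means, mean_of_diffs)
# Equal up to floating-point rounding, as the algebra predicts.
print(math.isclose(diff_of_means, mean_of_diffs))
```

The length check reflects the assumption in the derivation: the samples must be the same size, because $\bar{d}$ is only defined when every $x_i$ has a matching $y_i$.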

**Attribution**

*Source: Link, Question Author: user84756, Answer Author: shadowtalker*