Why does Covariance measure only Linear dependence?

1) What is meant by linear dependence?

2) How can I convince myself that covariance measures linear dependence?

3) How can I convince myself that non-linear dependence is not measured by covariance?


A1) Say two variables X and Y are linearly dependent; then X = αY + c for some α, c ∈ ℝ.

A2) The formula for covariance is:

$$\operatorname{Cov}(X, Y) = E\big[(X - E[X])(Y - E[Y])\big]$$

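From A1, consider some linear relationship X = αY + c, where all we have is the observed data points (x_i, y_i) for the two variables. How do we get the value of α? It turns out we can instead ask the question, “how do we draw a line through these points so as to minimise the sum of squared differences between each point and the line?” When we do this analysis for two variables, we get a closed-form equation that looks like this:

$$\hat{\alpha} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

Please note that the numerator is the covariance of X and Y, and the denominator is the variance of Y (both up to the same factor of 1/n, which cancels). I.e.

$$\hat{\alpha} = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(Y)}$$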

Correlation (e.g. Pearson's) is the covariance normalised by the product of the two standard deviations, which puts it on a comparable scale between -1 and 1. So you see the entire measure proceeds from the analysis of how to fit a line to some data.
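To convince yourself numerically, here is a minimal sketch using NumPy. The data, the seed, and the values of the true slope and intercept are made up for illustration. It fits a line by least squares and compares the fitted slope with the ratio of covariance to variance:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice

# Hypothetical data: X depends linearly on Y, plus some noise.
alpha_true, c_true = 2.0, 1.0
y = rng.normal(size=1000)
x = alpha_true * y + c_true + rng.normal(scale=0.5, size=1000)

# Slope and intercept from a least-squares fit of x on y.
alpha_ls, c_ls = np.polyfit(y, x, deg=1)

# Slope from the closed form: Cov(X, Y) / Var(Y).
cov_xy = np.cov(x, y, ddof=1)[0, 1]
alpha_cov = cov_xy / np.var(y, ddof=1)

print("least-squares slope:", alpha_ls)
print("Cov(X, Y) / Var(Y): ", alpha_cov)
```

The two printed slopes should agree up to floating-point rounding, and both should land near the assumed true slope of 2.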

A3) Covariance doesn’t measure non-linear relationships for the exact same reason it measures linear ones. Namely, you can basically think of it as a scaled version of the slope in a linear equation (e.g. X = αY + c). When you try to fit a line to a curve, the sum of squared differences between the points and the line may be large, and the fitted slope can even be zero although the two variables are clearly related. Here is a good diagram illustrating the implications. The numbers indicate Pearson’s correlation coefficient, whilst the diagrams show the corresponding scatter plots.

[Image: scatter plots of several data sets, each labelled with its Pearson correlation coefficient]
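The same point can be checked with a small sketch, again assuming NumPy and a made-up data set. Here X is completely determined by Y through X = Y², yet because Y is symmetric around zero, the best-fitting line is flat and the covariance vanishes:

```python
import numpy as np

# Hypothetical data: y is spread symmetrically around zero.
y = np.linspace(-1.0, 1.0, 201)
x = y ** 2  # x is a deterministic, non-linear function of y

cov_xy = np.cov(x, y)[0, 1]        # sample covariance
corr_xy = np.corrcoef(x, y)[0, 1]  # Pearson's correlation coefficient

print("covariance: ", cov_xy)
print("correlation:", corr_xy)
```

Both values come out as zero up to floating-point rounding, even though knowing Y tells you X exactly. Note that this relies on the symmetry of Y: if you restrict y to positive values, the relationship is still non-linear but the covariance is no longer zero, because a line then captures part of the trend.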

Source : Link , Question Author : ColorStatistics , Answer Author : JP Zhang
