This may be a basic question, but I was wondering why an R value in a regression model can simply be squared to give a figure of explained variance?
I understand that the R coefficient gives the strength of a relationship, but I don’t understand how simply squaring this value gives a measure of explained variance.
Any easy explanation of this?
Thanks very much for helping with this!
Hand-wavingly, the correlation R can be thought of as a measure of the angle between two vectors, the dependent vector Y and the independent vector X (each with its mean subtracted).
If the angle between the vectors is θ, the correlation R is cos(θ).
The part of Y that is explained by X is the projection of Y on X: it is parallel to X and has length ||Y||cos(θ). The part that is not explained is orthogonal to X and has length ||Y||sin(θ). Since the squared length of a mean-centered vector is proportional to its variance, Pythagoras' theorem gives, in terms of variances,
$$\|Y\|^2 = \|Y\|^2\cos^2(\theta) + \|Y\|^2\sin^2(\theta)$$
where the first term on the right is the explained variance and the second the unexplained variance. The fraction that is explained is thus cos^2(θ) = R^2, not R.
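If it helps to see the geometry numerically, here is a rough sketch in Python with NumPy. The data are simulated purely for illustration (the slope 2.0, the noise level and the sample size are arbitrary assumptions, not anything from the question). It checks that the cosine of the angle between the centered vectors matches the correlation, and that the explained fraction of the squared length matches R squared.

```python
import numpy as np

# Simulated data, for illustration only (slope and noise are arbitrary choices).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)

# Center both vectors so that squared length is proportional to variance.
xc = x - x.mean()
yc = y - y.mean()

# Cosine of the angle between the centered vectors.
cos_theta = (xc @ yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))

# Ordinary Pearson correlation, for comparison.
r = np.corrcoef(x, y)[0, 1]

# Projection of yc on xc (the "explained" part) and the orthogonal remainder.
proj = (xc @ yc) / (xc @ xc) * xc
resid = yc - proj

# Fractions of the squared length of yc, i.e. fractions of the variance.
explained_fraction = (proj @ proj) / (yc @ yc)
unexplained_fraction = (resid @ resid) / (yc @ yc)

print("cos(theta) vs R:      ", cos_theta, r)
print("explained vs R^2:     ", explained_fraction, r ** 2)
print("explained + unexplained:", explained_fraction + unexplained_fraction)
```

The two fractions should sum to 1 (up to floating-point error), which is just the Pythagorean identity cos^2(θ) + sin^2(θ) = 1 applied to the centered vectors.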