I’m currently going through a slide set I have for “factor analysis” (PCA as far as I can tell).
In it, the “fundamental theorem of factor analysis” is derived, which claims that the correlation matrix of the data going into the analysis (R) can be recovered from the matrix of factor loadings (A): R = AA⊤.
This, however, confuses me. In PCA the matrix of “factor loadings” is given by the matrix of eigenvectors of the covariance/correlation matrix of the data (since we’re assuming that the data have been standardized, they are the same), with each eigenvector scaled to have length one. This matrix is orthogonal, thus AA⊤ = I, which is in general not equal to R.
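The claim above can be checked numerically. This is a minimal numpy sketch (the random standardized data and all variable names are illustrative, not from the slides): the matrix of unit-length eigenvectors V satisfies VV⊤ = I, which does not reproduce R.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 3))          # illustrative standardized data
R = np.corrcoef(X, rowvar=False)           # correlation matrix of the data

_, V = np.linalg.eigh(R)                   # columns of V: unit-norm eigenvectors of R

# V is orthogonal, so V V^T is the identity ...
assert np.allclose(V @ V.T, np.eye(3))
# ... which is in general not equal to R
assert not np.allclose(V @ V.T, R)
```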
This is a reasonable question (+1) that stems from the terminological ambiguity and confusion.
In the context of PCA, people often call principal axes (eigenvectors of the covariance/correlation matrix) “loadings”. This is sloppy terminology. What should rather be called “loadings” in PCA are the principal axes scaled by the square roots of the respective eigenvalues. Then the theorem you are referring to holds.
Indeed, if the eigen-decomposition of the correlation matrix is R = VSV⊤, where V is the matrix of eigenvectors (principal axes) and S is the diagonal matrix of eigenvalues, and if we define loadings as A = VS^{1/2}, then one can easily see that R = AA⊤. Moreover, the best rank-r approximation to the correlation matrix is given by the first r PCA loadings: R ≈ A_r A_r⊤.
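Here is a short numpy sketch of this construction (the random standardized data and the choice r = 2 are illustrative): eigen-decompose R, scale the eigenvectors by the square roots of the eigenvalues to get the loadings A, and verify that AA⊤ recovers R exactly, while the first r columns give a rank-r approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 4))          # illustrative standardized data
R = np.corrcoef(X, rowvar=False)           # correlation matrix

evals, V = np.linalg.eigh(R)               # eigen-decomposition R = V S V^T
order = np.argsort(evals)[::-1]            # sort eigenvalues in descending order
evals, V = evals[order], V[:, order]

A = V * np.sqrt(evals)                     # loadings: principal axes scaled by sqrt(eigenvalues)
assert np.allclose(A @ A.T, R)             # "fundamental theorem": R = A A^T

r = 2                                      # illustrative rank
A_r = A[:, :r]                             # first r columns of the loadings
R_approx = A_r @ A_r.T                     # best rank-r approximation: R ≈ A_r A_r^T
```

Note that `eigh` returns eigenvalues in ascending order, so they are reordered before truncating to the first r columns.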
Please see my answer here for more about reconstructing covariance matrices with factor analysis and PCA loadings.