# How to perform PCA for data of very high dimensionality?

To perform principal component analysis (PCA), you have to subtract the mean of each column from the data, compute the correlation coefficient matrix, and then find its eigenvectors and eigenvalues. Or rather, this is what I did to implement it in Python, but it only works with small matrices, because the function that computes the correlation coefficient matrix (`corrcoef`) can’t handle an array of very high dimensionality. Since I have to use it for images, my current implementation doesn’t really help me.
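For concreteness, the procedure described above could be sketched roughly as follows. This is only a minimal illustration, not the exact code in question: it assumes NumPy, a data matrix `D` with one sample per row ($n$ rows) and one feature per column ($p$ columns), and the function name `pca_naive` is made up for this example.

```python
import numpy as np

def pca_naive(D):
    """PCA via the p x p correlation matrix; D is assumed to have shape (n, p)."""
    # Center each column (feature) by subtracting its mean.
    X = D - D.mean(axis=0)
    # p x p correlation matrix between columns. This is the step that
    # becomes infeasible when p is large (e.g. one column per pixel).
    C = np.corrcoef(X, rowvar=False)
    # C is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(C)
    # Reorder so the largest eigenvalue (first principal component) comes first.
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Small synthetic example: 50 samples, 5 features.
rng = np.random.default_rng(0)
D_small = rng.normal(size=(50, 5))
small_vals, small_vecs = pca_naive(D_small)
```

With $p$ in the tens of thousands, the $p \times p$ matrix `C` alone would need several gigabytes, which is presumably where this approach breaks down for images.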

I’ve read that it’s possible to just take your data matrix $D$ and compute $DD^\top/n$ instead of $D^\top D/n$, but that doesn’t work for me. Well, I’m not exactly sure I understand what it means, beyond the fact that the result is supposed to be an $n \times n$ matrix instead of a $p\times p$ one (in my case $p\gg n$). I read about this in the eigenfaces tutorials, but none of them explained it in a way I could really follow.
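As far as I can tell, the idea rests on this identity: if $u$ is an eigenvector of $DD^\top/n$ with eigenvalue $\lambda > 0$, then $D^\top u$ is an eigenvector of $D^\top D/n$ with the same eigenvalue. A rough sketch of how that might be used is below. It is an illustration under assumptions, not a verified recipe: it assumes NumPy and a centered $n \times p$ matrix, and it works with the covariance-style matrix $D^\top D/n$ from the formula above rather than the correlation matrix. The name `pca_gram` and the parameter `k` are made up for this example.

```python
import numpy as np

def pca_gram(D, k):
    """Top-k principal axes of D (shape (n, p), p >> n) via the n x n matrix."""
    n = D.shape[0]
    # Center each column (feature).
    X = D - D.mean(axis=0)
    # n x n matrix instead of the p x p one.
    G = X @ X.T / n
    # G is symmetric; eigh returns eigenvalues in ascending order.
    eigvals, U = np.linalg.eigh(G)
    # Keep the k largest. Assumes k <= n - 1, since centering leaves
    # at most n - 1 nonzero eigenvalues.
    order = np.argsort(eigvals)[::-1][:k]
    eigvals, U = eigvals[order], U[:, order]
    # Map back: if G u = lam * u, then (X^T X / n)(X^T u) = lam * (X^T u).
    V = X.T @ U                      # shape (p, k)
    V /= np.linalg.norm(V, axis=0)   # rescale each axis to unit length
    return eigvals, V

# Synthetic example with p >> n: 10 samples, 200 features.
rng = np.random.default_rng(1)
D_wide = rng.normal(size=(10, 200))
lam, V = pca_gram(D_wide, k=3)
```

Here only a $10 \times 10$ eigenproblem is solved, yet the columns of `V` should be eigenvectors of the $200 \times 200$ matrix $D^\top D/n$.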

In short, is there a simple algorithmic description of this method so that I can follow it?