Quantifying how much “more correlation” a correlation matrix A contains compared to a correlation matrix B

I have two correlation matrices $A$ and $B$ (computed with Pearson's linear correlation coefficient through Matlab's corrcoef()). I would like to quantify how much “more correlation” $A$ contains compared to $B$. Is there any standard metric or test for that?

E.g. the correlation matrix

[image: a correlation matrix with high off-diagonal correlations]

contains “more correlation” than

[image: a correlation matrix with lower off-diagonal correlations]

I am aware of Box's M test, which is used to determine whether two or more covariance matrices are equal (it can be used for correlation matrices as well, since those are just the covariance matrices of standardized random variables).

Right now I am comparing $A$ and $B$ via the mean of the absolute values of their non-diagonal elements, i.e. $\frac{2}{n^2-n}\sum_{1 \leq i < j \leq n } \left | x_{i, j} \right |$ (the formula exploits the symmetry of the correlation matrix). I suspect there are cleverer metrics.
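
For concreteness, here is a minimal MATLAB sketch of this metric (the function name mean_abs_offdiag is my own):

function m = mean_abs_offdiag( C )
%MEAN_ABS_OFFDIAG Mean of the absolute values of the off-diagonal
%   elements of a square symmetric matrix C.
n = size(C, 1);
mask = triu(true(n), 1); % logical mask of the strict upper triangle
m = mean(abs(C(mask)));  % equals (2/(n^2-n)) * sum_{i<j} |x_ij| by symmetry
end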


Following Andy W’s comment on the matrix determinant, I ran an experiment to compare the metrics:

  • Mean of the absolute values of their non-diagonal elements: $\text{metric}_\text{mean}()$
  • Matrix determinant: $\text{metric}_\text{determinant}()$

Let $A$ and $B$ be two random symmetric matrices of dimension $10 \times 10$ with ones on the diagonal. The upper triangle (diagonal excluded) of $A$ is populated with random floats drawn uniformly from 0 to 1. The upper triangle (diagonal excluded) of $B$ is populated with random floats drawn uniformly from 0 to 0.9. I generate 10000 such pairs of matrices and do some counting:

  • $\text{metric}_\text{mean}(B) \leq \text{metric}_\text{mean}(A) $ 80.75% of the time
  • $\text{metric}_\text{determinant}(B) \leq \text{metric}_\text{determinant}(A)$ 63.01% of the time

Given these results, I would tend to think that $\text{metric}_\text{mean}()$ is the better metric.

Matlab code:

function [  ] = correlation_metric(  )
%CORRELATION_METRIC Test some metric for
%   http://stats.stackexchange.com/q/110416/12359 :
%   I have 2 correlation matrices A and B (using the Pearson's linear 
%   correlation coefficient through Matlab's corrcoef()).
%   I would like to quantify how much "more correlation"
%   A contains compared to B. Is there any standard metric or test for that?

% Experiments' parameters
runs = 10000;
matrix_dimension = 10;
mask = triu(true(matrix_dimension), 1); % strict upper triangle (off-diagonal) mask

%% Experiment 1
results = zeros(runs, 2);
for i = 1:runs
    M = generate_random_symmetric_matrix( matrix_dimension, 0.0, 1.0 );
    results(i, 1) = abs(det(M));        % metric_determinant
    results(i, 2) = mean(abs(M(mask))); % metric_mean: mean of |off-diagonal| elements
end
mean(results(:, 1))
mean(results(:, 2))


%% Experiment 2
results = zeros(runs, 6);
for i = 1:runs
    A = generate_random_symmetric_matrix( matrix_dimension, 0.0, 1.0 );
    results(i, 1) = abs(det(A));        % metric_determinant(A)
    results(i, 2) = mean(abs(A(mask))); % metric_mean(A)
    B = generate_random_symmetric_matrix( matrix_dimension, 0.0, 0.9 );
    results(i, 3) = abs(det(B));        % metric_determinant(B)
    results(i, 4) = mean(abs(B(mask))); % metric_mean(B)
    results(i, 5) = results(i, 1) > results(i, 3); % determinant ranks A above B
    results(i, 6) = results(i, 2) > results(i, 4); % mean ranks A above B
end

mean(results(:, 5)) % fraction of runs where the determinant ranks A above B
mean(results(:, 6)) % fraction of runs where the off-diagonal mean ranks A above B
boxplot(results(:, 1))
figure
boxplot(results(:, 2))


end

function [ random_symmetric_matrix ] = generate_random_symmetric_matrix( dimension, minimum, maximum )
% Based on http://www.mathworks.com/matlabcentral/answers/123643-how-to-create-a-symmetric-random-matrix
d = ones(dimension, 1); % The diagonal values (use rand(dimension, 1) for a random diagonal)
t = triu((maximum-minimum)*rand(dimension)+minimum, 1); % The upper triangular random values
random_symmetric_matrix = diag(d)+t+t.'; % Put them together in a symmetric matrix
end

Example of a generated $10 \times 10$ random symmetric matrix with ones on the diagonal:

>> random_symmetric_matrix

random_symmetric_matrix =

    1.0000    0.3984    0.1375    0.4372    0.2909    0.6172    0.2105    0.1737    0.2271    0.2219
    0.3984    1.0000    0.3836    0.1954    0.5077    0.4233    0.0936    0.2957    0.5256    0.6622
    0.1375    0.3836    1.0000    0.1517    0.9585    0.8102    0.6078    0.8669    0.5290    0.7665
    0.4372    0.1954    0.1517    1.0000    0.9531    0.2349    0.6232    0.6684    0.8945    0.2290
    0.2909    0.5077    0.9585    0.9531    1.0000    0.3058    0.0330    0.0174    0.9649    0.5313
    0.6172    0.4233    0.8102    0.2349    0.3058    1.0000    0.7483    0.2014    0.2164    0.2079
    0.2105    0.0936    0.6078    0.6232    0.0330    0.7483    1.0000    0.5814    0.8470    0.6858
    0.1737    0.2957    0.8669    0.6684    0.0174    0.2014    0.5814    1.0000    0.9223    0.0760
    0.2271    0.5256    0.5290    0.8945    0.9649    0.2164    0.8470    0.9223    1.0000    0.5758
    0.2219    0.6622    0.7665    0.2290    0.5313    0.2079    0.6858    0.0760    0.5758    1.0000

Answer

The determinant of the covariance matrix isn't a terrible idea, but you probably want to use the inverse of the determinant. Picture the contours (lines of equal probability density) of a bivariate distribution. You can think of the determinant as (approximately) measuring the volume enclosed by a given contour. A highly correlated set of variables then has less volume, because the contours are stretched so thin.

For example:
If $X \sim N(0, 1)$ and $Y = X + \epsilon$, where $\epsilon \sim N(0, 0.01)$ (variance $0.01$), then
$$
Cov (X, Y) = \begin{bmatrix}
1 & 1 \\
1 & 1.01
\end{bmatrix}
$$
so
$$
Corr (X, Y) \approx \begin{bmatrix}
1 & .995 \\
.995 & 1
\end{bmatrix}
$$
so the determinant is $\approx .0099$. On the other hand, if $X, Y$ are independent $N(0, 1)$, then the determinant is 1.
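
These determinants are easy to check numerically; a quick sketch in base MATLAB:

det([1 .995; .995 1]) % ~0.0099: the nearly dependent pair encloses almost no volume
det(eye(2))           % 1: the independent pair
% The inverse determinant suggested above (~100 vs. 1) grows with correlation.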

As any pair of variables becomes more nearly linearly dependent, the determinant approaches zero, since it is the product of the eigenvalues of the correlation matrix. So the determinant may not be able to distinguish between a single pair of nearly-dependent variables and many such pairs, which is unlikely to be the behavior you want. I would suggest simulating such a scenario. You could use a scheme like this:

  1. Fix a dimension P, an approximate rank r, and let s be a large constant
  2. Let A[1], …, A[r] be P-dimensional random vectors with entries drawn iid from a N(0, s) distribution
  3. Set Sigma = Identity(P)
  4. For i = 1..r: Sigma = Sigma + A[i] * A[i]^T
  5. Let rho be Sigma rescaled to a correlation matrix

Then rho will have approximate rank r, which determines how many nearly linearly independent variables you have. You can then see how the determinant reflects the approximate rank r and the scale s.
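
A minimal MATLAB sketch of this scheme (P, r, and s as in the list above; the specific values are illustrative):

% Build a correlation matrix with approximate rank r
P = 10;  % dimension
r = 2;   % approximate rank
s = 100; % large scale constant

Sigma = eye(P);                % step 3: start from the identity
for i = 1:r
    a = sqrt(s) * randn(P, 1); % step 2: random vector with entries iid N(0, s)
    Sigma = Sigma + a * a';    % step 4: add a rank-one component
end

d = sqrt(diag(Sigma));
rho = Sigma ./ (d * d');       % step 5: rescale Sigma to a correlation matrix

det(rho) % near zero: about r large eigenvalues, the remaining ones are small
eig(rho) % inspect the spectrum to see the approximate rank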

Attribution
Source: Link, Question Author: Franck Dernoncourt, Answer Author: Andrew M
