Given n random variables X1,…,Xn with joint probability distribution P(X1,…,Xn), the covariance matrix Cij=E[XiXj]−E[Xi]E[Xj] is positive semidefinite, i.e. its eigenvalues are nonnegative.
I am interested in the conditions on P that are necessary and/or sufficient for C to have m zero eigenvalues. For instance, a sufficient condition is that the random variables are linearly dependent: ∑iuiXi=0 for some real numbers ui, not all zero. For example, if P(X1,…,Xn)=δ(X1−X2)p(X2,…,Xn), then u=(1,−1,0,…,0) is an eigenvector of C with zero eigenvalue. If we have m independent linear constraints of this type on the Xi, that implies m zero eigenvalues.
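As a quick numerical illustration (a sketch using NumPy; the three-variable setup, sample size, and seed are my own illustrative choices, not from the question), imposing the constraint X1=X2 produces a covariance matrix with a zero eigenvalue whose eigenvector is proportional to (1,−1,0):

```python
import numpy as np

rng = np.random.default_rng(0)

# Three variables subject to the linear constraint X1 = X2:
# X2 and X3 are independent samples, and X1 is an exact copy of X2.
x2 = rng.normal(size=100_000)
x3 = rng.normal(size=100_000)
X = np.vstack([x2, x2, x3])        # rows are X1, X2, X3 with X1 - X2 = 0

C = np.cov(X)                      # 3x3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order

# The smallest eigenvalue is numerically zero, and its eigenvector is
# proportional to (1, -1, 0), encoding the constraint X1 - X2 = 0.
print(eigvals)
print(eigvecs[:, 0])
```

Here `eigh` is used because C is symmetric; it returns the eigenvalues in ascending order, so the zero eigenvalue (up to floating-point error) appears first.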
There is at least one additional (but trivial) possibility, when Xa=E[Xa] for some a (i.e. P(X1,…,Xn)∝δ(Xa−E[Xa])), since in that case Cij has a row and a column of zeros: Cia=Cai=0 for all i. As this case is not really interesting, I am assuming that the probability distribution is not of that form.
My question is: are linear constraints the only way to induce zero eigenvalues (if we exclude the trivial exception above), or can nonlinear constraints on the random variables also generate zero eigenvalues of C?
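For contrast, a purely nonlinear constraint need not produce a zero eigenvalue. A small sketch (the choice X2=X1², with X1 standard normal, is just one illustrative case of my own): although the two variables are deterministically related, their covariance matrix is close to diag(1,2) and hence nonsingular.

```python
import numpy as np

rng = np.random.default_rng(1)

# Nonlinear constraint: X2 = X1**2 exactly, yet no *linear* dependency.
x1 = rng.normal(size=200_000)
X = np.vstack([x1, x1**2])

C = np.cov(X)
eigvals = np.linalg.eigvalsh(C)

# Both eigenvalues are strictly positive (approximately 1 and 2 for a
# standard normal X1), so C is nonsingular despite the exact constraint.
print(eigvals)
```

This is consistent with the answer below: only linear dependencies (modulo constants) show up in the kernel of the covariance matrix.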
Answer
Perhaps by simplifying the notation we can bring out the essential ideas. It turns out we don't need to involve expectations or complicated formulas, because everything is purely algebraic.
The algebraic nature of the mathematical objects
The question concerns relationships between (1) the covariance matrix of a finite set of random variables X1,…,Xn and (2) linear relations among those variables, considered as vectors.
The vector space in question is the set of all finite-variance random variables (on any given probability space (Ω,P)) modulo the subspace of almost surely constant variables, denoted L2(Ω,P)/R. (That is, we consider two random variables X and Y to be the same vector when there is zero chance that X−Y differs from its expectation.) We are dealing only with the finite-dimensional vector space V generated by the Xi, which is what makes this an algebraic problem rather than an analytic one.
What we need to know about variances
V is more than just a vector space: it is a quadratic module, because it comes equipped with the variance. All we need to know about variances are two things:

The variance is a scalar-valued function Q with the property that Q(aX)=a²Q(X) for all vectors X and all real numbers a.

The variance is nondegenerate.
The second needs some explanation. Q determines a “dot product,” which is a symmetric bilinear form given by
X⋅Y=(1/4)(Q(X+Y)−Q(X−Y)).
(This is of course nothing other than the covariance of the variables X and Y.) Vectors X and Y are orthogonal when their dot product is 0. The orthogonal complement of any set of vectors A⊂V consists of all vectors orthogonal to every element of A, written
A0={v∈V∣a⋅v=0 for all a∈A}.
It is clearly a vector space. When V0={0}, Q is nondegenerate.
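The polarization formula above can be checked numerically. A small sketch (illustrative data of my own choosing, NumPy assumed) comparing (1/4)(Q(X+Y)−Q(X−Y)) against the covariance computed directly:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)   # y is correlated with x

# Covariance recovered from the quadratic form Q = variance by polarization:
dot = 0.25 * (np.var(x + y) - np.var(x - y))

# Covariance computed directly, with the same (population) normalization:
cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)

# The two agree up to floating-point error.
print(dot, cov_xy)
```

(Both quantities use the population convention, i.e. division by N, so the identity holds exactly in sample terms, not just in expectation.)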
Allow me to prove that the variance is indeed nondegenerate, even though it might seem obvious. Suppose X is a nonzero element of V0. This means X⋅Y=0 for all Y∈V; equivalently,
Q(X+Y)=Q(X−Y)
for all vectors Y. Taking Y=X gives
4Q(X)=Q(2X)=Q(X+X)=Q(X−X)=Q(0)=0
and thus Q(X)=0. However, we know (using Chebyshev’s Inequality, perhaps) that the only random variables with zero variance are almost surely constant, which identifies them with the zero vector in V, QED.
Interpreting the questions
Returning to the questions: in the preceding notation, the covariance matrix of the random variables is just the square array of all their dot products,
T=(Xi⋅Xj).
There is a good way to think about T: it defines a linear transformation on Rn in the usual way, by sending any vector x=(x1,…,xn)∈Rn to the vector T(x)=y=(y1,…,yn) whose ith component is given by the matrix multiplication rule
yi=∑j(Xi⋅Xj)xj, the sum running over j=1,…,n.
The kernel of this linear transformation is the subspace it sends to zero:
Ker(T)={x∈Rn∣T(x)=0}.
The foregoing equation implies that when x∈Ker(T), for every i
0=yi=∑j(Xi⋅Xj)xj=Xi⋅(∑jxjXj).
Since this is true for every i, it extends by linearity to every vector in the span of the Xi, namely all of V. Consequently, when x∈Ker(T), the vector ∑jxjXj lies in V0. Because the variance is nondegenerate, this means ∑jxjXj=0. That is, x describes a linear dependency among the n original random variables.
You can readily check that this chain of reasoning is reversible:
Linear dependencies among the Xj as vectors are in one-to-one correspondence with elements of the kernel of T.
(Remember, this statement considers the Xj as defined only up to a constant shift in location, that is, as elements of L2(Ω,P)/R, rather than as just random variables.)
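This correspondence can be seen concretely. In the following sketch (four illustrative variables with the single dependency X4=X1+2X2+3, a hypothetical choice of mine; the constant shift 3 is irrelevant since vectors live in L2(Ω,P)/R), the zero-eigenvalue eigenvector of the covariance matrix recovers the dependency, and the corresponding linear combination of the variables is almost surely constant:

```python
import numpy as np

rng = np.random.default_rng(3)

# Four variables with one linear dependency: X4 = X1 + 2*X2 + 3.
x1, x2, x3 = rng.normal(size=(3, 100_000))
X = np.vstack([x1, x2, x3, x1 + 2 * x2 + 3])

C = np.cov(X)
eigvals, eigvecs = np.linalg.eigh(C)   # ascending eigenvalues

# Exactly one eigenvalue is numerically zero; its eigenvector u encodes
# the dependency (u is proportional to (1, 2, 0, -1)).
u = eigvecs[:, 0]
combo = u @ X          # the random variable sum_j u_j X_j

print(eigvals[0])      # numerically zero
print(np.var(combo))   # numerically zero: the combination is a.s. constant
```

Note that `combo` itself is a nonzero constant (the shift 3 survives), which is precisely why dependencies must be read modulo constants, i.e. in L2(Ω,P)/R.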
Finally, by definition, an eigenvalue of T is any scalar λ for which there exists a nonzero vector x with T(x)=λx. When λ=0 is an eigenvalue, the space of associated eigenvectors is (obviously) the kernel of T.
Summary
We have arrived at the answer to the questions: the set of linear dependencies of the random variables, qua elements of L2(Ω,P)/R, is in one-to-one correspondence with the kernel of their covariance matrix T. This is so because the variance is a nondegenerate quadratic form. The kernel is also the eigenspace associated with the zero eigenvalue (or just the zero subspace when there is no zero eigenvalue).
Reference
I have largely adopted the notation and some of the language of Chapter IV in
Jean-Pierre Serre, A Course in Arithmetic. Springer-Verlag, 1973.
Attribution
Source: Link, Question Author: Adam, Answer Author: Community