Prove that the maximum entropy distribution with a fixed covariance matrix is a Gaussian

I’m trying to get my head around the following proof that the Gaussian has maximum entropy.

How does the starred step make sense? A specific covariance only fixes the second moments. What happens to the third, fourth, fifth moments, etc.?

[Image of the proof; the starred step asserts that $\int q(x)\log(p(x))\,dx = \int p(x)\log(p(x))\,dx$.]

Answer

The starred step is valid because (a) $p$ and $q$ have the same zeroth and second moments and (b) $\log(p)$ is a polynomial function of the components of $x$ whose terms have total degree 0 or 2.


You need to know only two things about a multivariate normal distribution with zero mean:

  1. $\log(p)$ is a quadratic function of $x=(x_1,x_2,\ldots,x_n)$ with no linear terms. Specifically, there are constants $C$ and $p_{ij}$ for which $\log(p(x)) = C + \sum_{i,j=1}^{n} p_{ij}\,x_i x_j$.

    (Of course $C$ and the $p_{ij}$ can be written in terms of $\Sigma$, but this detail does not matter.)

  2. $\Sigma$ gives the second moments of the distribution. That is, $\Sigma_{ij} = E_p(x_i x_j) = \int p(x)\,x_i x_j\,dx$.
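
As a sanity check (not part of the original answer), here is a minimal Python sketch that verifies fact 1 numerically for a hypothetical covariance matrix, using the standard zero-mean Gaussian log-density with $P = -\tfrac{1}{2}\Sigma^{-1}$ and $C = -\tfrac{1}{2}\log\big((2\pi)^n \det \Sigma\big)$:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)   # a hypothetical symmetric positive definite covariance

# For a zero-mean Gaussian, log p(x) = C + sum_ij P_ij x_i x_j, with
# P = -Sigma^{-1}/2 and C = -log((2*pi)^n * det(Sigma)) / 2.
P = -0.5 * np.linalg.inv(Sigma)
C = -0.5 * (n * np.log(2 * np.pi) + np.log(np.linalg.det(Sigma)))

p = multivariate_normal(mean=np.zeros(n), cov=Sigma)
x = rng.standard_normal(n)
print(p.logpdf(x))      # scipy's log-density
print(C + x @ P @ x)    # the quadratic form; the two values agree
```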

We may use this information to work out an integral:

$$\int \big(q(x)-p(x)\big)\log(p(x))\,dx = \int \big(q(x)-p(x)\big)\left(C + \sum_{i,j=1}^{n} p_{ij}\,x_i x_j\right) dx.$$

It breaks into the sum of two parts:

  • $\int \big(q(x)-p(x)\big)\,C\,dx = C\left(\int q(x)\,dx - \int p(x)\,dx\right) = C(1-1) = 0$, because both $q$ and $p$ are probability density functions.

  • $\int \big(q(x)-p(x)\big)\sum_{i,j=1}^{n} p_{ij}\,x_i x_j\,dx = \sum_{i,j=1}^{n} p_{ij} \int \big(q(x)-p(x)\big)\,x_i x_j\,dx = 0$, because each pair of integrals on the right-hand side, $\int q(x)\,x_i x_j\,dx$ and $\int p(x)\,x_i x_j\,dx$, has the same value (to wit, $\Sigma_{ij}$). This is what the remark “yield the same moments of the quadratic form” is intended to say.
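
To make this concrete (an illustrative sketch, not from the original answer), we can draw samples from the Gaussian $p$ and from a non-Gaussian $q$ with the same mean and covariance, and compare Monte Carlo averages of $\log p$ under each. The choice of $q$ below, a linearly transformed Laplace distribution, is just one hypothetical example:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)   # a hypothetical SPD covariance

p = multivariate_normal(mean=np.zeros(n), cov=Sigma)
xp = p.rvs(size=500_000, random_state=rng)          # samples from p

# q: independent unit-variance Laplace coordinates, transformed so that
# Cov = L L^T = Sigma. q is not Gaussian, but it matches p's zeroth and
# second moments, which is all the argument above uses.
L = np.linalg.cholesky(Sigma)
z = rng.laplace(size=(500_000, n)) / np.sqrt(2)     # variance 1 per coordinate
xq = z @ L.T

print(p.logpdf(xp).mean())   # Monte Carlo estimate of E_p[log p]
print(p.logpdf(xq).mean())   # Monte Carlo estimate of E_q[log p]; agrees up to noise
```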

The result follows immediately: since $\int \big(q(x)-p(x)\big)\log(p(x))\,dx = 0$, we conclude that $\int q(x)\log(p(x))\,dx = \int p(x)\log(p(x))\,dx$.
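
For context (this step is standard, though not shown in the excerpt), the identity is exactly what finishes the maximum-entropy argument: for any density $q$ with the same covariance $\Sigma$, nonnegativity of the Kullback–Leibler divergence gives

$$0 \le \int q(x)\log\frac{q(x)}{p(x)}\,dx = \int q(x)\log(q(x))\,dx - \int q(x)\log(p(x))\,dx = H(p) - H(q),$$

where the last equality uses the starred step. Hence $H(q) \le H(p)$: no distribution with covariance $\Sigma$ has greater entropy than the Gaussian.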

Attribution
Source: Link, Question Author: Tarrare, Answer Author: whuber
