Intuitive explanation of contribution to sum of two normally distributed random variables

If I have two independent normally distributed random variables $X$ and $Y$ with means $\mu_X$ and $\mu_Y$ and standard deviations $\sigma_X$ and $\sigma_Y$, and I discover that $X+Y=c$, then (assuming I have not made any errors) the conditional distributions of $X$ and $Y$ given $c$ are also normal, with means
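$$\mu_{X|c} = \mu_X + (c - \mu_X - \mu_Y)\frac{\sigma_X^2}{\sigma_X^2+\sigma_Y^2}, \qquad \mu_{Y|c} = \mu_Y + (c - \mu_X - \mu_Y)\frac{\sigma_Y^2}{\sigma_X^2+\sigma_Y^2}$$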
and standard deviation
$$\sigma_{X|c} = \sigma_{Y|c} = \sqrt{\frac{\sigma_X^2\,\sigma_Y^2}{\sigma_X^2+\sigma_Y^2}}.$$

It is no surprise that the conditional standard deviations are the same since, given $c$, if one variable goes up the other must come down by the same amount. It is interesting that the conditional standard deviation does not depend on $c$.

What I cannot get my head round are the conditional means, where they take a share of the excess $(c - \mu_X - \mu_Y)$ proportional to the original variances, not to the original standard deviations.

For example, if they have zero means, $\mu_X=\mu_Y=0$, and standard deviations $\sigma_X=3$ and $\sigma_Y=1$, then conditioned on $c=4$ we would have $E[X \mid c=4]=3.6$ and $E[Y \mid c=4]=0.4$, i.e. in the ratio 9:1, even though I would have intuitively thought that the ratio 3:1 would be more natural. Can anyone give an intuitive explanation for this?
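The numbers in this example can be checked with a few lines of Python. This is only a minimal sketch of the formulas for independent normals conditioned on their sum; the function name `conditional_given_sum` is made up for illustration, and the conditional standard deviation used is the standard result $\sqrt{\sigma_X^2\sigma_Y^2/(\sigma_X^2+\sigma_Y^2)}$ for this setup.

```python
import math

def conditional_given_sum(mu_x, mu_y, sigma_x, sigma_y, c):
    """Conditional means and common SD of independent normals X, Y given X + Y = c.

    Hypothetical helper for illustration; it just evaluates the closed-form
    expressions for the independent case.
    """
    var_x, var_y = sigma_x ** 2, sigma_y ** 2
    total_var = var_x + var_y
    excess = c - mu_x - mu_y                    # how far c is from the expected sum
    mean_x = mu_x + excess * var_x / total_var  # share proportional to Var(X)
    mean_y = mu_y + excess * var_y / total_var  # share proportional to Var(Y)
    sd = math.sqrt(var_x * var_y / total_var)   # same for X and Y, independent of c
    return mean_x, mean_y, sd

mean_x, mean_y, sd = conditional_given_sum(0.0, 0.0, 3.0, 1.0, 4.0)
print(mean_x, mean_y, round(sd, 3))
```

The two conditional means always add up to $c$, and the excess is split 9:1 here because the variances are 9 and 1.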

This was provoked by a Math.SE question


The question readily reduces to the case $\mu_X=\mu_Y=0$ by looking at $X-\mu_X$ and $Y-\mu_Y$.

Clearly the conditional distributions are Normal. Thus, the mean, median, and mode of each are coincident. The modes will occur at the coordinates of a local maximum of the bivariate PDF of $X$ and $Y$ constrained to the curve $g(x,y)=x+y=c$. This implies the contour of the bivariate PDF at this location and the constraint curve have parallel tangents. (This is the theory of Lagrange multipliers.) Because the equation of any contour is of the form $f(x,y)=x^2/(2\sigma_X^2)+y^2/(2\sigma_Y^2)=\rho$ for some constant $\rho$ (that is, all contours are ellipses), the gradients of $f$ and $g$ must be parallel, whence there exists $\lambda$ such that

$$\left(\frac{x}{\sigma_X^2}, \frac{y}{\sigma_Y^2}\right) = \nabla f(x,y) = \lambda\,\nabla g(x,y) = \lambda\,(1,1).$$

It follows immediately that the modes of the conditional distributions (and therefore also the means) are determined by the ratio of the variances, not of the SDs.
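Spelled out for the zero-mean case, the step is short. The following is a sketch that only combines the Lagrange condition $\nabla f = \lambda \nabla g$ with the constraint $x+y=c$:

$$\frac{x}{\sigma_X^2} = \frac{y}{\sigma_Y^2} = \lambda \quad\Longrightarrow\quad x = \lambda\,\sigma_X^2, \qquad y = \lambda\,\sigma_Y^2,$$

$$x + y = c \quad\Longrightarrow\quad \lambda = \frac{c}{\sigma_X^2+\sigma_Y^2}, \qquad x = \frac{c\,\sigma_X^2}{\sigma_X^2+\sigma_Y^2}, \qquad y = \frac{c\,\sigma_Y^2}{\sigma_X^2+\sigma_Y^2}.$$

With $\sigma_X=3$, $\sigma_Y=1$ and $c=4$ this gives $\lambda=0.4$, $x=3.6$ and $y=0.4$, the values in the question. The variances appear because the exponent of the density is quadratic in $x/\sigma_X$ and $y/\sigma_Y$, so its gradient carries $1/\sigma^2$, not $1/\sigma$.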

This analysis works for correlated X and Y as well and it applies to any linear constraints, not just the sum.
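For the correlated, general-constraint case, the same Lagrange argument gives $\Sigma^{-1}(z-\mu)=\lambda a$ for a constraint $a^\top z = c$, so the excess is shared in proportion to $\Sigma a$, that is, to the covariance of each variable with the constrained combination. Below is a minimal pure-Python sketch under that assumption; the function name `conditional_mean_given_linear_constraint` and the example covariance values are hypothetical, chosen only for illustration.

```python
def conditional_mean_given_linear_constraint(mu, cov, a, c):
    """Conditional mean of a multivariate normal Z given a . Z = c.

    mu  : list of means
    cov : covariance matrix as a list of rows
    a   : coefficients of the linear constraint
    c   : observed value of a . Z
    Hypothetical helper; evaluates mu + (Sigma a) * (c - a . mu) / (a . Sigma a).
    """
    n = len(mu)
    # Cov(Z_i, a . Z) for each component i, i.e. the vector Sigma a
    cov_a = [sum(cov[i][j] * a[j] for j in range(n)) for i in range(n)]
    # Var(a . Z) = a . Sigma a
    var_s = sum(a[i] * cov_a[i] for i in range(n))
    # Distance of the observed value from its expectation
    excess = c - sum(a[i] * mu[i] for i in range(n))
    return [mu[i] + cov_a[i] * excess / var_s for i in range(n)]

# Independent case from the question: shares proportional to the variances 9 and 1.
independent = conditional_mean_given_linear_constraint(
    [0.0, 0.0], [[9.0, 0.0], [0.0, 1.0]], [1.0, 1.0], 4.0)

# Assumed correlated case (correlation 0.5, so Cov(X, Y) = 1.5).
correlated = conditional_mean_given_linear_constraint(
    [0.0, 0.0], [[9.0, 1.5], [1.5, 1.0]], [1.0, 1.0], 4.0)

print(independent, correlated)
```

In the correlated example the shares are proportional to $9+1.5$ and $1+1.5$ rather than to 9 and 1, but the conditional means still sum to $c$.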

Source: Link, Question Author: Henry, Answer Author: whuber
