Cholesky versus eigendecomposition for drawing samples from a multivariate normal distribution

I would like to draw a sample $\mathbf{x} \sim N\left(\mathbf{0}, \mathbf{\Sigma} \right)$. Wikipedia suggests either using a Cholesky or Eigendecomposition, i.e.
$
\mathbf{\Sigma} = \mathbf{D}_1\mathbf{D}_1^T
$
or
$
\mathbf{\Sigma} = \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^T
$

And hence the sample can be drawn via:
$
\mathbf{x} = \mathbf{D}_1 \mathbf{v}
$
or
$
\mathbf{x} = \mathbf{Q}\sqrt{\mathbf{\Lambda}} \mathbf{v}
$
where
$
\mathbf{v} \sim N\left(\mathbf{0}, \mathbf{I} \right)
$

Wikipedia suggests that they are both equally good for generating samples, but the Cholesky method has the faster computation time. Is this true? Especially numerically when using a monte-carlo method, where the variances along the diagonals may differ by several orders of magnitude? Is there any formal analysis on this problem?

Answer

The problem was studied by by Straka et.al for the Unscented Kalman Filter which draws (deterministic) samples from a multivariate Normal distribution as part of the algorithm. With some luck, the results might be applicable to the monte-carlo problem.

The Cholesky Decomposition (CD) and the Eigen Decomposition (ED) – and for that matter the actual Matrix Square Root (MSR) are all ways in which a positive semi-definite matrix (PSD) can be broken down.

Consider the SVD of a PSD matrix, $P = USV^T$. Since P is PSD, this is actually the same as the ED with $P = USU^T$. Moreover, we can split the diagonal matrix by its square root: $P = U\sqrt{S}\sqrt{S}^TU^T$, noting that $\sqrt{S} = \sqrt{S}^T$.

We may now introduce an arbitrary orthogonal matrix $O$:

$P = U\sqrt{S}OO^T\sqrt{S}^TU^T = (U\sqrt{S}O)(U\sqrt{S}O)^T$.

The choice of $O$ actually affects the estimation performance, especially when there is strong off-diagonal elements of the covariance matrix.

The paper studied three choices of $O$:

  • $O = I$, which corresponds to the ED;
  • $O = Q$ from the QR decomposition of $U\sqrt{S} = QR$, which corresponds to the CD; and
  • $O = U^T$ which leads to a symmetric matrix (i.e. MSR)

From which the following conclusions were drawn in the paper after much analysis (quoting):

  • For a to-be-transformed random variable with uncorrelated elements all the three considered MDs provide identical sigma points and hence
    they make almost no difference on quality of the [Unscented Transform] approximation. In
    such a case the CD may be preferred for its low costs.

  • If the random variable contains correlated elements, the use of different [decompositions] may significantly affect quality of the [Unscented Transform] approximation
    of the mean or covariance matrix of the transformed random variable.
    The two cases above showed that the [ED] should be preferred.

  • If the elements of the to-be-transformed variable exhibit strong correlation so that the corresponding covariance matrix is nearly
    singular, another issue must be taken into account, which is numerical
    stability of the algorithm computing the MD. The SVD is much more
    numerically stable for nearly singular covariance matrices than the
    ChD.

Reference:

  • Straka, O.; Dunik, J.; Simandl, M. & Havlik, J. “Aspects and comparison of matrix decompositions in unscented Kalman filter”, American Control Conference (ACC), 2013, 2013, 3075-3080.

Attribution
Source : Link , Question Author : Damien , Answer Author : Damien

Leave a Comment