In order to simulate a normal distribution from a set of uniform variables, there are several techniques:
The Box-Muller algorithm, in which one samples two independent uniform variates on (0,1) and transforms them into two independent standard normal distributions via:
the CDF method, where one can equate the normal cdf (F(Z)) to a Uniform variate:
My question is: which is computationally more efficient? I would think
its the latter method – but most of the papers I read use Box-Muller –
The inverse of the normal CDF is know and given by:
From a purely probabilistic perspective, both approaches are correct and hence equivalent. From an algorithmic perspective, the comparison must consider both the precision and the computing cost.
Box-Muller relies on a uniform generator and costs about the same as this uniform generator. As mentioned in my comment, you can get away without sine or cosine calls, if not without the logarithm:
- generate U1,U2iid∼U(−1,1) until S=U21+U22≤1
- take Z=√−2log(S)/S and define X1=ZU1, X2=ZU2
The generic inversion algorithm requires the call to the inverse normal cdf, for instance
qnorm(runif(N)) in R, which may be more costly than the above and more importantly may fail in the tails in terms of precision, unless the quantile function is well-coded.
To follow on comments made by whuber, the comparison of
qnorm(runif(N))is at the advantage of the inverse cdf, both in execution time:
> system.time(qnorm(runif(10^8))) sutilisateur système écoulé 10.137 0.120 10.251 > system.time(rnorm(10^8)) utilisateur système écoulé 13.417 0.060 13.472` `
using the more accurate R benchmark
test replications elapsed relative user.self sys.self 3 box-muller 100 0.103 1.839 0.103 0 2 inverse 100 0.056 1.000 0.056 0 1 R rnorm 100 0.064 1.143 0.064 0
and in terms of fit in the tail:
Following a comment of Radford Neal on my blog, I want to point out that the default
rnorm in R makes use of the inversion method, hence that the above comparison is reflecting on the interface and not on the simulation method itself! To quote the R documentation on RNG:
‘normal.kind’ can be ‘"Kinderman-Ramage"’, ‘"Buggy Kinderman-Ramage"’ (not for ‘set.seed’), ‘"Ahrens-Dieter"’, ‘"Box-Muller"’, ‘"Inversion"’ (the default), or ‘"user-supplied"’. (For inversion, see the reference in ‘qnorm’.) The Kinderman-Ramage generator used in versions prior to 1.7.1 (now called ‘"Buggy"’) had several approximation errors and should only be used for reproduction of old results. The ‘"Box-Muller"’ generator is stateful as pairs of normals are generated and returned sequentially. The state is reset whenever it is selected (even if it is the current normal generator) and when ‘kind’ is changed.