Distribution of argmax of beta-distributed random variables

Let $x_i \sim \operatorname{Beta}(\alpha_i, \beta_i)$ for $i \in I$. Let $j = \operatorname{argmax}_{i \in I} x_i$ (ties broken arbitrarily). What is the distribution of $j$ in terms of $\alpha$ and $\beta$? Is there an efficient way to compute it, beyond pure sampling?


When the $x_i$ are independent for $1 \le i \le d$ with distribution functions $F_i$ and density functions $f_i$, respectively, the chance that $x_j$ is the largest is (by the very definition of the distribution functions)

$$\begin{aligned}
\Pr\left(x_j = \max(x_i,\, i \in I)\right)
&= \Pr\left(x_1 \le x_j,\, x_2 \le x_j,\, \ldots,\, x_d \le x_j\right) \\
&= E\left[F_1(x_j)F_2(x_j)\cdots F_{j-1}(x_j)\,F_{j+1}(x_j)\cdots F_d(x_j)\right] \\
&= \int_{\mathbb{R}} \left[F_1(x_j)\cdots F_{j-1}(x_j)\,F_{j+1}(x_j)\cdots F_d(x_j)\right] f_j(x_j)\,\mathrm{d}x_j.
\end{aligned}$$

Provided none of the $\alpha_i$ and $\beta_i$ are really tiny, this is straightforward to obtain by numerical integration, as shown in the R function `beta.argmax` below. (When tiny values are a possibility, fussier code will be needed, because the regions of highest density may have density values that overflow double-precision arithmetic. As a practical matter, "tiny" means closer to $0$ than to $1$.)
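Before presenting the general function, here is a minimal sketch checking the integral identity in the simplest case: with two i.i.d. $\operatorname{Beta}(2,3)$ variables, symmetry forces each to be the maximum with probability $1/2$, so the integral of $F_2(x)\,f_1(x)$ must come out to $0.5$. (The shape parameters here are arbitrary choices for illustration, not taken from the question.)

```r
# Sanity check of the integral identity for d = 2 with i.i.d. Beta(2, 3)
# variables: Pr(x_1 is the largest) = integral of F_2(x) * f_1(x) dx = 1/2
# by symmetry.
a <- 2; b <- 3
h <- function(x) pbeta(x, a, b) * dbeta(x, a, b)  # F_2(x) * f_1(x)
p1 <- integrate(h, 0, 1)$value
print(p1)  # very close to 0.5
```

Any pair of identical shape parameters works equally well for this check; the point is only that the numerical integral reproduces a value known exactly in advance.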

As an example of its use, I generated $d=8$ values for $\alpha_i$ and $\beta_i$.

d <- 8
alpha <- rexp(d) + 0.1
beta <- rexp(d) + 0.1

I then computed the probability distribution and double-checked it with a simulation of 100,000 iterations:

p <- beta.argmax(alpha, beta, stop.on.error=FALSE) # The calculation

x <- matrix(rbeta(d * 1e5, alpha, beta), nrow=d)   # The simulated x_j
p.hat <- tabulate(apply(x, 2, which.max), nbins=d) # Summary of the argmaxes

(signif(rbind(Calculated=p, Simulated=p.hat/sum(p.hat)), 3)) # Comparison
chisq.test(p.hat, p=p)                                       # Formal comparison

The output is

             [,1]   [,2]    [,3]  [,4]  [,5]   [,6]   [,7]  [,8]
Calculated 0.0247 0.0218 0.00230 0.124 0.451 0.0318 0.0341 0.311
Simulated  0.0245 0.0217 0.00225 0.125 0.451 0.0312 0.0346 0.311

  Chi-squared test for given probabilities

data:  p.hat
X-squared = 2.468, df = 7, p-value = 0.9295

The agreement between calculation and simulation, shown in the first array, is excellent, as confirmed by the chi-squared test that follows it.

I did other tests with $d$ as large as $200$, keeping all $\alpha_i$ and $\beta_i$ above $0.5$, and the results have been consistent with the calculations. For even larger values of $d$ the results worsen, indicating numerical problems. (I tested up to $d=500$.) These were cured (at some cost in computation time, which reached one minute) by tightening the error tolerances in the numerical integration.
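The `...` argument of `beta.argmax` below is forwarded to `integrate`, so one plausible way to tighten the tolerances is a call such as `beta.argmax(alpha, beta, rel.tol=1e-10, subdivisions=1000L)` — that calling pattern is my reading of the fix, not a quote from the answer. The effect of `rel.tol` can be seen on a single integral whose true value is known exactly: for $d$ i.i.d. variables each is the maximum with probability $1/d$, and the integrand is $F(x)^{d-1}f(x)$.

```r
# For d i.i.d. Beta(2, 3) variables, each is the max with probability 1/d;
# the integrand is F(x)^(d-1) * f(x). Compare default and tightened
# tolerances (the rel.tol/subdivisions arguments are standard options of
# integrate(); passing them through beta.argmax's '...' is an assumed usage).
d <- 10
h <- function(x) pbeta(x, 2, 3)^(d - 1) * dbeta(x, 2, 3)
loose <- integrate(h, 0, 1)$value
tight <- integrate(h, 0, 1, rel.tol = 1e-12, subdivisions = 1000L)$value
print(c(loose = loose, tight = tight))  # both near 1/10 = 0.1
```

For large $d$ the true integrand is a product of many CDF values and becomes sharply peaked, which is exactly the situation where the default tolerances of `integrate` start to struggle.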

Here is the code.

beta.argmax <- function(alpha, beta, ...) {
  lower <- min(qbeta(1e-9, alpha, beta))
  upper <- max(qbeta(1-1e-9, alpha, beta))
  p <- rep(NA_real_, length(alpha))
  for (i in seq_along(p)) {
    # Work on the log scale to avoid underflow before exponentiating:
    # log density of x_i plus the log of the product of the other CDFs.
    ff <- function(x) dbeta(x, alpha[i], beta[i], log=TRUE)
    f <- Vectorize(function(x) sum(pbeta(x, alpha[-i], beta[-i], log.p=TRUE)))
    h <- function(x) exp(ff(x) + f(x))
    p[i] <- integrate(h, lower, upper, ...)$value
  }
  cat(sum(p), "\n") # Optional check: see how close to 1.000000 the sum is
  p / sum(p)
}

Source: Link. Question author: user76284. Answer author: whuber.
