Formula for dropping dice (non-brute force)

First of all I’m not sure where this question should be posted. I’m asking if a statistics problem is NP-Complete and if not to solve it programmatically. I’m posting it here because the statistics problem is the center point.

I’m trying to find a better formula for solving a problem. The problem is: if I have 4d6 (4 ordinary 6 sided dice) and roll them all at once, remove a die with the lowest number (called “dropping”), then sum the remaining 3, what is the probability of each possible result? I know the answer is this:

Sum (Frequency): Probability
3   (1):         0.0007716049
4   (4):         0.0030864198
5   (10):        0.0077160494
6   (21):        0.0162037037
7   (38):        0.0293209877
8   (62):        0.0478395062
9   (91):        0.0702160494
10  (122):       0.0941358025
11  (148):       0.1141975309
12  (167):       0.1288580247
13  (172):       0.1327160494
14  (160):       0.1234567901
15  (131):       0.1010802469
16  (94):        0.0725308642
17  (54):        0.0416666667
18  (21):        0.0162037037

The average is 12.24 and standard deviation is 2.847.

I found the above answer by brute force and don’t know how or if there is a formula for it. I suspect this problem is NP-Complete and therefore can only be solved by brute force. It might be possible to get all probabilities of 3d6 (3 normal 6 sided dice) then skew each of them upward. This would be faster than brute force because I have a fast formula when all dice are kept.

I programmed the formula for keeping all dice in college. I had asked my statistics professor about it and he found this page, which he then explained to me. There is a big performance difference between this formula and brute force: 50d6 took 20 seconds but 8d6 drop lowest crashes after 40 seconds (chrome runs out of memory).

Is this problem NP-Complete? If yes please provide a proof, if no please provide a non-brute force formula to solve it.

Note that I don’t know much about NP-Complete so I might be thinking of NP, NP-Hard, or something else. The proof for NP-Completeness is useless to me the only reason why I ask for it is to prevent people from guessing. And please bare with me as it’s been a long time since I worked on this: I don’t remember statistics as well as I might need to solve this.

Ideally I’m looking for a more generic formula for X number of dice with Y sides when N of them are dropped but am starting with something much more simple.

Edit:

I would also prefer the formula to output frequencies but it is acceptable to only output probabilities.

For those interested I have programmed whuber’s answer in JavaScript on my GitHub (in this commit only the tests actually use the functions defined).

Answer

Solution

Let there be n=4 dice each giving equal chances to the outcomes 1,2,,d=6. Let K be the minimum of the values when all n dice are independently thrown.

Consider the distribution of the sum of all n values conditional on K. Let X be this sum. The generating function for the number of ways to form any given value of X, given that the minimum is at least k, is

f(n,d,k)(x)=xk+xk+1++xd=xk1xdk+11x.

Since the dice are independent, the generating function for the number of ways to form values of X where all n dice show values of k or greater is

f(n,d,k)(x)n=xkn(1xdk+11x)n.

This generating function includes terms for the events where K exceeds k, so we need to subtract them off. Therefore the generating function for the number of ways to form values of X, given K=k, is

f(n,d,k)(x)nf(n,d,k+1)(x)n.

Noting that the sum of the n1 highest values is the sum of all values minus the smallest, equal to XK. The generating function therefore needs to be divided by k. It becomes a probability generating function upon multiplying by the common chance of any combination of dice, (1/d)n:

dndk=1xk(f(n,d,k)(x)nf(n,d,k+1)(x)n).

Since all the polynomial products and powers can be computed in O(nlogn) operations (they are convolutions and therefore can be carried out with the discrete Fast Fourier Transform), the total computational effort is O(knlogn). In particular, it is a polynomial time algorithm.


Example

Let’s work through the example in the question with n=4 and d=6.

Formula (1) for the PGF of X conditional on Kk gives

f(4,6,1)(x)=x+x2+x3+x4+x5+x6f(4,6,2)(x)=x2+x3+x4+x5+x6f(4,6,5)(x)=x5+x6f(4,6,6)(x)=x6f(4,6,7)(x)=0.

Raising them to the n=4 power as in formula (2) produces

f(4,6,1)(x)4=x4+4x5+10x6++4x23+x24f(4,6,2)(x)4=x8+4x9+10x10++4x23+x24f(4,6,5)(x)4=x20+4x21+6x22+4x23+x24f(4,6,6)(x)4=x24f(4,6,7)(x)4=0

Their successive differences in formula (3) are

f(4,6,1)(x)4f(4,6,2)(x)4=x4+4x5+10x6++12x18+4x19f(4,6,2)(x)4f(4,6,3)(x)4=x8+4x9+10x10++4x20f(4,6,5)(x)4f(4,6,6)(x)4=x20+4x21+6x22+4x23f(4,6,6)(x)4f(4,6,7)(x)4=x24.

The resulting sum in formula (4) is

64(x3+4x4+10x5+21x6+38x7+62x8+91x9+122x10+148x11+167x12+172x13+160x14+131x15+94x16+54x17+21x18).

For example, the chance that the top three dice sum to 14 is the coefficient of x14, equal to

64×160=10/81=0.123456790123456.

It is in perfect agreement with the probabilities quoted in the question.

By the way, the mean (as calculated from this result) is 15869/129612.244598765 and the standard deviation is 13612487/16796162.8468444.

A similar (unoptimized) calculation for n=400 dice instead of n=4 took less than a half a second, supporting the contention that this is not a computationally demanding algorithm. Here is a plot of the main part of the distribution:

Figure

Since the minimum K is highly likely to equal 1 and the sum X will be extremely close to having a Normal(400×7/2,400×35/12) distribution (whose mean is 1400 and standard deviation is approximately 34.1565), the mean must be extremely close to 14001=1399 and the standard deviation extremely close to 34.16. This nicely describes the plot, indicating it is likely correct. In fact, the exact calculation gives a mean of around 2.13×1032 greater than 1399 and a standard deviation around 1.24×1031 less than 400×35/12.

Attribution
Source : Link , Question Author : SkySpiral7 , Answer Author : whuber

Leave a Comment