How can I model flips until N successes?

You and I decide to play a game where we take turns flipping a coin. The first player to flip 10 heads in total wins the game. Naturally, there is an argument about who should go first.

Simulations of this game show that the player to flips first wins 6% more than the player who flips second (the first player wins approx 53% of the time). I’m interested in modelling this analytically.

This isn’t a binomial random variable, as there are no fixed number of trials (flip until someone gets 10 heads). How can I model this? Is it the negative binomial distribution?

So as to be able to recreate my results, here is my python code:

import numpy as np
from numba import jit

def sim(N):

    P1_wins = 0
    P2_wins = 0

    for i in range(N):

        P1_heads = 0
        P2_heads = 0
        while True:

            P1_heads += np.random.randint(0,2)

            if P1_heads == 10:

            P2_heads+= np.random.randint(0,2)
            if P2_heads==10:
    return P1_wins/N, P2_wins/N

a,b = sim(1000000)


The distribution of the number of tails before achieving 10 heads is Negative Binomial with parameters 10 and 1/2. Let f be the probability function and G the survival function: for each n0, f(n) is the player’s chance of n tails before 10 heads and G(n) is the player’s chance of n or more tails before 10 heads.

Because the players roll independently, the chance the first player wins with rolling exactly n tails is obtained by multiplying that chance by the chance the second player rolls n or more tails, equal to f(n)G(n).

Summing over all possible n gives the first player’s winning chances as


That is about 3% more than half the time.

In general, replacing 10 by any positive integer m, the answer can be given in terms of a Hypergeometric function: it is equal to


When using a biased coin with a chance p of heads, this generalizes to


Here is an R simulation of a million such games. It reports an estimate of 0.5325. A binomial hypothesis test to compare it to the theoretical result has a Z-score of 0.843, which is an insignificant difference.

n.sim <- 1e6
xy <- matrix(rnbinom(2*n.sim, 10, 1/2), nrow=2)
p <- mean(xy[1,] <= xy[2,])
cat("Estimate:", signif(p, 4), 
    "Z-score:", signif((p - 0.532909774) / sqrt(p*(1-p)) * sqrt(n.sim), 3))

Source : Link , Question Author : Demetri Pananos , Answer Author : whuber

Leave a Comment