# How can I model flips until N successes?

You and I decide to play a game where we take turns flipping a coin. The first player to flip 10 heads in total wins the game. Naturally, there is an argument about who should go first.

Simulations of this game show that the player to flips first wins 6% more than the player who flips second (the first player wins approx 53% of the time). I’m interested in modelling this analytically.

This isn’t a binomial random variable, as there are no fixed number of trials (flip until someone gets 10 heads). How can I model this? Is it the negative binomial distribution?

So as to be able to recreate my results, here is my python code:

import numpy as np
from numba import jit

@jit
def sim(N):

P1_wins = 0
P2_wins = 0

for i in range(N):

while True:

P1_wins+=1
break

P2_wins+=1
break
return P1_wins/N, P2_wins/N

a,b = sim(1000000)


The distribution of the number of tails before achieving $10$ heads is Negative Binomial with parameters $10$ and $1/2$. Let $f$ be the probability function and $G$ the survival function: for each $n\ge 0$, $f(n)$ is the player’s chance of $n$ tails before $10$ heads and $G(n)$ is the player’s chance of $n$ or more tails before $10$ heads.

Because the players roll independently, the chance the first player wins with rolling exactly $n$ tails is obtained by multiplying that chance by the chance the second player rolls $n$ or more tails, equal to $f(n)G(n)$.

Summing over all possible $n$ gives the first player’s winning chances as

That is about $3\%$ more than half the time.

In general, replacing $10$ by any positive integer $m$, the answer can be given in terms of a Hypergeometric function: it is equal to

When using a biased coin with a chance $p$ of heads, this generalizes to

Here is an R simulation of a million such games. It reports an estimate of $0.5325$. A binomial hypothesis test to compare it to the theoretical result has a Z-score of $-0.843$, which is an insignificant difference.

n.sim <- 1e6
set.seed(17)
xy <- matrix(rnbinom(2*n.sim, 10, 1/2), nrow=2)
p <- mean(xy[1,] <= xy[2,])
cat("Estimate:", signif(p, 4),
"Z-score:", signif((p - 0.532909774) / sqrt(p*(1-p)) * sqrt(n.sim), 3))