# Probability of a run of k successes in a sequence of n Bernoulli trials

I’m trying to find the probability of getting 8 trials in a row correct in a block of 25 trials, you have 8 total blocks (of 25 trials) to get 8 trials correct in a row. The probability of getting any trial correct based on guessing is 1/3, after getting 8 in a row correct the blocks will end (so getting more than 8 in a row correct is technically not possible). How would I go about finding the probability of this occurring? I’ve been thinking along the lines of using (1/3)^8 as the probability of getting 8 in a row correct, there are 17 possible chances to get 8 in a row in a block of 25 trials, if I multiply 17 possibilities * 8 blocks I get 136, would 1-(1-(1/3)^8)^136 give me the likelihood of getting 8 in a row correct in this situation or am I missing something fundamental here?

By keeping track of things you can get an exact formula.

Let $p=1/3$ be the probability of success and $k=8$ be the number of successes in a row you want to count. These are fixed for the problem. Variable values are $m$, the number of trials left in the block; and $j$, the number of successive successes already observed. Let the chance of eventually achieving $k$ successes in a row before $m$ trials are exhausted be written $f_{p,k}(j,m)$. We seek $f_{1/3,8}(0,25)$.

Suppose we have just seen our $j^\text{th}$ success in a row with $m\gt0$ trials to go. The next trial is either a success, with probability $p$–in which case $j$ is increased to $j+1$–; or else it is a failure, with probability $1-p$–in which case $j$ is reset to $0$. In either case, $m$ decreases by $1$. Whence

As starting conditions we have the obvious results $f_{p,k}(k,m)=1$ for $m \ge 0$ (i.e., we have already seen $k$ in a row) and $f_{p,k}(j,m)=0$ for $k-j \gt m$ (i.e., there aren’t enough trials left to get $k$ in a row). It is now fast and straightforward (using dynamic programming or, because this problem’s parameters are so small, recursion) to compute

When $p=1/3$ this yields $80897 / 43046721 \approx 0.0018793$.

Relatively fast R code to simulate this is

hits8 <- function() {
x <- rbinom(26, 1, 1/3)                # 25 Binomial trials
x[1] <- 0                              # ... and a 0 to get started with diff
if(sum(x) >= 8) {                      # Are there at least 8 successes?
max(diff(cumsum(x), lag=8)) >= 8   # Are there 8 successes in a row anywhere?
} else {
FALSE                              # Not enough successes for 8 in a row
}
}
set.seed(17)
mean(replicate(10^5, hits8()))


After 3 seconds of calculation, the output is $0.00213$. Although this looks high, it’s only 1.7 standard errors off. I ran another $10^6$ iterations, yielding $0.001867$: only $0.3$ standard errors less than expected. (As a double-check, because an earlier version of this code had a subtle bug, I also ran 400,000 iterations in Mathematica, obtaining an estimate of $0.0018475$.)

This result is less than one-tenth the estimate of $1-(1-(1/3)^8)^{136} \approx 0.0205$ in the question. But perhaps I have not fully understood it: another interpretation of “you have 8 total blocks … to get 8 trials correct in a row” is that the answer being sought equals $1 - (1 - f_{1/3,8}(0,25))^8) = 0.0149358...$.