I’m trying to find the probability of getting 8 trials in a row correct in a block of 25 trials, you have 8 total blocks (of 25 trials) to get 8 trials correct in a row. The probability of getting any trial correct based on guessing is 1/3, after getting 8 in a row correct the blocks will end (so getting more than 8 in a row correct is technically not possible). How would I go about finding the probability of this occurring? I’ve been thinking along the lines of using (1/3)^8 as the probability of getting 8 in a row correct, there are 17 possible chances to get 8 in a row in a block of 25 trials, if I multiply 17 possibilities * 8 blocks I get 136, would 1-(1-(1/3)^8)^136 give me the likelihood of getting 8 in a row correct in this situation or am I missing something fundamental here?
Answer
By keeping track of things you can get an exact formula.
Let p=1/3 be the probability of success and k=8 be the number of successes in a row you want to count. These are fixed for the problem. Variable values are m, the number of trials left in the block; and j, the number of successive successes already observed. Let the chance of eventually achieving k successes in a row before m trials are exhausted be written fp,k(j,m). We seek f1/3,8(0,25).
Suppose we have just seen our jth success in a row with m>0 trials to go. The next trial is either a success, with probability p–in which case j is increased to j+1–; or else it is a failure, with probability 1−p–in which case j is reset to 0. In either case, m decreases by 1. Whence
fp,k(j,m)=pfp,k(j+1,m−1)+(1−p)fp,k(0,m−1).
As starting conditions we have the obvious results fp,k(k,m)=1 for m≥0 (i.e., we have already seen k in a row) and fp,k(j,m)=0 for k−j>m (i.e., there aren’t enough trials left to get k in a row). It is now fast and straightforward (using dynamic programming or, because this problem’s parameters are so small, recursion) to compute
fp,8(0,25)=18p8−17p9−45p16+81p17−36p18.
When p=1/3 this yields 80897/43046721≈0.0018793.
Relatively fast R
code to simulate this is
hits8 <- function() {
x <- rbinom(26, 1, 1/3) # 25 Binomial trials
x[1] <- 0 # ... and a 0 to get started with `diff`
if(sum(x) >= 8) { # Are there at least 8 successes?
max(diff(cumsum(x), lag=8)) >= 8 # Are there 8 successes in a row anywhere?
} else {
FALSE # Not enough successes for 8 in a row
}
}
set.seed(17)
mean(replicate(10^5, hits8()))
After 3 seconds of calculation, the output is 0.00213. Although this looks high, it’s only 1.7 standard errors off. I ran another 106 iterations, yielding 0.001867: only 0.3 standard errors less than expected. (As a double-check, because an earlier version of this code had a subtle bug, I also ran 400,000 iterations in Mathematica, obtaining an estimate of 0.0018475.)
This result is less than one-tenth the estimate of 1−(1−(1/3)8)136≈0.0205 in the question. But perhaps I have not fully understood it: another interpretation of “you have 8 total blocks … to get 8 trials correct in a row” is that the answer being sought equals 1−(1−f1/3,8(0,25))8)=0.0149358….
Attribution
Source : Link , Question Author : AcidNynex , Answer Author : whuber