Estimating number of balls by successively selecting a ball and marking it

Lets say I have N balls in a bag. On my first draw, I mark the ball and replace it in the bag. On my second draw, if I pick up a marked ball I return it to the bag. If, however I pick up a non-marked ball then I mark it and return it to the bag. I continue this for any number of draws. What is the expected number of balls in the bag given a number of draws and the marked/unmarked history of draws?


Here is an idea. Let I be a finite subset of the natural numbers which will serve as the possible values for N. Suppose we have a prior distribution over I. Fix a non-random positive integer M. Let k be the random variable denoting the number of times we mark a ball in M draws from the bag. The goal is to find E(N|k). This will be function of M,k and the prior.

By Bayes rule we have


Computing P(k|N=j) is a known calculation which is a variant on the coupon collectors problem. P(k|N=j) is the probability that we observe k distinct coupons in M draws when there are j coupons in total. See here for an argument for

P(k|N=j) = \frac{\binom{j}{k}k!S(M,k)}{j^M}

where S denotes a stirling number of the second kind. We can then calculate

E(N|k) = \sum_{j \in \mathcal{I}}jP(N=j|k)

Below are some calculations for various k and M. In each case we use a uniform prior on [k,10k]

M & k & E(N)\\\hline
10 & 5 & 7.99 \\
15 & 5 & 5.60 \\
15 & 10 & 23.69\\
30 & 15 & 20.00\\
30 & 20 & 39.53 \\

Source : Link , Question Author : ashemon , Answer Author : user35546

Leave a Comment