I have a problem for which I believe I should use the hypergeometric distribution, but I can’t figure out how to do it in R.

Say I have a bag of marbles with known number (N) of marbles, but the number of successes (white marbles) in the bag (K) is unknown. I want to infer about K.

Given a sample from this population, where I see n trials and k successes, how do I infer something about the population K?

Ideally, I’d like to construct a prior distribution around K and then use the sample to update it and get a Bayesian posterior credibility interval around K (for a given credibility score α), but I am struggling to complete this practically. I have read that the conjugate prior for the hypergeometric is the beta-binomial. I thought maybe there would be an R function or package that could take the prior parameters, and then update with a sample to give me a credibility interval, but haven’t been able to find it.

If the Bayesian setting is difficult, perhaps a confidence interval would suffice… Can anyone either point me to some R functions or tutorials, or some resources that could help? Thanks.

EDIT: To add an example, I can do this in the case of the binomial distribution, to infer p, given a sample. I can construct a credibility interval with the

`binom`

package`k= 15 n= 25 library(binom) binom.bayes(k, n, conf.level = .95, tol=.005, type="central") # method x n shape1 shape2 mean lower upper sig # bayes 15 25 15.5 10.5 0.5961538 0.4057793 0.7725105 0.05`

To add a prior, because of the way updates work with the beta-binomial, I can just add counts to the k and n parameters according to the a and b parameters of the prior beta distribution.

In the beta-binomial example, N is infinite, and I’m inferring p. What I want to do is take this exact situation and just extend it to the case where N is finite (and known), and infer K (which would be equivalent to inferring p). This changes the binomial to the hypergeometric.

**Answer**

You can solve your problem using the Maximum Likelihood method rather than the Bayesian method. Just calculate the hypergeometric probability of finding k for a large number of values of K using your known values of k, N and n. The value of K that provides the maximum for this probability is the expected value of K.

**Attribution***Source : Link , Question Author : nsheff , Answer Author : Phil*