# Can I estimate the frequency of an event based on random samplings of its occurrence?

This question is just for fun, so if it isn’t fun then please feel free to ignore it. I already get a lot of help from this site so I don’t want to bite the hand that feeds me. It’s based on a real-life example and it’s just something I’ve wondered about a lot.

I visit my local dojo to train on an essentially random basis Monday-Friday. Let’s assume I visit twice a week. This means I visit exactly twice, every week, with only the two days varying. There is one individual who is nearly always there whenever I am there. If he visits on the same day as me then I will see him. Let’s assume he’s there 90% of the time when I’m there. I want to know two things:

1) how often he trains

2) whether he comes on a random basis or on set days of the week.

I’m guessing perhaps we have to assume one to guess the other? I’ve really got nowhere with this at all. I just think about it in the warm-up every week and am baffled anew. Even if somebody gave me a way in to think about the problem I would be most grateful.

Cheers!

The model is this: represent this individual’s attendance as a sequence of indicator (0/1) variables $(q_i)$, $i=1, 2, \ldots$. You randomly observe a two-element subset out of each weekly block $(q_{5k+1}, q_{5k+2}, \ldots, q_{5k+5})$. (This is a form of stratified sampling, with each week as a stratum.)
1. How often does he train? You want to estimate the mean weekly total of the $q_i$. Your observations have a mean of 0.9. Suppose they were collected over $w$ weeks, so there are $2w$ observations. Each day has chance $\pi_i = 2/5$ of being observed, so the Horvitz-Thompson estimator of the total number of the individual’s visits is $\sum \frac{q_i}{\pi_i} = \frac{5}{2} \sum q_i = \frac{5}{2} (2w)(0.9) = 4.5w$, where the sum is over your actual observations. That is, you should estimate he trains 4.5 days per week. See the reference for how to compute the standard error of this estimate. As an extremely good approximation you can use the usual (Binomial) formulas.
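
As a rough illustration of the arithmetic, here is a minimal Python sketch. The function names, the 10-week example log, and the choice to get a standard error by scaling the Binomial standard error of the observed proportion by 5 are my own assumptions for illustration; the Binomial version ignores any finite-population correction.

```python
import math

def ht_days_per_week(observations, days_per_week=5, visits_per_week=2):
    """Horvitz-Thompson estimate of his training days per week.

    observations: list of 0/1 flags, one per visit of yours
    (1 = he was there). Assumes exactly `visits_per_week` visits each week.
    """
    pi = visits_per_week / days_per_week          # chance a given day is observed
    weeks = len(observations) / visits_per_week   # number of weeks sampled
    total = sum(q / pi for q in observations)     # HT estimate of his total visits
    return total / weeks

def binomial_se_days_per_week(observations, days_per_week=5):
    """Approximate standard error (assumption: plain Binomial formula)."""
    n = len(observations)
    p = sum(observations) / n
    return days_per_week * math.sqrt(p * (1 - p) / n)

# Hypothetical log: 10 weeks, 20 visits, he was present on 18 of them.
log = [1] * 18 + [0] * 2
estimate = ht_days_per_week(log)
se = binomial_se_days_per_week(log)
print(f"estimate: {estimate:.2f} days/week, approx. SE: {se:.2f}")
```

With only 20 observations the approximate standard error is about a third of a day, so the estimate of 4.5 is still fairly loose at that sample size.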