Confidence intervals for empirical CDF

I have 100 data points from a random process. How would I go about placing a confidence interval around the estimate of $\Pr(X>x)$? The distribution function is unknown and positively skewed. My first inclination would be to use a bootstrap based on the material I have read for this class, but is there some other way to do this?

Answer

Yes, there are other types of confidence intervals (CIs). One of the most popular is based on the Dvoretzky–Kiefer–Wolfowitz (DKW) inequality, which states that, for every $\epsilon>0$,

$$P\left[\sup_{x}\vert \hat{F}_n(x)-F(x)\vert>\epsilon\right]\leq 2\exp(-2n\epsilon^2).$$

Then, if you want a band with confidence level $1-\alpha$, set $\alpha=2\exp(-2n\epsilon^2)$ and solve for $\epsilon$, which gives $\epsilon = \sqrt{\dfrac{1}{2n}\log\left(\dfrac{2}{\alpha}\right)}$. Consequently, a confidence band for $F(x)$ is given by $L(x)=\max\{\hat{F}_n(x)-\epsilon,0\}$ and $U(x)=\min\{\hat{F}_n(x)+\epsilon,1\}$. Because the inequality bounds the supremum over $x$, the band covers $F$ at all $x$ simultaneously with probability at least $1-\alpha$, so it is conservative for any single $x$. You may want to work out the details and adapt this to $P[X>x]=1-F(x)$ (since you tagged this as self-study).
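The band described above can be sketched in Python as follows. This is a minimal illustration, not part of the original answer: it assumes NumPy is available, the function names `dkw_band` and `survival_band` are made up for this example, and the lognormal sample is simulated only to stand in for 100 positively skewed observations.

```python
import numpy as np


def dkw_band(sample, alpha=0.05):
    """DKW band for F, evaluated at the sorted sample points.

    Returns (x, ecdf, lower, upper, eps); hypothetical helper for illustration.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # Empirical CDF at each sorted point; side="right" handles tied values.
    ecdf = np.searchsorted(x, x, side="right") / n
    # Half-width from solving alpha = 2 * exp(-2 * n * eps**2) for eps.
    eps = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
    lower = np.clip(ecdf - eps, 0.0, 1.0)
    upper = np.clip(ecdf + eps, 0.0, 1.0)
    return x, ecdf, lower, upper, eps


def survival_band(sample, alpha=0.05):
    """Band for P[X > x] = 1 - F(x): the bounds for F swap roles."""
    x, ecdf, lower, upper, _ = dkw_band(sample, alpha)
    return x, 1.0 - ecdf, 1.0 - upper, 1.0 - lower


# Simulated, positively skewed data (an assumption standing in for real data).
rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=1.0, size=100)

x, surv, surv_lo, surv_hi = survival_band(data, alpha=0.05)
eps = dkw_band(data, alpha=0.05)[4]
print(f"half-width eps for n=100, alpha=0.05: {eps:.4f}")
```

With $n=100$ and $\alpha=0.05$ the half-width is roughly $0.136$, so the band is fairly wide; it narrows at rate $1/\sqrt{n}$.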

This presentation provides other details that might be of interest.

Attribution
Source: Link, Question Author: Eric Brady, Answer Author: Rodrigo Zepeda
