Meaning of completeness of a statistic? [duplicate]

From Wikipedia:

The statistic s is said to be complete for the distribution of X if for every measurable function g (which must be independent of parameter θ) the following implication holds:

\mathbb{E}_\theta[g(s(X))] = 0 \text{ for all } \theta \quad \text{implies that} \quad P_\theta(g(s(X)) = 0) = 1 \text{ for all } \theta.

The statistic s is said to be boundedly complete if the implication holds for all bounded functions g.

I read and agree with Xi’an and phaneron that a complete statistic means that “there can only be one unbiased estimator based on it”.
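The way I justify that reading to myself (a sketch, not a quote from those answers): suppose g_1(s(X)) and g_2(s(X)) are both unbiased for the same quantity \tau(\theta). Then

\begin{align}
E_\theta[g_1(s(X)) - g_2(s(X))] = \tau(\theta) - \tau(\theta) = 0 \quad \text{for all } \theta,
\end{align}

and completeness applied to g = g_1 - g_2 forces P_\theta(g_1(s(X)) = g_2(s(X))) = 1 for all \theta, i.e. the two estimators agree almost surely.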

  1. But I don’t understand what Wikipedia says at the beginning of the same article:

    In essence, it (completeness is a property of a statistic) is a condition which ensures that the parameters of the probability distribution representing the model can all be estimated on the basis of the statistic: it ensures that the distributions corresponding to different values of the parameters are distinct.

    • In what sense (and why) does completeness ensure that “the distributions corresponding to different values of the parameters are distinct”? Is “the distributions” here referring to the distributions of the complete statistic?

    • In what sense (and why) does completeness ensure that “the parameters of the probability distribution representing the model can all be estimated on the basis of the statistic”?

  2. [Optional: What does “bounded completeness” mean, compared to completeness?]

Answer

This is a very good question and one I’ve struggled with for quite some time. Here’s how I’ve decided to think about it:

Take the contrapositive of the definition as stated in Wikipedia (which doesn’t change the logical meaning at all):

\begin{align}
{\rm If}\quad &\neg\,\big(\forall \theta:\ P_\theta(g(T(x))=0)=1\big) \\
{\rm then}\quad &\neg\,\big(\forall \theta:\ E_\theta[g(T(x))] = 0\big)
\end{align}

In other words, if there is a parameter value at which g(T(x)) is not almost surely 0, then there is a parameter value at which the expected value of g(T(x)) is not 0.

Hmm. What does that even mean?

Let’s ask what happens when T(x) is NOT complete…

A statistic T(x) that is NOT complete admits some function g such that g(T(x)) is not almost surely 0 for at least one parameter value, and yet its expected value is 0 for all parameter values (including that one).

In other words, there are values of \theta for which g(T(x)) has a non-trivial distribution around 0 (it has some random variation in it), and yet the expected value of g(T(x)) is nonetheless always 0; it doesn’t budge, no matter how \theta changes.
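A standard toy illustration of this failure (my own example, not from the original answer): take X_1, X_2 i.i.d. N(\theta, 1), the statistic T(x) = X_1 - X_2, and simply g(t) = t. Then

\begin{align}
E_\theta[g(T(x))] &= E_\theta[X_1 - X_2] = \theta - \theta = 0 \quad \text{for every } \theta, \\
P_\theta(g(T(x)) = 0) &= P_\theta(X_1 = X_2) = 0 \ne 1 \quad \text{for every } \theta,
\end{align}

so g(T(x)) has genuine spread around 0 (its variance is 2) at every \theta, while its mean never moves; T is therefore not complete.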

A complete statistic, on the other hand, will eventually budge in its expected value: if g(T(x)) is non-trivially distributed and centered at 0 for some \theta, its mean cannot stay at 0 for every \theta.

Put another way, if we find a function g(\cdot) whose expected value is zero at some \theta value (say \theta_0) and which has a non-trivial distribution given that value of \theta, then there must be some other value of \theta (say \theta_1 \ne \theta_0) at which g(T(x)) has a different expectation.
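For contrast, a textbook example of a statistic whose expectation must budge (again my sketch, not part of the original answer): let X_1, \dots, X_n be i.i.d. Bernoulli(\theta) with 0 < \theta < 1 and T(x) = \sum_i X_i. For any g,

\begin{align}
E_\theta[g(T)] = \sum_{t=0}^{n} g(t)\binom{n}{t}\theta^{t}(1-\theta)^{n-t}
= (1-\theta)^{n}\sum_{t=0}^{n} g(t)\binom{n}{t}\left(\frac{\theta}{1-\theta}\right)^{t}.
\end{align}

If this is 0 for every \theta \in (0,1), the polynomial in \theta/(1-\theta) must have all coefficients equal to zero, so g(t) = 0 for t = 0, \dots, n. No non-trivial g can keep its mean pinned at 0 across all \theta, which is exactly completeness.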

This means we can actually use the statistic for hypothesis testing and informative estimation under an assumed distribution for our data: we want to be able to center it around a hypothesized value of \theta so that it has expectation 0 at that hypothesized value but not at other values of \theta. If the statistic is not complete, we may not be able to do this; we may be unable to reject any hypothesized value of \theta, and then we cannot build confidence intervals or do statistical estimation.
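A small simulation sketch of the incomplete case above (my own code; the statistic T = X_1 - X_2 and the N(\theta, 1) model are just the toy example from earlier, not anything from the original answer). It shows that the mean of g(T) sits at 0 no matter which \theta generated the data, so observing g(T) gives us nothing with which to favour one \theta over another, even though g(T) is plainly not degenerate at 0.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims = 100_000

for theta in [-3.0, 0.0, 2.5, 10.0]:
    # Draw pairs (X1, X2) i.i.d. N(theta, 1) and form the incomplete statistic T = X1 - X2.
    x1 = rng.normal(theta, 1.0, size=n_sims)
    x2 = rng.normal(theta, 1.0, size=n_sims)
    g_of_t = x1 - x2  # g is the identity here

    # The mean is ~0 for every theta (so g(T) cannot discriminate between theta values),
    # yet the standard deviation is ~sqrt(2), so g(T) is clearly not almost surely 0.
    print(f"theta={theta:5.1f}  mean(g(T))={g_of_t.mean():+.3f}  sd(g(T))={g_of_t.std():.3f}")
```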

Attribution
Source: Link, Question Author: Tim, Answer Author: Semoi
