I’ve just started studying statistics and I can’t get an intuitive understanding of sufficiency. To be more precise, I can’t see how to show that the following two statements are equivalent:
Roughly, given a set X of independent identically distributed data conditioned on an unknown parameter θ, a sufficient statistic is a function T(X) whose value contains all the information needed to compute any estimate of the parameter.
A statistic T(X) is sufficient for underlying parameter θ precisely if the conditional probability distribution of the data X, given the statistic T(X), does not depend on the parameter θ.
(I’ve taken the quotes from the Wikipedia article “Sufficient statistic”.)
I understand the second statement, and I can use the factorization theorem to show whether a given statistic is sufficient. What I can’t understand is why a statistic with that property also “contains all the information needed to compute any estimate of the parameter”. I am not looking for a formal proof (although one would help refine my understanding); I’d like an intuitive explanation of why the two statements are equivalent.
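The second statement can be checked exactly in a small case. The following is a minimal sketch, assuming n = 3 i.i.d. Bernoulli(θ) observations and T(X) equal to their sum; the function name `conditional_given_sum` and the two values of θ are illustrative choices, not taken from the quoted source. It enumerates every possible sample and computes P(X = x | T = t) with exact fractions.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def conditional_given_sum(theta, n=3):
    """Exact P(X = x | T = sum(x)) for n i.i.d. Bernoulli(theta) draws."""
    joint = {}                         # P(X = x) for every binary sample x
    marginal_t = defaultdict(Fraction)  # P(T = t), accumulated over samples
    for x in product((0, 1), repeat=n):
        t = sum(x)
        p = theta ** t * (1 - theta) ** (n - t)
        joint[x] = p
        marginal_t[t] += p
    # Conditioning on T: divide each joint probability by P(T = sum(x)).
    return {x: p / marginal_t[sum(x)] for x, p in joint.items()}


# Two arbitrary (assumed) parameter values.
cond_a = conditional_given_sum(Fraction(1, 5))
cond_b = conditional_given_sum(Fraction(7, 10))

print(cond_a == cond_b)      # True: the conditional law is the same for both
print(cond_a[(1, 0, 0)])     # 1/3: uniform over the 3 samples with sum 1
```

Once T is known, the remaining randomness in X is just a uniform choice among the arrangements with that sum, and that choice carries no trace of θ.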
To recap, my questions are: why are the two statements equivalent? Could someone provide an intuitive explanation for their equivalence?
Following the comments of @whuber and @Kamster, I think I now have a better understanding. When we say that a sufficient statistic contains all the information needed to compute any estimate of the parameter, what we actually mean is that it is enough to compute the maximum likelihood estimator (which, when it is unique, is a function of every sufficient statistic).
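To spell out why the maximum likelihood estimator depends on the data only through a sufficient statistic, here is a short sketch based on the factorization theorem. The symbols $f$, $g$ and $h$ are notation assumed here for the density and its two factors, and the argument assumes $h(x) > 0$ for the observed sample. If $T$ is sufficient, the density factors as

$$
f(x \mid \theta) = h(x)\, g\bigl(T(x), \theta\bigr),
$$

where $h$ does not involve $\theta$. Maximizing over $\theta$ then gives

$$
\hat{\theta}(x) = \arg\max_{\theta} \, h(x)\, g\bigl(T(x), \theta\bigr) = \arg\max_{\theta} \, g\bigl(T(x), \theta\bigr),
$$

so two samples with the same value of $T$ have the same set of likelihood maximizers.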
Since I am answering my own question and am not 100% sure of the answer, I will not mark it as correct until I get some feedback. Please add a comment, and down-vote if you think I am wrong, imprecise, etc.
(Let me know if this is not compatible with SE etiquette. This is my first question, so I beg your clemency if I am violating any rule.)