While studying Maximum Likelihood Estimation, I learned that to do inference we need to know the variance of the estimator. To find the variance, I need the Cramér-Rao Lower Bound, which looks like a Hessian matrix of second derivatives describing the curvature. I am confused about the relationship between the covariance matrix and the Hessian matrix. I hope to hear some explanations about this question. A simple example would be appreciated.
Answer
You should first check out the question "Basic question about Fisher Information matrix and relationship to Hessian and standard errors".
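Suppose we have a statistical model (family of distributions) $\{f_\theta : \theta \in \Theta\}$. In the most general case we have $\dim(\Theta) = d$, so this family is parameterized by $\theta = (\theta_1, \dots, \theta_d)^T$. Under certain regularity conditions, we have
$$I_{i,j}(\theta) = -E_\theta\left[\frac{\partial^2 l(X;\theta)}{\partial\theta_i \, \partial\theta_j}\right] = -E_\theta\left[H_{i,j}(l(X;\theta))\right]$$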
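where $I_{i,j}(\theta)$ is the $(i,j)$ entry of the Fisher information matrix (as a function of $\theta$), $H_{i,j}$ is the $(i,j)$ entry of the Hessian with respect to $\theta$, $X$ is the observed value (sample), and
$$l(X;\theta) = \ln(f_\theta(X)), \quad \text{for some } \theta \in \Theta$$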
So the Fisher information matrix is the negated expected value of the Hessian of the log-probability (the log-likelihood) under some $\theta$.
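As a small illustration of this identity, here is a sketch (not part of the original answer) that assumes a normal model $N(\mu, \sigma^2)$ with $\theta = (\mu, \sigma^2)$ and a single observation $X$. The function name `hessian_log_density` and the chosen parameter values are made up for the example. It averages the analytic Hessian of the log-density over simulated draws and compares the negated average with the well-known closed form $\mathrm{diag}(1/\sigma^2,\ 1/(2\sigma^4))$.

```python
import numpy as np

def hessian_log_density(x, mu, sigma2):
    """Hessian of l(x; theta) = ln f_theta(x) for N(mu, sigma2), theta = (mu, sigma2).

    x is a 1-D array of observations; the result has shape (len(x), 2, 2).
    """
    r = x - mu
    h_mm = -np.ones_like(x) / sigma2                 # d^2 l / d mu^2
    h_ms = -r / sigma2**2                            # d^2 l / d mu d sigma2
    h_ss = 0.5 / sigma2**2 - r**2 / sigma2**3        # d^2 l / d (sigma2)^2
    row_mu = np.stack([h_mm, h_ms], axis=-1)
    row_s2 = np.stack([h_ms, h_ss], axis=-1)
    return np.stack([row_mu, row_s2], axis=-2)

# Illustrative parameter values (assumptions for this example only).
rng = np.random.default_rng(0)
mu, sigma2 = 1.0, 4.0
x = rng.normal(mu, np.sqrt(sigma2), size=200_000)

# Monte Carlo estimate of I(theta) = -E_theta[Hessian of the log-density].
fisher_mc = -hessian_log_density(x, mu, sigma2).mean(axis=0)

# Closed form for one observation from the normal model.
fisher_exact = np.diag([1.0 / sigma2, 1.0 / (2.0 * sigma2**2)])

print(fisher_mc)
print(fisher_exact)
```

The two printed matrices should agree up to Monte Carlo noise; the off-diagonal entries are close to zero because the expectation of $X - \mu$ vanishes.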
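Now let's say we want to estimate some vector function of the unknown parameter, $\psi(\theta)$. Usually it is desired that the estimator $T(X) = (T_1(X), \dots, T_d(X))$ should be unbiased, i.e.
$$\forall \theta \in \Theta \quad E_\theta[T(X)] = \psi(\theta)$$
The Cramér-Rao Lower Bound (CRLB) states that for every unbiased $T(X)$ the covariance matrix $\mathrm{cov}_\theta(T(X))$ satisfies
$$\mathrm{cov}_\theta(T(X)) \ge \frac{\partial\psi(\theta)}{\partial\theta} \, I^{-1}(\theta) \left(\frac{\partial\psi(\theta)}{\partial\theta}\right)^T = B(\theta)$$
where $A \ge B$ for matrices means that $A - B$ is positive semi-definite, and $\frac{\partial\psi(\theta)}{\partial\theta}$ is simply the Jacobian $J_{i,j}(\psi)$. Note that if we estimate $\theta$ itself, that is $\psi(\theta) = \theta$, the above simplifies to
$$\mathrm{cov}_\theta(T(X)) \ge I^{-1}(\theta)$$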
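To make the bound concrete, the following sketch computes $B(\theta)$ for an assumed example: $X$ is a sample of $n$ i.i.d. draws from $N(\mu, \sigma^2)$ with $\theta = (\mu, \sigma^2)$, so the Fisher information of the whole sample is $n \cdot \mathrm{diag}(1/\sigma^2,\ 1/(2\sigma^4))$. The helper name `crlb`, the values of $n$ and $\sigma^2$, and the choice $\psi(\theta) = (\mu, \sigma)$ are illustrative assumptions, not something taken from the answer.

```python
import numpy as np

def crlb(jacobian, fisher):
    """Return B(theta) = J I^{-1}(theta) J^T for a given Jacobian J of psi."""
    return jacobian @ np.linalg.inv(fisher) @ jacobian.T

# Assumed setting: n i.i.d. observations from N(mu, sigma2), theta = (mu, sigma2).
sigma2, n = 4.0, 50
fisher_n = n * np.diag([1.0 / sigma2, 1.0 / (2.0 * sigma2**2)])

# Case psi(theta) = theta: the Jacobian is the identity, so B = I^{-1}(theta).
bound_theta = crlb(np.eye(2), fisher_n)

# Case psi(theta) = (mu, sigma) with sigma = sqrt(sigma2):
# d sigma / d sigma2 = 1 / (2 * sigma).
jac = np.array([[1.0, 0.0],
                [0.0, 0.5 / np.sqrt(sigma2)]])
bound_psi = crlb(jac, fisher_n)

print(bound_theta)  # diagonal entries: sigma2 / n and 2 * sigma2**2 / n
print(bound_psi)    # diagonal entries: sigma2 / n and sigma2 / (2 * n)
```

Changing the target from $\sigma^2$ to $\sigma$ only rescales the bound through the Jacobian; the Fisher information itself is unchanged.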
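But what does it really tell us? For example, recall that
$$\mathrm{var}_\theta(T_i(X)) = [\mathrm{cov}_\theta(T(X))]_{i,i}$$
and that for every positive semi-definite matrix $A$ the diagonal elements are non-negative:
$$\forall i \quad A_{i,i} \ge 0$$
Applying this to $A = \mathrm{cov}_\theta(T(X)) - B(\theta)$, we can conclude that the variance of each estimated element is bounded below by the corresponding diagonal element of the matrix $B(\theta)$:
$$\forall i \quad \mathrm{var}_\theta(T_i(X)) \ge [B(\theta)]_{i,i}$$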
So the CRLB doesn't tell us the variance of our estimator; it gives a benchmark for judging whether our estimator is optimal: an unbiased estimator whose covariance attains the bound has the lowest covariance among all unbiased estimators.
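A quick simulation can show how this comparison works in practice. The sketch below is an illustration under assumed settings (normal data, $n = 10$, and the pair of unbiased estimators "sample mean" and "sample variance with divisor $n - 1$"); none of these choices come from the answer. It estimates the covariance matrix of the estimators by repeated sampling and compares it with the bound $I^{-1}(\theta) = \mathrm{diag}(\sigma^2/n,\ 2\sigma^4/n)$.

```python
import numpy as np

# Illustrative settings (assumptions for this example only).
rng = np.random.default_rng(0)
mu, sigma2, n, reps = 1.0, 4.0, 10, 200_000

# Each row is one sample X of size n; each sample yields one estimate T(X).
samples = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
estimates = np.column_stack([
    samples.mean(axis=1),          # unbiased estimator of mu
    samples.var(axis=1, ddof=1),   # unbiased estimator of sigma2
])

# Empirical covariance of T(X) across the repeated samples.
emp_cov = np.cov(estimates, rowvar=False)

# CRLB for psi(theta) = theta with n i.i.d. normal observations.
bound = np.diag([sigma2 / n, 2.0 * sigma2**2 / n])

# cov - B should be positive semi-definite, up to simulation noise.
gap = emp_cov - bound
min_eig = np.linalg.eigvalsh(gap).min()

print(emp_cov)
print(bound)
print(min_eig)
```

In this setting the sample mean attains its bound $\sigma^2/n$, while the unbiased sample variance has variance $2\sigma^4/(n-1)$, which lies strictly above $2\sigma^4/n$. The smallest eigenvalue of the gap should therefore be close to zero rather than clearly negative.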
Attribution
Source : Link , Question Author : user122358 , Answer Author : Community