Does MLE require i.i.d. data? Or just independent parameters?

Estimating parameters using maximum likelihood estimation (MLE) involves evaluating the likelihood function, which maps the probability of the sample (X) occurring to values (x) on the parameter space (θ) given a distribution family (P(X=x|θ)) over possible values of θ (note: am I right on this?). All examples I’ve seen involve calculating P(X=x|θ) by taking the product of F(X), where F is the distribution with the local value for θ and X is the sample (a vector).

Since we’re just multiplying the data, does it follow that the data must be independent? E.g., could we not use MLE to fit time-series data? Or do the parameters just have to be independent?

Answer

The likelihood function is defined as the probability of an event E (data set x) as a function of the model parameters θ:

$$L(\theta; x) \propto P(\text{Event } E; \theta) = P(\text{observing } x; \theta).$$

Therefore, there is no assumption of independence of the observations. In the classical approach there is no definition for independence of parameters since they are not random variables; some related concepts could be identifiability, parameter orthogonality, and independence of the Maximum Likelihood Estimators (which are random variables).

Some examples:

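(1). Discrete case. If $x = (x_1, \ldots, x_n)$ is a sample of (independent) discrete observations with $P(\text{observing } x_j; \theta) > 0$, then

$$L(\theta; x) \propto \prod_{j=1}^{n} P(\text{observing } x_j; \theta).$$

In particular, if $x_j \sim \text{Binomial}(N, \theta)$, with $N$ known, we have that

$$L(\theta; x) \propto \prod_{j=1}^{n} \theta^{x_j} (1 - \theta)^{N - x_j}.$$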
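As a rough numerical check of the binomial case, the sketch below evaluates the log of the product above on a grid of θ values and compares the grid maximizer with the closed-form maximizer, total successes over total trials. The data `xs`, the value of `N`, and the helper name `binomial_log_likelihood` are made up for illustration.

```python
import math

def binomial_log_likelihood(theta, xs, N):
    """Log of prod_j theta^x_j * (1 - theta)^(N - x_j); binomial coefficients are dropped."""
    return sum(x * math.log(theta) + (N - x) * math.log(1.0 - theta) for x in xs)

# Hypothetical data: n = 5 independent counts, each out of N = 10 trials.
xs = [3, 5, 4, 6, 2]
N = 10

# Crude grid search over the interior of (0, 1).
grid = [k / 1000 for k in range(1, 1000)]
theta_grid = max(grid, key=lambda t: binomial_log_likelihood(t, xs, N))

# Closed-form maximizer: total successes divided by total trials.
theta_closed = sum(xs) / (len(xs) * N)

print(theta_grid, theta_closed)
```

Dropping the binomial coefficients is harmless here because they do not depend on θ, which is exactly what the proportionality sign allows.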

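(2). Continuous approximation. Let $x = (x_1, \ldots, x_n)$ be an (independent) sample from a continuous random variable $X$, with distribution $F$ and density $f$, with measurement error $\epsilon$; that is, you observe the sets $(x_j - \epsilon, x_j + \epsilon)$. Then

$$L(\theta; x) \propto \prod_{j=1}^{n} P[\text{observing } (x_j - \epsilon, x_j + \epsilon); \theta] = \prod_{j=1}^{n} \left[ F(x_j + \epsilon; \theta) - F(x_j - \epsilon; \theta) \right].$$

When $\epsilon$ is small, this can be approximated (using the Mean Value Theorem) by

$$L(\theta; x) \propto \prod_{j=1}^{n} f(x_j; \theta).$$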

For an example with the normal case, take a look at this.
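To see the approximation at work, here is a small sketch for a normal model, assuming made-up data `xs`, a made-up measurement error `eps`, and one arbitrary parameter value `(mu, sigma)`. It compares the interval-based log-likelihood with the density-based one; the two should differ only by the constant n·log(2ϵ), which does not involve the parameters, plus a small approximation error.

```python
import math

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def interval_log_likelihood(mu, sigma, xs, eps):
    """Log of prod_j [F(x_j + eps) - F(x_j - eps)]."""
    return sum(
        math.log(normal_cdf(x + eps, mu, sigma) - normal_cdf(x - eps, mu, sigma))
        for x in xs
    )

def density_log_likelihood(mu, sigma, xs):
    """Log of prod_j f(x_j)."""
    return sum(math.log(normal_pdf(x, mu, sigma)) for x in xs)

# Hypothetical observations and measurement error.
xs = [0.8, 1.9, 1.1, 2.4, 1.5]
eps = 1e-3
mu, sigma = 1.5, 0.7  # an arbitrary parameter value at which to compare

exact = interval_log_likelihood(mu, sigma, xs, eps)
# Each interval probability is roughly 2 * eps * f(x_j), hence the additive constant.
approx = density_log_likelihood(mu, sigma, xs) + len(xs) * math.log(2.0 * eps)
gap = abs(exact - approx)

print(exact, approx, gap)
```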

(3). Dependent and Markov model. Suppose that $x = (x_1, \ldots, x_n)$ is a set of possibly dependent observations and let $f$ be the joint density of $x$, then

$$L(\theta; x) \propto f(x; \theta).$$

If additionally the Markov property is satisfied, then

$$L(\theta; x) \propto f(x; \theta) = f(x_1; \theta) \prod_{j=1}^{n-1} f(x_{j+1} \mid x_j; \theta).$$

Also take a look at this.
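To connect the Markov factorization with the time-series question, here is a minimal sketch that fits a stationary Gaussian AR(1) model by maximum likelihood. The model choice, the known innovation variance of 1, the simulated data, and the names `ar1_log_likelihood` and `true_phi` are all assumptions made for illustration. The log-likelihood is the log of the initial density plus the sum of the log transition densities, and it is maximized over the autoregressive coefficient by a crude grid search.

```python
import math
import random

def normal_logpdf(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def ar1_log_likelihood(phi, xs, sigma2=1.0):
    """log f(x_1; phi) + sum_j log f(x_{j+1} | x_j; phi) for a stationary Gaussian AR(1)."""
    # Initial term: the stationary distribution N(0, sigma2 / (1 - phi^2)).
    ll = normal_logpdf(xs[0], 0.0, sigma2 / (1.0 - phi ** 2))
    # Transition terms: x_{j+1} | x_j ~ N(phi * x_j, sigma2).
    for prev, curr in zip(xs, xs[1:]):
        ll += normal_logpdf(curr, phi * prev, sigma2)
    return ll

# Simulate dependent data from the assumed model.
random.seed(0)
true_phi = 0.6
n = 2000
xs = [random.gauss(0.0, math.sqrt(1.0 / (1.0 - true_phi ** 2)))]
for _ in range(n - 1):
    xs.append(true_phi * xs[-1] + random.gauss(0.0, 1.0))

# Grid search over the stationary region (-1, 1).
grid = [k / 100 for k in range(-99, 100)]
phi_hat = max(grid, key=lambda p: ar1_log_likelihood(p, xs))

print(phi_hat)
```

The observations are clearly not independent, yet the likelihood is still a product, only of conditional densities rather than marginal ones.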

Attribution
Source : Link , Question Author : Felix , Answer Author : Community
