Estimating parameters using maximum likelihood estimation (MLE) involves evaluating the likelihood function, which maps values of the parameter $\theta$ to the probability of observing the sample $x$ under a given distribution family, i.e. $P(X=x\mid\theta)$ viewed as a function of $\theta$ (note: am I right on this?). All examples I've seen calculate $P(X=x\mid\theta)$ by taking the product of $f(x_j)$, where $f$ is the density with the local value of $\theta$ and $x$ is the sample (a vector).

Since we're just multiplying over the data, does it follow that the data must be independent? E.g., could we not use MLE to fit time-series data? Or do just the parameters have to be independent?

**Answer**

The likelihood function is defined as the probability of an event $E$ (the data set $x$) viewed as a function of the model parameters $\theta$:

$$L(\theta;x)\propto P(\text{Event } E;\theta)=P(\text{observing } x;\theta).$$

Therefore, there is no assumption of independence of the observations. In the classical approach there is no definition of *independence* of parameters, since they are not random variables; some related concepts are identifiability, parameter orthogonality, and independence of the maximum likelihood estimators (which are random variables).

Some examples:

(1). **Discrete case**. $x=(x_1,\dots,x_n)$ is a sample of (independent) discrete observations with $P(\text{observing } x_j;\theta)>0$; then

$$L(\theta;x)\propto\prod_{j=1}^n P(\text{observing } x_j;\theta).$$

In particular, if $x_j\sim\text{Binomial}(N,\theta)$, with $N$ known, we have that

$$L(\theta;x)\propto\prod_{j=1}^n \theta^{x_j}(1-\theta)^{N-x_j}.$$
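As a concrete sketch (the sample values below are made up for illustration), the binomial log-likelihood can be maximised numerically over a grid, and the maximiser agrees with the closed-form MLE $\hat\theta=\sum_j x_j/(nN)$:

```python
import numpy as np

# Hypothetical sample: n = 5 draws from Binomial(N = 10, theta)
N = 10
x = np.array([3, 4, 2, 5, 4])

def log_lik(theta):
    # log of prod_j theta^{x_j} (1 - theta)^{N - x_j}; the binomial
    # coefficients are constant in theta, so they are dropped
    return np.sum(x) * np.log(theta) + np.sum(N - x) * np.log(1 - theta)

# Grid search over (0, 1)
grid = np.linspace(0.001, 0.999, 999)
theta_hat = grid[np.argmax([log_lik(t) for t in grid])]

# Closed form: sum(x) / (n * N) = 18 / 50 = 0.36
closed_form = x.sum() / (len(x) * N)
```

A grid search is used only to keep the sketch dependency-free; setting the derivative of the log-likelihood to zero gives the closed form directly.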

(2). **Continuous approximation**. Let $x=(x_1,\dots,x_n)$ be a sample from a continuous random variable $X$, with distribution $F$ and density $f$, observed with measurement error $\epsilon$; that is, you observe the sets $(x_j-\epsilon,x_j+\epsilon)$. Then

$$L(\theta;x)\propto\prod_{j=1}^n P[\text{observing } (x_j-\epsilon,x_j+\epsilon);\theta]=\prod_{j=1}^n\left[F(x_j+\epsilon;\theta)-F(x_j-\epsilon;\theta)\right].$$

When $\epsilon$ is small, this can be approximated (using the Mean Value Theorem) by

$$L(\theta;x)\propto\prod_{j=1}^n f(x_j;\theta).$$

For an example with the normal case, take a look at this.
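For the normal case, the log-likelihood under this density approximation is just the sum of log densities, and the closed-form MLEs (the sample mean and the *biased* sample standard deviation) maximise it. A minimal sketch with made-up data:

```python
import numpy as np

# Hypothetical sample, assumed drawn from N(mu, sigma^2)
x = np.array([1.2, 0.8, 1.5, 1.1, 0.9, 1.3])

def log_lik(mu, sigma):
    # sum_j log f(x_j; mu, sigma), with f the normal density
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (x - mu)**2 / (2 * sigma**2))

# Closed-form MLEs: sample mean and the MLE (biased, divide-by-n)
# standard deviation
mu_hat = x.mean()
sigma_hat = np.sqrt(np.mean((x - mu_hat)**2))
```

Note that the MLE of $\sigma^2$ divides by $n$, not $n-1$, so it differs from the usual unbiased sample variance.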

(3). **Dependent observations and Markov models**. Suppose that $x=(x_1,\dots,x_n)$ is a set of possibly dependent observations and let $f$ be the joint density of $x$; then

$$L(\theta;x)\propto f(x;\theta).$$

If, additionally, the Markov property is satisfied, then

$$L(\theta;x)\propto f(x;\theta)=f(x_1;\theta)\prod_{j=1}^{n-1}f(x_{j+1}\mid x_j;\theta).$$

Also take a look at this.
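For a simple dependent example, consider a two-state Markov chain: the likelihood factorises exactly as above, and (conditioning on the first observation, i.e. dropping the $f(x_1;\theta)$ term) maximising it gives the transition probabilities as row-normalised transition counts. A sketch with a made-up binary sequence:

```python
import numpy as np

# Hypothetical binary chain, states 0 and 1
x = [0, 0, 1, 1, 1, 0, 1, 0, 0, 1]

# counts[a, b] = number of j with x_j = a and x_{j+1} = b
counts = np.zeros((2, 2))
for a, b in zip(x[:-1], x[1:]):
    counts[a, b] += 1

# MLE of the transition matrix (conditional likelihood, given x_1):
# each row of counts normalised to sum to 1
P_hat = counts / counts.sum(axis=1, keepdims=True)
```

The consecutive factors $f(x_{j+1}\mid x_j;\theta)$ are genuinely dependent on each other's arguments, yet the likelihood is still well defined; no independence of the observations is needed.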

**Attribution**
*Source: Link, Question Author: Felix, Answer Author: Community*