**Answer**

It may be that the math notation is somehow intimidating, but it’s not that daunting. Take a look at what happens with an unbiased estimator, such as the sample mean $\bar X$:

$$\operatorname{E}\left[\bar X\right] - \theta = 0$$

The difference between the expectation of the means of samples drawn from a population with mean $\theta$ and that population parameter, $\theta$, itself is zero, because the sample means are distributed around the population mean. Virtually none of them will equal the population mean exactly, but the expectation of the sample mean is exactly the population mean.
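A quick simulation (my addition, not part of the original answer; it uses NumPy and an arbitrary normal population with mean 5) illustrates this: individual sample means scatter around the population mean, yet their average lands essentially on top of it.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 5.0  # population mean (assumed for the demo)

# Draw many samples of size 30 and record each sample's mean
sample_means = np.array([
    rng.normal(loc=theta, scale=2.0, size=30).mean()
    for _ in range(100_000)
])

# Each individual sample mean misses theta...
print(sample_means.min(), sample_means.max())
# ...but the mean of the sample means is essentially theta itself
print(sample_means.mean())  # very close to 5.0
```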

This is not the case for other parameters, such as the variance: the variance observed in a sample tends to be too small compared with the true variance. So if we want to estimate the population variance from the sample, we divide by $n-1$ instead of $n$ (Bessel’s correction) to correct the bias of the sample variance as an estimator of the population variance:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar x\right)^2$$
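A small simulation (again my addition, using NumPy with an assumed population variance of 4) makes the bias visible: dividing by $n$ systematically underestimates the true variance, while Bessel’s correction removes the bias on average.

```python
import numpy as np

rng = np.random.default_rng(1)
true_var = 4.0  # population variance (sigma = 2, assumed for the demo)
n = 10          # small sample size, where the bias is most visible

# For many samples, compute the variance both ways
biased, corrected = [], []
for _ in range(100_000):
    x = rng.normal(loc=0.0, scale=2.0, size=n)
    biased.append(np.var(x, ddof=0))     # divide by n
    corrected.append(np.var(x, ddof=1))  # divide by n-1 (Bessel)

print(np.mean(biased))     # about (n-1)/n * true_var = 3.6: too small
print(np.mean(corrected))  # about 4.0: unbiased
```

The averaged `ddof=0` estimate sits near $\frac{n-1}{n}\sigma^2$, exactly the shrinkage Bessel’s correction undoes.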

In both instances, the sample is governed by the population parameter $\theta$, which explains the subscript $p(X\mid\theta)$ in the defining equation:

$$\operatorname{Bias}\left[\bar\theta\right] = \operatorname{E}_{p(X\mid\theta)}\left[\bar\theta\right] - \theta$$

However, $\bar\theta$ steers away from $\theta$ when the estimator is biased.

**Attribution**
*Source: Link, Question Author: N.Der, Answer Author: Antoni Parellada*