I know that the statement in question is wrong because estimators cannot have asymptotic variances that are lower than the Cramér-Rao bound.
However, if asymptotic consistency means that an estimator converges in probability to a value, then doesn’t this also mean that its variance goes to 0?
Where in this train of thought am I wrong?
Answer
Convergence of a sequence of random variables in probability does not imply convergence of their variances, nor even that their variances get anywhere near $0.$ In fact, their means may converge to a constant yet their variances can still diverge.
Examples and counterexamples
Construct counterexamples by creating ever-rarer events that are increasingly far from the mean: the squared distance from the mean can overwhelm the decreasing probability and make the variance do anything (as I will proceed to show).
For instance, scale a Bernoulli$(1/n)$ variate by $n^{p}$ for some power $p$ to be determined. That is, define the sequence of random variables $X_n$ by
$$\begin{aligned}
&\Pr(X_n=n^{p})=1/n \\
&\Pr(X_n=0)= 1 - 1/n.
\end{aligned}$$
As $n\to \infty$, because $\Pr(X_n=0)\to 1$ this converges in probability to $0;$ its expectation $n^{p-1}$ even converges to $0$ provided $p\lt 1;$ but for $p\gt 1/2$ its variance $n^{2p-1}(1-1/n)$ diverges.
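A minimal numerical sketch of both claims, using NumPy (the choice $p=0.75$ is just one illustrative value in $(1/2,1)$): $X_n$ equals $0$ with probability approaching $1$, while the exact variance $n^{2p-1}(1-1/n)$ grows without bound.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.75  # any p in (1/2, 1): convergence in probability to 0, diverging variance

for n in (10, 100, 1000, 10_000):
    # Exact moments from the two-point distribution of X_n
    mean = n ** (p - 1)                    # E[X_n] = n^p * (1/n)
    var = n ** (2 * p - 1) * (1 - 1 / n)   # Var(X_n) = n^{2p}/n - (n^{p-1})^2
    # Empirical check of convergence in probability to 0
    x = np.where(rng.random(100_000) < 1 / n, float(n) ** p, 0.0)
    frac_zero = np.mean(x == 0.0)
    print(f"n={n:>6}  P(X_n=0)~{frac_zero:.4f}  E[X_n]={mean:.4f}  Var(X_n)={var:.2f}")
```

The fraction of zeros climbs toward $1$ while the variance column climbs without limit.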
Comments
Many other behaviors are possible:

* Because negative powers $2p-1$ of $n$ converge to $0,$ the variance
converges to $0$ for $p\lt 1/2:$ the variables “squeeze down” to $0$
in some sense.

* An interesting edge case is $p=1/2,$ for which the variance converges
to $1.$

* By varying $p$ above and below $1/2$ depending on $n$ you can even
make the variance not converge at all. For instance, let $p(n)=0$
for even $n$ and $p(n)=1$ for odd $n.$
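All three behaviors can be read off the closed form $\operatorname{Var}(X_n)=n^{2p-1}(1-1/n)$; a quick Python sketch (the specific values of $p$ and $n$ are arbitrary illustrations):

```python
# Exact variance of X_n computed from its two-point distribution
def var_xn(n, p):
    return n ** (2 * p - 1) * (1 - 1 / n)

# p < 1/2: variance -> 0;  p = 1/2: variance -> 1;  p > 1/2: variance -> infinity
for p in (0.25, 0.5, 0.75):
    print(f"p={p}:", [round(var_xn(n, p), 4) for n in (10, 100, 10_000, 1_000_000)])

# Alternating exponent: variance ~ 1/n for even n but ~ n for odd n, so no limit
p_of = lambda n: 0 if n % 2 == 0 else 1
print("p(n) alternating:", [round(var_xn(n, p_of(n)), 4) for n in (10, 11, 1000, 1001)])
```

The alternating case jumps between values near $0$ and values near $n$, so the sequence of variances has no limit at all.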
A direct connection with estimation
Finally, a reasonable possible objection is that abstract sequences of random variables are not really “estimators” of anything. But they can nevertheless be involved in estimation. For instance, let $t_n$ be a sequence of statistics, intended to estimate some numerical property $\theta(F)$ of the common distribution of an (arbitrarily large) iid random sample $(Y_1,Y_2,\ldots,Y_n,\ldots)$ of $F.$ This induces a sequence of random variables
$$T_n = t_n(Y_1,Y_2,\ldots,Y_n).$$
Modify this sequence by choosing any value of $p$ (as above) you like and set
$$T^\prime_n = T_n + (X_n - n^{p-1}).$$
The parenthesized term makes a zero-mean adjustment to $T_n,$ so that if $T_n$ is a reasonable estimator of $\theta(F),$ then so is $T^\prime_n.$ (With some imagination we can conceive of situations where $T_n^\prime$ could yield better estimates than $T_n$ with probability close to $1.$) However, if you make the $X_n$ independent of $Y_1,\ldots, Y_n,$ the variance of $T^\prime_n$ will be the sum of the variances of $T_n$ and $X_n,$ which you can thereby cause to diverge.
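A concrete Monte Carlo sketch of this construction, under illustrative assumptions (take $T_n$ to be the sample mean of a Normal$(\theta,1)$ sample, $\theta=5$, $p=3/4$, with $X_n$ drawn independently of the data): $T^\prime_n$ stays centered on $\theta$ while its variance blows up.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.75
theta = 5.0    # true value of theta(F); here F is Normal(theta, 1), for illustration
reps = 20_000  # Monte Carlo replications per sample size

for n in (100, 1000):
    y = rng.normal(theta, 1.0, size=(reps, n))
    t_n = y.mean(axis=1)                      # ordinary estimator: the sample mean
    # X_n drawn independently of Y_1, ..., Y_n, as in the text
    x_n = np.where(rng.random(reps) < 1 / n, float(n) ** p, 0.0)
    t_prime = t_n + (x_n - n ** (p - 1))      # zero-mean adjustment
    print(f"n={n}: mean(T')={t_prime.mean():.3f}  "
          f"Var(T)={t_n.var():.5f}  Var(T')={t_prime.var():.2f}")
```

Here $\operatorname{Var}(T_n)=1/n$ shrinks while $\operatorname{Var}(T^\prime_n)\approx 1/n + n^{2p-1}(1-1/n)$ grows, even though both remain consistent for $\theta$ (since $X_n - n^{p-1}\to 0$ in probability for $p\lt 1$).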
Attribution
Source: Link, Question Author: Heisenberg, Answer Author: whuber