Suppose we estimate a quantity $\theta_0$ by $\tilde{\theta} = \hat{\theta}(\eta_0)$, the solution of the estimating equation

$$S_n(\tilde{\theta}, \eta_0) = 0$$

where $\eta_0$ is a nuisance parameter that is known. Suppose that the assumptions of the M-estimator are satisfied, and

$$\tilde{\theta} \xrightarrow{p}\ \theta_0$$

so that consistency is achieved.

Question: Suppose we do not know $\eta_0$, but we have a consistent estimator $\hat{\eta}$ of $\eta_0$. If we now set $\hat{\theta} = \hat{\theta}(\hat{\eta})$, under which conditions is $\hat{\theta}$ consistent?

Clearly, if $\hat{\eta}$ is itself obtained through an estimating equation, we can stack all the estimating equations and obtain consistency automatically.

However, what if $\hat{\eta}$ is not obtained through an estimating equation?
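To make the setup concrete, here is a minimal simulation sketch of a plug-in estimator $\hat{\theta}(\hat{\eta})$. Everything in it is an assumption chosen for illustration and not part of the question: the model ($X \sim N(\eta_0 + \theta_0, 1)$ with an auxiliary sample $Z \sim N(\eta_0, 1)$), the score $S_n(\theta,\eta) = \frac{1}{n}\sum_i \tanh(X_i - \eta - \theta)$, and the function names. The sample median of $Z$ stands in for any consistent $\hat{\eta}$, however it was obtained.

```python
import math
import random
import statistics


def s_n(theta, eta, xs):
    """Hypothetical score: S_n(theta, eta) = mean(tanh(x_i - eta - theta)).

    It is strictly decreasing in theta, so it has a unique root.
    """
    return sum(math.tanh(x - eta - theta) for x in xs) / len(xs)


def solve_theta(eta, xs, iters=50):
    """Find the root of theta -> S_n(theta, eta) by bisection."""
    # At lo every summand is >= 0, at hi every summand is <= 0.
    lo, hi = min(xs) - eta, max(xs) - eta
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if s_n(mid, eta, xs) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def plug_in_estimate(n, seed=0, eta0=1.0, theta0=2.0):
    """Return (theta_hat, eta_hat) for one simulated data set of size n."""
    rng = random.Random(seed)
    zs = [rng.gauss(eta0, 1.0) for _ in range(n)]           # auxiliary sample for eta
    xs = [rng.gauss(eta0 + theta0, 1.0) for _ in range(n)]  # main sample
    eta_hat = statistics.median(zs)  # consistent for eta0 in this assumed model
    theta_hat = solve_theta(eta_hat, xs)
    return theta_hat, eta_hat
```

Calling `plug_in_estimate(n)` for increasing `n` should show $\hat{\theta}$ settling near the assumed $\theta_0 = 2$, which is the behaviour the question asks to justify in general.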

**Answer**

**Background**: For consistency when $\eta_0$ is known, we typically need a function $S(\theta, \eta)$ such that, for every $\delta > 0$, we have

$$\sup_{\theta \in \Theta} \frac{| S_n(\theta,\eta_0) - S(\theta,\eta_0) |}{1 + | S_n(\theta,\eta_0)| + |S(\theta,\eta_0) |} \xrightarrow{p}\ 0$$

$$\inf_{|\theta - \theta_0| > \delta}| S(\theta,\eta_0) | > 0 = |S(\theta_0, \eta_0)|$$

with $S_n(\tilde{\theta},\eta_0) = o_p(1)$.

Note that a more restrictive version of the first assumption is

$$\sup_{\theta \in \Theta} | S_n(\theta,\eta_0) - S(\theta,\eta_0) | \xrightarrow{p}\ 0$$
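As a numerical illustration of this stronger uniform convergence condition, the sketch below computes the supremum exactly in one case where that is possible. The model and score are assumptions made up for the example: $X \sim N(\eta_0 + \theta_0, 1)$ and $S_n(\theta,\eta) = \frac{1}{n}\sum_i 1\{X_i - \eta \le \theta\} - \tfrac12$, so that $S(\theta,\eta_0) = \Phi(\theta - \theta_0) - \tfrac12$ and the supremum is the Kolmogorov-Smirnov distance between the empirical and true distribution functions of $X - \eta_0$.

```python
import math
import random


def normal_cdf(t):
    """Standard normal distribution function Phi(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def sup_gap(n, seed=0, eta0=1.0, theta0=2.0):
    """Exact sup over theta of |S_n(theta, eta0) - S(theta, eta0)|.

    Assumed model: X ~ N(eta0 + theta0, 1), with
      S_n(theta, eta) = mean(1{X_i - eta <= theta}) - 1/2
      S(theta, eta0)  = Phi(theta - theta0) - 1/2
    The constant 1/2 cancels, leaving |ECDF - CDF| of X - eta0.
    """
    rng = random.Random(seed)
    r = sorted(rng.gauss(eta0 + theta0, 1.0) - eta0 for _ in range(n))
    gap = 0.0
    for i, t in enumerate(r):
        f = normal_cdf(t - theta0)
        # The ECDF jumps from i/n to (i+1)/n at t, so check both sides.
        gap = max(gap, abs((i + 1) / n - f), abs(i / n - f))
    return gap
```

For this particular score the supremum shrinks at roughly the $1/\sqrt{n}$ rate (Glivenko-Cantelli), so `sup_gap(n)` should be visibly smaller for larger `n`.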

From the infimum condition, for any $\delta > 0$ there is an $\epsilon > 0$ such that

$$ P\left( \left| \tilde{\theta} - \theta_0 \right| > \delta \right) \le P\left( \left| S(\tilde{\theta},\eta_0) \right| \ge \epsilon \right) $$

Consistency can then be proved through

$$ \begin{align}| S(\tilde{\theta},\eta_0) | &\le | S_n(\tilde{\theta}, \eta_0) | + |S(\tilde{\theta},\eta_0) - S_n(\tilde{\theta}, \eta_0) | \\
&\le o_p(1) + o_p(1+|S_n(\tilde{\theta},\eta_0)| + |S(\tilde{\theta}, \eta_0)|) \\
&= o_p(1 + |S(\tilde{\theta}, \eta_0)|) = o_p(1) \end{align} $$

Hence $P\left( | S(\tilde{\theta},\eta_0) | \ge \epsilon \right) \to 0$ which proves consistency.

**Solution**:

Suppose that in addition to the previous assumptions, *either*

(1) $S_n(\theta,\eta)$ is stochastically continuous uniformly in $\theta$ with respect to $\eta$ at $\eta_0$

or

(2) $S(\theta,\eta)$ is continuous uniformly in $\theta$ with respect to $\eta$ at $\eta_0$

with $S_n(\hat{\theta},\hat{\eta}) = o_p(1)$.

If **(1)** is true the proof is trivial, with

$$ \begin{align} |S_n(\hat{\theta},\eta_0)| &\le |S_n(\hat{\theta},\hat{\eta})| + |S_n(\hat{\theta},\eta_0) - S_n(\hat{\theta},\hat{\eta})| \\
&\le |S_n(\hat{\theta},\hat{\eta})| + \sup_{\theta \in \Theta}|S_n(\theta,\hat{\eta}) - S_n(\theta,\eta_0)| \\
&= o_p(1) + o_p(1) = o_p(1)
\end{align} $$

where the first term is $o_p(1)$ by the definition of $\hat{\theta}$ and the supremum is $o_p(1)$ because of (1) and the consistency of $\hat{\eta}$.

We conclude that $\hat{\theta}$ also satisfies $S_n(\hat{\theta},\eta_0) = o_p(1)$, so the argument in the background section applies directly.
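Condition (1) can be checked numerically in simple cases. The sketch below is only an illustration under assumptions of my own: the score $S_n(\theta,\eta) = \frac{1}{n}\sum_i \tanh(X_i - \eta - \theta)$, the normal model, the sample median as $\hat{\eta}$, and a finite grid standing in for $\Theta$. Because $\tanh$ is 1-Lipschitz, this particular score satisfies $\sup_\theta |S_n(\theta,\hat{\eta}) - S_n(\theta,\eta_0)| \le |\hat{\eta} - \eta_0|$, so (1) follows from the consistency of $\hat{\eta}$.

```python
import math
import random
import statistics


def s_n(theta, eta, xs):
    """Hypothetical score: mean(tanh(x_i - eta - theta))."""
    return sum(math.tanh(x - eta - theta) for x in xs) / len(xs)


def condition_one_gap(n, seed=0, eta0=1.0, theta0=2.0):
    """Return (grid sup of |S_n(., eta_hat) - S_n(., eta0)|, |eta_hat - eta0|)."""
    rng = random.Random(seed)
    zs = [rng.gauss(eta0, 1.0) for _ in range(n)]           # auxiliary sample for eta
    xs = [rng.gauss(eta0 + theta0, 1.0) for _ in range(n)]  # main sample
    eta_hat = statistics.median(zs)
    # Finite grid on [theta0 - 3, theta0 + 3]; an approximation to sup over Theta.
    grid = [theta0 - 3.0 + 0.25 * k for k in range(25)]
    gap = max(abs(s_n(t, eta_hat, xs) - s_n(t, eta0, xs)) for t in grid)
    return gap, abs(eta_hat - eta0)
```

The first returned value never exceeds the second (up to rounding), and both shrink as `n` grows. For scores that are not Lipschitz in $\eta$, condition (1) has to be verified by other means.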

If **(2)** is true, the infimum condition combined with (2) gives, for any $\delta > 0$, some $\epsilon_1 > 0$ and $\epsilon_2 > 0$ such that

$$\inf_{\theta :|\theta-\theta_0| > \delta}\inf_{|\eta -\eta_0| \le \epsilon_2 }| S(\theta,\eta) | > \epsilon_1 $$

Therefore, we have

$$ P\left( \left| \hat{\theta} - \theta_0 \right| > \delta \right) \le P\left( \left| S(\hat{\theta},\hat{\eta}) - S(\theta_0,\hat{\eta}) \right| > \epsilon_1 \right) + P(|\hat{\eta} - \eta_0| > \epsilon_2)$$

The last term goes to zero as $n \to \infty$ because $\hat{\eta}$ is consistent.

Then, we have

$$ \begin{align}
| S(\hat{\theta},\hat{\eta}) - S(\theta_0,\hat{\eta}) | &\le |S(\hat{\theta},\eta_0) - S(\theta_0,\eta_0)| \\
&\quad + |S(\hat{\theta},\hat{\eta}) - S(\hat{\theta},\eta_0)| + |S( \theta_0,\hat{\eta}) - S(\theta_0,\eta_0)| \\
&\le o_p(1) + 2\sup_{\theta \in \Theta}|S( \theta,\hat{\eta}) - S(\theta,\eta_0)| \\
&= o_p(1)
\end{align} $$

where the last line is true because of (2).

**Attribution**

*Source: Link, Question Author: Guillaume F., Answer Author: Guillaume F.*