# Does likelihood ratio test control for overfitting?

I have two nested logistic regression models, A and B, where A is nested under B. Let’s say B has $K$ more features than A. B has a higher log-likelihood than A. Suppose, however, that the improved likelihood of B is only due to the fact that the $K$ extra features easily overfit the data. If I apply the likelihood ratio test in my case, it suggests that the more complicated model, B, offers a significant improvement. So I think that the likelihood ratio test is flawed in such a case.

• How can we determine whether the added features cause an overfitting problem?
• Does the likelihood ratio test always return the correct answer?

Your reasoning is too pessimistic.

Given the $K$ additional features, the LR test statistic will follow an asymptotic $\chi^2$ distribution with $K$ degrees of freedom if the null is true (and other auxiliary assumptions, e.g., a suitable regression setting, weak dependence assumptions etc.), i.e., if the additional predictors in $B$ are just noise features that lead to “overfitting”.
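To make this concrete, here is a minimal sketch of the LR test in exactly the scenario you describe: model $B$ adds $K$ pure-noise features to model $A$. The data, variable names, and setup are all made up for illustration; the log-likelihoods are maximized with a generic optimizer rather than a dedicated logistic regression routine.

```python
# Sketch: LR test for nested logistic models, using only numpy/scipy.
# Model A: intercept + one real predictor; model B: A plus K noise features.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

rng = np.random.default_rng(0)
n, K = 500, 5                        # sample size and number of noise features (assumed)
x = rng.normal(size=n)               # the one predictor that actually matters
noise = rng.normal(size=(n, K))      # K pure-noise predictors
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + x))))

def max_loglik(X, y):
    """Maximized logistic log-likelihood, via numerically stable neg-log-lik."""
    nll = lambda b: np.sum(np.logaddexp(0.0, X @ b)) - y @ (X @ b)
    return -minimize(nll, np.zeros(X.shape[1]), method="BFGS").fun

XA = np.column_stack([np.ones(n), x])   # design matrix of model A
XB = np.column_stack([XA, noise])       # design matrix of model B

lr = 2 * (max_loglik(XB, y) - max_loglik(XA, y))  # LR statistic, always >= 0
p_value = chi2.sf(lr, df=K)                        # compare against chi^2_K
print(f"LR statistic = {lr:.2f}, p-value = {p_value:.3f}")
```

Since the $K$ extra columns are noise, the null holds here, and the p-value will typically not be small.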

The figure below plots the 0.95-quantiles of the $\chi^2_K$ distribution as a function of $K$, i.e. the value that the LR statistic needs to exceed to reject the null that $A$ is the “good” model. As you can see, higher and higher values of the test statistic are needed the larger the set of additional features in $B$ that “overfits” the data. So the test suitably makes it more difficult for the (inevitable) better fit (or higher log-likelihood) of the larger model to be judged “sufficiently” large to reject model $A$.
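The critical values in the figure can be reproduced directly from the $\chi^2$ quantile function:

```python
# 0.95-quantiles of chi^2_K: the hurdle the LR statistic must clear grows with K.
from scipy.stats import chi2

for K in (1, 2, 5, 10, 20):
    print(K, round(chi2.ppf(0.95, df=K), 2))
# 1 3.84
# 2 5.99
# 5 11.07
# 10 18.31
# 20 31.41
```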

Of course, for any given application of the test, you might get spurious overfitting that is so “good” that you still falsely reject the null. This “type-I” error is however inherent in any statistical test, and will occur in about 5% of the cases in which the null is true if (like in the figure) we use the 95%-quantiles of the test’s null distribution as our critical values.
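A small Monte Carlo study makes this point tangible: repeating the experiment many times with only noise features added in $B$, the LR test should reject in roughly 5% of replications. The simulation design below (sample size, number of replications, signal strength) is my own illustrative choice, not anything canonical.

```python
# Monte Carlo sketch of the type-I error rate of the LR test under the null.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

rng = np.random.default_rng(1)
n, K, reps, alpha = 200, 3, 200, 0.05   # assumed simulation settings

def max_loglik(X, y):
    nll = lambda b: np.sum(np.logaddexp(0.0, X @ b)) - y @ (X @ b)
    return -minimize(nll, np.zeros(X.shape[1]), method="BFGS").fun

rejections = 0
for _ in range(reps):
    x = rng.normal(size=n)
    y = rng.binomial(1, 1 / (1 + np.exp(-x)))           # data generated under model A
    XA = np.column_stack([np.ones(n), x])
    XB = np.column_stack([XA, rng.normal(size=(n, K))])  # B adds K noise features
    lr = 2 * (max_loglik(XB, y) - max_loglik(XA, y))
    rejections += lr > chi2.ppf(1 - alpha, df=K)         # false rejection of A

rate = rejections / reps
print(f"empirical rejection rate: {rate:.3f}")  # should be near alpha = 0.05
```

So the occasional false rejection is not a flaw of the LR test specifically; it is the test behaving exactly as its nominal level promises.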