Are there any circumstances where stepwise regression should be used?

Stepwise regression has been overused in many biomedical papers in the past, but this appears to be improving with better education about its many issues. Many older reviewers, however, still ask for it. Are there any circumstances in which stepwise regression has a role and should be used?


I am not aware of situations in which stepwise regression would be the preferred approach. It may be okay (particularly in its step-down version starting from the full model) with bootstrapping of the whole stepwise process on extremely large datasets with $n \gg p$. Here $n$ is the number of observations for a continuous outcome (or the number of records with an event in survival analysis) and $p$ is the number of candidate predictors including all considered interactions. In other words, it may be okay when even small effects become very clear and it does not matter so much how you do your model building (that would mean $n$ exceeding $p$ by substantially more than the sometimes quoted factor of 20).

Of course, the reasons most people are tempted to do something like stepwise regression are

  1. because it is not computationally intensive (if you do not do the proper bootstrapping, but then your results are pretty unreliable),
  2. because it provides clear-cut “is in the model” versus “is not in the model” statements (which are very unreliable in standard stepwise regression; proper bootstrapping will usually reveal this, so that these statements end up much less clear-cut) and
  3. because often $n$ is smaller than, close to, or just a bit larger than $p$.

That is, a method like stepwise regression would (if it had good operating characteristics) be especially attractive in exactly those situations in which it does not have good operating characteristics.
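To make "bootstrapping of the whole stepwise process" concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the data are simulated, the step-down rule uses AIC for an ordinary least-squares fit (one possible stopping rule among several, chosen here to keep the code short), and the function names `aic_ols`, `backward_eliminate` and `bootstrap_inclusion` are made up for this example. The point is that the entire selection is rerun on every bootstrap resample, and the output is an inclusion frequency per candidate predictor rather than a single in/out verdict.

```python
import numpy as np

def aic_ols(X, y, cols):
    """AIC (up to an additive constant) of an OLS fit with intercept on columns `cols`."""
    n = len(y)
    design = np.column_stack([np.ones(n), X[:, np.asarray(cols, dtype=int)]])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    return n * np.log(rss / n) + 2 * design.shape[1]

def backward_eliminate(X, y):
    """Step-down selection: start from the full model, drop one predictor at a
    time as long as the best single removal lowers the AIC."""
    selected = list(range(X.shape[1]))
    current = aic_ols(X, y, selected)
    while selected:
        trials = [(aic_ols(X, y, [j for j in selected if j != k]), k)
                  for k in selected]
        best_aic, drop = min(trials)
        if best_aic >= current:  # no removal improves the AIC: stop
            break
        selected.remove(drop)
        current = best_aic
    return selected

def bootstrap_inclusion(X, y, n_boot=200, seed=0):
    """Rerun the whole selection on each bootstrap resample and return the
    fraction of resamples in which each predictor was kept."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        for j in backward_eliminate(X[idx], y[idx]):
            counts[j] += 1
    return counts / n_boot

# Simulated data (assumption): 60 observations, 8 candidate predictors,
# of which only the first two have a non-zero effect.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 8))
y = 1.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=60)

freq = bootstrap_inclusion(X, y)
for j, f in enumerate(freq):
    print(f"predictor {j}: kept in {f:.0%} of bootstrap resamples")
```

If the inclusion frequencies of several predictors sit well away from both 0% and 100%, the "is in the model" statement from a single stepwise run on the original data should not be trusted much.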

Source : Link , Question Author : bobmcpop , Answer Author : Björn
