When do improper linear models get robustly beautiful?

Are improper linear models used in practice or are they some kind of curiosity described from time to time in scientific journals? If so, in what areas are they used? When would they be useful?

They are based on linear regression

y=a+biwixi+ε

but wj‘s are not coefficients estimated in the model, but are

  • equal for each variable wi=1 (unit-weighted regression),
  • based on correlations wi=ρ(y,xi) (Dana and Dawes, 2004),
  • chosen randomly (Dawes, 1979),
  • 1 for variables negatively related to y, 1 for variables positively related to y (Wainer, 1976).

I also saw features being z-scaled and the output being weighted using simple linear regression

y=a+bv+ε

where v=wix, and can be simply estimated using OLS regression.

References:
Dawes, Robyn M. (1979). The robust beauty of improper linear models in decision making. American Psychologist, 34, 571-582.

Graefe, A. (2015). Improving forecasts using equally weighted predictors. Journal of Business Research, 68(8), 1792-1799.

Wainer, Howard (1976). Estimating coefficients in linear models: It don’t make no nevermind. Psychological Bulletin 83(2), 213.

Dana, J. and Dawes, R.M. (2004). The Superiority of Simple Alternatives to Regression for Social Science Predictions. Journal of Educational and Behavioral Statistics, 29(3), 317-331.

Answer

In effect, it seems to me this is an assortment of assumed covariance structures. In other words, this is a type of Bayesian prior modelling.

This gains in robustness over an ordinary MLR procedure because the number of parameters (df) is reduced, and introduces inaccuracy because of augmented omitted variable bias, OVB. Because of the OVB, the slope is flattened, |ˆβ|<|β|, the coefficient of determination is reduced ˆR2<R2.

My personal experience is that the superior to the Bayesian approach is to use better modelling; transform parameters, use other norms, and/or use nonlinear methods. That is, once the physics of the problem and the methods are properly explored and co-ordinated, the F statistics, coefficient of determination, etc. improve rather than degrade.

Attribution
Source : Link , Question Author : Tim , Answer Author : Carl

Leave a Comment