James-Stein shrinkage ‘in the wild’?

I am taken by the idea of James-Stein shrinkage (i.e. that a nonlinear function of a single observation of a vector of possibly independent normals can be a better estimator of the means of the random variables, where ‘better’ is measured by squared error). However, I have never seen it in applied work. Clearly I am not well enough read. Are there any classic examples of where James-Stein has improved estimation in an applied setting? If not, is this kind of shrinkage just an intellectual curiosity?


The James-Stein estimator itself is not widely used, but it has inspired soft thresholding and hard thresholding, which are very widely used.
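To make the three rules concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not code from any of the packages mentioned in this answer: the function names are hypothetical, the noise variance `sigma2` is assumed known, shrinkage is toward zero, and the James-Stein rule shown is the positive-part variant (the shrinkage factor is clipped at zero), which needs at least three coordinates to be meaningful.

```python
import numpy as np


def james_stein(x, sigma2=1.0):
    """Positive-part James-Stein estimate of the mean, shrinking toward 0.

    Assumes x is one observation of a p-vector (p >= 3) with independent
    N(theta_i, sigma2) coordinates and known sigma2.
    """
    x = np.asarray(x, dtype=float)
    p = x.size
    norm2 = np.sum(x ** 2)
    if norm2 == 0.0:
        return np.zeros_like(x)
    # One global factor for the whole vector, clipped at zero.
    factor = max(0.0, 1.0 - (p - 2) * sigma2 / norm2)
    return factor * x


def soft_threshold(x, lam):
    """Move each coordinate toward 0 by lam; anything within lam becomes 0."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def hard_threshold(x, lam):
    """Keep coordinates with |x_i| > lam unchanged; set the rest to 0."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) > lam, x, 0.0)


if __name__ == "__main__":
    # Small Monte Carlo comparison on a made-up sparse mean vector.
    rng = np.random.default_rng(0)
    theta = np.zeros(50)
    theta[:5] = 4.0
    lam = np.sqrt(2.0 * np.log(theta.size))  # the "universal" threshold
    errs = {"raw": 0.0, "james_stein": 0.0, "soft": 0.0, "hard": 0.0}
    n_rep = 2000
    for _ in range(n_rep):
        x = theta + rng.standard_normal(theta.size)
        errs["raw"] += np.sum((x - theta) ** 2)
        errs["james_stein"] += np.sum((james_stein(x) - theta) ** 2)
        errs["soft"] += np.sum((soft_threshold(x, lam) - theta) ** 2)
        errs["hard"] += np.sum((hard_threshold(x, lam) - theta) ** 2)
    for name, total in errs.items():
        print(f"{name:12s} mean squared error: {total / n_rep:.2f}")
```

Running the script prints the average total squared error of each rule for this particular made-up mean vector; the ranking will change with the sparsity and size of the true means, so it should not be read as a general result.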

Wavelet shrinkage estimation (see the R package wavethresh) is used a lot in signal processing, and shrunken centroids (the R package pamr) are used for classification of DNA microarray data. There are many more examples of shrinkage working well in practice.

For the theory, see the section on shrinkage estimation in Candès's review (from p. 20 onward: James-Stein first, then the section after that one deals with soft and hard thresholding).


EDIT from the comments: why is James-Stein shrinkage used less than soft/hard thresholding?

James-Stein is harder to manipulate (both practically and theoretically) and harder to understand intuitively than hard thresholding, but why it is used less is a good question!
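For comparison, the three rules can be written side by side. These are the standard textbook forms; the notation is an assumption introduced here for illustration ($x \in \mathbb{R}^p$ is the observed vector with $p \ge 3$, $\sigma^2$ the known noise variance, $\lambda > 0$ a threshold) and does not come from the review cited above.

```latex
% James-Stein (shrinking toward 0): one data-dependent factor for the whole vector
\hat{\theta}^{\mathrm{JS}} = \left(1 - \frac{(p-2)\,\sigma^2}{\lVert x \rVert^2}\right) x

% Soft thresholding: applied to each coordinate separately
\hat{\theta}^{\mathrm{soft}}_i = \operatorname{sign}(x_i)\,\bigl(\lvert x_i \rvert - \lambda\bigr)_+

% Hard thresholding: keep or kill each coordinate separately
\hat{\theta}^{\mathrm{hard}}_i = x_i \,\mathbf{1}\{\lvert x_i \rvert > \lambda\}
```

The James-Stein factor depends on the whole vector through $\lVert x \rVert^2$, so every coordinate's estimate depends on all the others, whereas both thresholding rules treat each coordinate on its own.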

Source: Link, Question Author: shabbychef, Answer Author: robin girard
