I am taken by the idea of James-Stein shrinkage (i.e. that a nonlinear function of a single observation of a vector of possibly independent normals can be a better estimator of the means of the random variables, where ‘better’ is measured by squared error). However, I have never seen it in applied work. Clearly I am not well enough read. Are there any classic examples of where James-Stein has improved estimation in an applied setting? If not, is this kind of shrinkage just an intellectual curiosity?
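For concreteness, here is a minimal sketch of the estimator in question, assuming a known common variance and shrinkage toward zero. It uses the positive-part variant, and the function names `james_stein` and `compare_risk` are made up for illustration. The small simulation compares average squared error against simply using the observation itself.

```python
import numpy as np

def james_stein(x, sigma2=1.0):
    """Positive-part James-Stein estimate of a mean vector, shrinking toward 0.

    x is a single observation of a p-dimensional normal vector whose
    coordinates each have known variance sigma2 (an assumption of this sketch).
    The classical dominance result requires p >= 3.
    """
    x = np.asarray(x, dtype=float)
    p = x.size
    if p < 3:
        raise ValueError("James-Stein shrinkage needs at least 3 coordinates")
    norm2 = float(np.sum(x ** 2))
    if norm2 == 0.0:
        return np.zeros_like(x)
    # Shrinkage factor 1 - (p - 2) * sigma2 / ||x||^2, clipped at zero.
    factor = max(0.0, 1.0 - (p - 2) * sigma2 / norm2)
    return factor * x

def compare_risk(theta, n_rep=2000, seed=0):
    """Monte Carlo average squared error of the raw observation vs. James-Stein."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    raw_loss = 0.0
    js_loss = 0.0
    for _ in range(n_rep):
        x = theta + rng.standard_normal(theta.size)  # unit-variance noise
        raw_loss += np.sum((x - theta) ** 2)
        js_loss += np.sum((james_stein(x) - theta) ** 2)
    return raw_loss / n_rep, js_loss / n_rep
```

Calling `compare_risk(np.linspace(-1, 1, 10))` returns the two average losses. The gain from shrinking is largest when the true means lie close to the shrinkage target and fades as they move away from it.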
The James-Stein estimator itself is not widely used, but it inspired soft thresholding and hard thresholding, which are very widely used.
Wavelet shrinkage estimation (see the R package wavethresh) is used a lot in signal processing, and shrunken centroids (R package pamr) are used for classification of DNA microarray data. There are many examples of shrinkage working well in practice.
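To make the two thresholding rules concrete, here is a minimal sketch in Python with NumPy. It does not reproduce what wavethresh or pamr do internally; the function names and the threshold parameter `lam` are chosen here for illustration only.

```python
import numpy as np

def soft_threshold(x, lam):
    """Shrink every coefficient toward zero by lam; set those within lam of zero to 0."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def hard_threshold(x, lam):
    """Keep coefficients with magnitude above lam unchanged; set the rest to 0."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) > lam, x, 0.0)
```

Both rules act coordinate by coordinate, whereas James-Stein multiplies the whole vector by one data-dependent factor. Soft thresholding also biases the surviving coefficients toward zero, while hard thresholding leaves them untouched.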
For the theory, see the section on shrinkage estimation in Candès’s review (James-Stein starts on p. 20, and the section after it deals with soft and hard thresholding):
EDIT from the comments: why is JS shrinkage used less than soft/hard thresholding?
James-Stein is harder to manipulate (both practically and theoretically) and harder to understand intuitively than hard thresholding, but the question of why is a good one!