I plan to do a simulation study in which I compare the performance of several robust correlation techniques under different distributions (skewed, with outliers, etc.). By robust, I mean the ideal case of being robust against a) skewed distributions, b) outliers, and c) heavy tails.
Along with the Pearson correlation as a baseline, I was thinking of including the following more robust measures:
- Spearman’s ρ
- Percentage bend correlation (Wilcox, 1994)
- Minimum volume ellipsoid, minimum covariance determinant
- Possibly the winsorized correlation
Of course there are many more options (especially if you include robust regression techniques as well), but I want to restrict myself to the most widely used and most promising approaches.
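To make the comparison concrete, here is a minimal sketch of what one cell of such a simulation might look like, assuming NumPy is available. It covers only Pearson, Spearman, and the winsorized correlation. The helper names, the 20% winsorizing proportion, and the contamination scheme (a few discordant points near (10, -10)) are illustrative assumptions, not a recommended design.

```python
import numpy as np

def pearson(x, y):
    """Plain product-moment correlation."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def ranks(a):
    """Ordinal ranks 1..n; adequate for continuous data without ties."""
    a = np.asarray(a, dtype=float)
    order = np.argsort(a)
    r = np.empty(len(a))
    r[order] = np.arange(1, len(a) + 1)
    return r

def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def winsorize(a, gamma=0.2):
    """Pull the g = floor(gamma * n) smallest and largest values in
    to the nearest retained order statistic."""
    a = np.asarray(a, dtype=float)
    s = np.sort(a)
    g = int(np.floor(gamma * len(a)))
    return np.clip(a, s[g], s[len(a) - g - 1])

def winsorized_corr(x, y, gamma=0.2):
    """Pearson correlation of the separately winsorized variables."""
    return pearson(winsorize(x, gamma), winsorize(y, gamma))

def simulate(n=200, rho=0.5, n_outliers=5, seed=0):
    """Bivariate normal sample with a few discordant outliers (assumed scheme)."""
    rng = np.random.default_rng(seed)
    xy = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)
    x, y = xy[:, 0].copy(), xy[:, 1].copy()
    # Small jitter keeps the contaminated points free of exact ties.
    x[:n_outliers] = 10 + rng.normal(0, 0.1, n_outliers)
    y[:n_outliers] = -10 + rng.normal(0, 0.1, n_outliers)
    return x, y

x, y = simulate()
for name, f in [("pearson", pearson), ("spearman", spearman),
                ("winsorized", winsorized_corr)]:
    print(f"{name:>10}: {f(x, y):+.3f}")
```

A full study would loop this over many seeds, sample sizes, and distributions and then summarize bias and variability per estimator; the percentage bend, MVE, and MCD estimators are left out here to keep the sketch short.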
Now I have three questions (feel free to answer only single ones):
- Are there other robust correlational methods I could/should include?
- Which robust correlation techniques are actually used in your field?
(Speaking for psychological research: except for Spearman’s ρ, I have never seen any robust correlation technique outside of a technical paper. Bootstrapping is getting more and more popular, but other robust statistics are more or less non-existent so far.)
- Are there already systematic comparisons of multiple correlation techniques that you know of?
Also feel free to comment on the list of methods given above.
Wilcox, R. R. (1994). The percentage bend correlation coefficient. Psychometrika, 59, 601-616.
Coming from a psychology perspective, Pearson’s and Spearman’s correlations do appear to be the most common. However, I think a lot of researchers in psychology apply various data manipulation procedures to the constituent variables before computing Pearson’s correlation. I imagine any examination of robustness should consider the effects of:
- transformations of one or both variables in order to make variables approximate a normal distribution
- adjustment or deletion of outliers based on a statistical rule or knowledge of problems with an observation
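The two preprocessing routes above could be added to a simulation as extra "estimators" that are just Pearson's r after a preprocessing step. The following is a rough sketch assuming NumPy; the lognormal data, the log transform, the 3-SD deletion rule, and the helper names are illustrative assumptions rather than a description of what any particular researcher does.

```python
import numpy as np

def pearson(x, y):
    """Pearson's r via NumPy's correlation matrix."""
    return float(np.corrcoef(x, y)[0, 1])

def drop_outliers(x, y, z_cut=3.0):
    """Delete pairs in which either variable lies more than
    z_cut sample SDs from its mean (an assumed, common rule of thumb)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = ((np.abs(x - x.mean()) <= z_cut * x.std(ddof=1)) &
            (np.abs(y - y.mean()) <= z_cut * y.std(ddof=1)))
    return x[keep], y[keep]

# Positively skewed data: exponentiate a bivariate normal (lognormal margins).
rng = np.random.default_rng(1)
z = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=500)
x, y = np.exp(z[:, 0]), np.exp(z[:, 1])

r_raw = pearson(x, y)                   # Pearson on the skewed scores
r_log = pearson(np.log(x), np.log(y))   # Pearson after a normalizing transform
x_t, y_t = drop_outliers(x, y)
r_trim = pearson(x_t, y_t)              # Pearson after rule-based deletion

print(f"raw:          {r_raw:+.3f}")
print(f"log:          {r_log:+.3f}")
print(f"3-SD deleted: {r_trim:+.3f} (n = {len(x_t)})")
```

Treating "transform, then Pearson" and "delete, then Pearson" as separate conditions would let them be compared directly against the robust estimators on the same simulated samples.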