What summary statistics to use with categorical or qualitative variables?

Just to clarify, by summary statistics I mean the mean, median, quartile ranges, variance, and standard deviation.

When summarising a single variable that is categorical or qualitative, considering both the nominal and ordinal cases, does it make sense to find its mean, median, quartile ranges, variance, and standard deviation?

If so, is it different from summarising a continuous variable, and how?


In general, the answer is no. However, one could argue that you can take the median of ordinal data, but you will, of course, have a category as the median, not a number. The median divides the data equally: half above, half below. That requires only an ordering of the values, and an ordering is exactly what ordinal data provide.
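As a rough illustration of taking the median of ordinal data, here is a minimal Python sketch. The function name `ordinal_median`, the category labels, and the sample responses are all made up for this example. When the number of observations is even and the middle falls between two categories, this sketch simply returns the lower one; that is a convention I chose here, not a rule.

```python
from collections import Counter

def ordinal_median(values, order):
    """Return the median category of ordinal data.

    `order` lists the categories from lowest to highest.
    Hypothetical helper for illustration; returns the lower median
    when the middle falls between two categories.
    """
    if not values:
        raise ValueError("no observations")
    counts = Counter(values)
    n = len(values)
    cumulative = 0
    for category in order:
        cumulative += counts[category]
        # First category whose cumulative count reaches half the data.
        if 2 * cumulative >= n:
            return category
    raise ValueError("values contain categories missing from `order`")

# Made-up survey responses on an assumed three-level scale.
order = ["low", "medium", "high"]
responses = ["low", "medium", "high", "medium", "low", "medium", "high"]
median_category = ordinal_median(responses, order)
print(median_category)
```

Note that the result is a label, not a number, so a mean, variance, or standard deviation still cannot be computed from it.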

Further, in some cases, ordinal data can be treated as rough interval-level data. This is true when the ordinal categories are groupings of an underlying numeric variable (e.g. questions about income are often asked this way). In this case, you can compute a numeric median, and you may be able to approximate the other values, especially if the lower and upper bounds are specified: you can assume some distribution (e.g. uniform) within each category. Another case of ordinal data that can be made interval is when the levels are given numeric equivalents. For example: never (0%), sometimes (10-30%), about half the time (50%), and so on.
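To make the grouped case concrete, here is a small Python sketch that estimates the median of binned data by linear interpolation, which amounts to assuming a uniform distribution within each bin. The function name `grouped_median` and the income brackets and counts are invented for illustration, and the result is only as good as the uniform assumption.

```python
def grouped_median(bins):
    """Estimate the median from grouped data.

    `bins` is a list of (lower, upper, count) tuples sorted by lower bound.
    Hypothetical helper: assumes observations are spread uniformly
    within each bin and interpolates linearly inside the median bin.
    """
    total = sum(count for _, _, count in bins)
    half = total / 2
    cumulative = 0
    for lower, upper, count in bins:
        if count > 0 and cumulative + count >= half:
            # Fraction of the way through this bin at which half is reached.
            fraction = (half - cumulative) / count
            return lower + fraction * (upper - lower)
        cumulative += count
    raise ValueError("no observations")

# Made-up income brackets: (lower bound, upper bound, number of respondents).
income_bins = [
    (0, 20_000, 10),
    (20_000, 40_000, 30),
    (40_000, 60_000, 40),
    (60_000, 100_000, 20),
]
estimated_median = grouped_median(income_bins)
print(estimated_median)
```

The same idea needs both bounds for every bin; an open-ended top category such as "100,000 or more" would require an extra assumption about its upper limit.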

To (once again) quote David Cox:

There are no routine statistical questions, only questionable
statistical routines

Source: Link, Question Author: chutsu, Answer Author: Peter Flom
