If I have a star rating system where users can express their preference for a product or item, how can I detect statistically whether the votes are highly “divided”? That is, even if the average for a given product is 3 out of 5, how can I detect whether that is a 1-5 split versus a consensus 3, using just the data (no graphical methods)?

**Answer**

One could construct a polarization index; exactly how one defines it depends on what constitutes being more polarized (i.e. what exactly do you mean, particularly in edge cases, by more or less polarized?):

For example, if the mean is ‘4’, is a 50-50 split between ‘3’ and ‘5’ more, or less polarized than 25% ‘1’ and 75% ‘5’?

Anyway, in the absence of that kind of specific definition of what you mean, I’ll suggest a measure based on variance:

Given a particular mean, define the most polarized possible split as the one that maximizes variance*.

*(NB that would say that 25% ‘1’ and 75% ‘5’ is substantially *more* polarized than 50-50 split of ‘3’s and ‘5’s; if that doesn’t match your intuition, don’t use variance)

So this polarization index is the observed variance expressed as a proportion of the largest possible variance (*with the observed mean*).

Call the average rating $m$ ($m=\bar x$).

The maximum variance occurs when a proportion $p=\frac{m-1}{4}$ is at 5 and $1-p$ is at 1; this has a variance of

$$(m-1)(5-m) \cdot \frac{n}{n-1}.$$
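For anyone who wants to check that bound, here is a short derivation sketch. The notation $x_i$ for the individual ratings and $s^2$ for the sample variance is my own assumption, not from the answer above; it works with the divide-by-$n$ variance first and rescales at the end.

```latex
\begin{align*}
(x_i - 1)(5 - x_i) \ge 0
  &\;\Rightarrow\; x_i^2 \le 6x_i - 5
  \;\Rightarrow\; \tfrac{1}{n}\textstyle\sum_i x_i^2 \le 6m - 5, \\
\tfrac{1}{n}\textstyle\sum_i (x_i - m)^2
  &= \tfrac{1}{n}\textstyle\sum_i x_i^2 - m^2
  \le 6m - 5 - m^2 = (m-1)(5-m), \\
s^2 &= \tfrac{n}{n-1}\cdot\tfrac{1}{n}\textstyle\sum_i (x_i - m)^2
  \le (m-1)(5-m)\cdot\tfrac{n}{n-1}.
\end{align*}
```

Equality needs every rating to be either 1 or 5, and the mean constraint $5p + (1-p) = m$ then gives $p = \frac{m-1}{4}$. Note that the bound is attained exactly only when $pn$ is a whole number; otherwise the largest achievable variance is slightly smaller.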

So simply take the sample variance and divide by $(m-1)(5-m) \cdot \frac{n}{n-1}$; this gives a number between 0 (perfect agreement) and 1 (completely polarized).
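As a minimal sketch of that calculation in Python: the function name `polarization_index` and the `lo`/`hi` parameters are hypothetical names chosen for illustration, and returning 0 when every rating sits at one extreme (where the denominator is zero) is my own assumption about how to handle that edge case.

```python
from statistics import mean, variance


def polarization_index(ratings, lo=1, hi=5):
    """Sample variance divided by the largest variance possible at the observed mean."""
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    m = mean(ratings)
    # Largest possible sample variance for this mean: (m - lo)(hi - m) * n / (n - 1)
    max_var = (m - lo) * (hi - m) * n / (n - 1)
    if max_var == 0:
        # Assumption: all ratings at a single extreme counts as perfect agreement.
        return 0.0
    return variance(ratings) / max_var


# The two mean-4 examples from earlier in the answer, plus full consensus.
print(polarization_index([3, 5, 3, 5]))  # 50-50 split between 3 and 5
print(polarization_index([1, 5, 5, 5]))  # 25% at 1, 75% at 5
print(polarization_index([4, 4, 4, 4]))  # everyone rates 4
```

With these inputs the 3-and-5 split comes out at about one third, the 1-and-5 split at 1, and the consensus case at 0, which matches the earlier remark that variance treats the 1-and-5 split as substantially more polarized.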

For a number of cases where the mean rating is 4, this would give the following:

You might instead prefer *not* to compute them relative to the biggest possible variance with the same mean, but instead as a percentage of the biggest possible variance *for any mean rating*. That would involve dividing instead by $4 \cdot \frac{n}{n-1}$, and again yields a value between 0 (perfect agreement) and 1 (polarized at the extremes in a 50-50 ratio). This would yield the same relativities as the diagram above, but all the values would be 3/4 as large (that is, from left to right, top to bottom they’d be 0, 16.5%, 25%, 25%, 50% and 75%).
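A sketch of this second variant, again with hypothetical names (`global_polarization_index`, `lo`, `hi`). It assumes the divisor generalizes to the squared half-range of the scale times $\frac{n}{n-1}$, which equals $4 \cdot \frac{n}{n-1}$ for a 1-to-5 scale.

```python
from statistics import variance


def global_polarization_index(ratings, lo=1, hi=5):
    """Sample variance divided by the largest variance possible for any mean."""
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    # Largest possible sample variance overall: a 50-50 split between lo and hi.
    half_range = (hi - lo) / 2
    max_var = half_range ** 2 * n / (n - 1)
    return variance(ratings) / max_var


print(global_polarization_index([3, 5, 3, 5]))  # 50-50 split between 3 and 5
print(global_polarization_index([1, 5, 5, 5]))  # 25% at 1, 75% at 5
print(global_polarization_index([1, 5, 1, 5]))  # 50-50 split between 1 and 5
```

For the two mean-4 examples this gives 0.25 and 0.75, which is 3/4 of the same-mean index in each case, and only the 50-50 split between 1 and 5 reaches 1.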

Either of the two is a perfectly valid choice – as is any other number of alternative ways of constructing such an index.

**Attribution**

*Source: Link, Question Author: David Williams, Answer Author: Glen_b*