# How to detect polarized user opinions (high and low star ratings)

If I have a star rating system where users can express their preference for a product or item, how can I detect statistically whether the votes are highly “divided”? That is, even if the average for a given product is 3 out of 5, how can I tell a 1-5 split from a consensus 3, using just the data (no graphical methods)?

One could construct a polarization index; exactly how one defines it depends on what constitutes being more polarized (i.e. what exactly do you mean, particularly in edge cases, by more or less polarized?):

For example, if the mean is ‘4’, is a 50-50 split between ‘3’ and ‘5’ more or less polarized than 25% ‘1’ and 75% ‘5’?

Anyway, in the absence of that kind of specific definition of what you mean, I’ll suggest a measure based on variance:

Given a particular mean, define the most polarized possible split as the one that maximizes variance*.

*(NB: this says that 25% ‘1’ and 75% ‘5’ is substantially more polarized than a 50-50 split of ‘3’s and ‘5’s; if that doesn’t match your intuition, don’t use variance)
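As a quick check of that footnote, the two splits can be compared directly. This is only an illustrative sketch: the four-vote lists below are made-up stand-ins for the two splits, and population variance is used so that the number of votes does not matter.

```python
from statistics import mean, pvariance

# Hypothetical four-vote samples standing in for the two splits (both have mean 4).
split_3_5 = [3, 3, 5, 5]  # 50-50 split between 3 and 5
split_1_5 = [1, 5, 5, 5]  # 25% at 1 and 75% at 5

mean_3_5, mean_1_5 = mean(split_3_5), mean(split_1_5)
var_3_5 = pvariance(split_3_5)  # population variance of the 3/5 split
var_1_5 = pvariance(split_1_5)  # population variance of the 1/5 split

print(mean_3_5, mean_1_5)  # both 4
print(var_3_5, var_1_5)    # 1 and 3
```

With the same mean, the second split has three times the variance of the first, which is why a variance-based measure ranks it as the more polarized one.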

So this polarization index is the observed variance expressed as a proportion of the largest possible variance for the observed mean.

Call the average rating $m$ ($m=\bar x$).

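The maximum variance occurs when a proportion $p=\frac{m-1}{4}$ of the ratings is at $5$ and the remaining $1-p$ is at $1$; this has a sample variance of
$(m-1)(5-m) \cdot \frac{n}{n-1}$, where $n$ is the number of ratings and the factor $\frac{n}{n-1}$ converts the population variance $(m-1)(5-m)$ to the usual sample variance.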
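For anyone who wants to see where that expression comes from, here is a short derivation (a sketch; $x_i$ for the individual ratings is notation introduced here). Every rating lies between $1$ and $5$, so $(x_i-1)(5-x_i)\ge 0$, i.e. $x_i^2 \le 6x_i - 5$. Averaging over the $n$ ratings gives

$$\frac{1}{n}\sum_i (x_i-m)^2 = \frac{1}{n}\sum_i x_i^2 - m^2 \le 6m - 5 - m^2 = (m-1)(5-m),$$

with equality exactly when every rating is a $1$ or a $5$. Matching the mean then fixes the proportion at $5$:

$$5p + 1\cdot(1-p) = m \;\Longrightarrow\; p = \frac{m-1}{4}.$$

Rescaling by $\frac{n}{n-1}$ gives the sample-variance version quoted above. The bound is attained exactly only when $np$ is a whole number of votes.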

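So simply take the sample variance and divide it by $(m-1)(5-m) \cdot \frac{n}{n-1}$; this gives a number between $0$ (perfect agreement) and $1$ (completely polarized). (If $m$ is exactly $1$ or $5$, every rating is identical and the divisor is zero; it is natural to call that perfect agreement and set the index to $0$.)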
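The calculation can be sketched in a few lines of Python. The function name `polarization_index` and the `lo`/`hi` parameters are my own choices for illustration, and returning 0 when every rating sits at one end of the scale (where the divisor is zero) is an assumption rather than part of the definition above.

```python
from statistics import mean, variance

def polarization_index(ratings, lo=1, hi=5):
    """Sample variance divided by the largest variance possible for the observed mean."""
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings to compute a sample variance")
    m = mean(ratings)
    # Largest possible sample variance for this mean: all votes at lo or hi.
    max_var = (m - lo) * (hi - m) * n / (n - 1)
    if max_var == 0:
        # Assumption: every vote is at lo (or every vote at hi), treated as agreement.
        return 0.0
    return variance(ratings) / max_var

print(polarization_index([4, 4, 4, 4]))  # consensus at 4: 0.0
print(polarization_index([3, 3, 5, 5]))  # 50-50 split of 3s and 5s: about 0.33
print(polarization_index([1, 5, 5, 5]))  # 25% at 1, 75% at 5: 1.0
```

All three example lists have mean 4, so they differ only in how spread out the votes are.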

For a number of cases where the mean rating is 4, this would give the following:

[Diagram: a grid of six example rating distributions with mean 4, with their index values]

You might instead prefer not to compute them relative to the biggest possible variance with the same mean, but instead as a percentage of the biggest possible variance for any mean rating. That would involve dividing instead by $4 \cdot \frac{n}{n-1}$, and again yields a value between $0$ (perfect agreement) and $1$ (polarized at the extremes in a 50-50 ratio). This would yield the same relativities as the diagram above, but all the values would be 3/4 as large (that is, from left to right, top to bottom they’d be 0, 16.5%, 25%, 25%, 50% and 75%).
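This second version is even simpler to compute, since the divisor no longer depends on the mean. A minimal sketch, again with a hypothetical function name (`overall_polarization_index`) and `lo`/`hi` parameters that I have added for illustration:

```python
from statistics import variance

def overall_polarization_index(ratings, lo=1, hi=5):
    """Sample variance divided by the largest variance possible for any mean."""
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings to compute a sample variance")
    # A 50-50 split between lo and hi has population variance (hi - lo)^2 / 4,
    # which is 4 on a 1-to-5 scale; n / (n - 1) rescales it to a sample variance.
    max_var = ((hi - lo) ** 2 / 4) * n / (n - 1)
    return variance(ratings) / max_var

print(overall_polarization_index([3, 3, 5, 5]))  # approximately 0.25
print(overall_polarization_index([1, 5, 5, 5]))  # approximately 0.75
print(overall_polarization_index([1, 1, 5, 5]))  # approximately 1.0
```

The first two lists have mean 4, and their values come out at 3/4 of what the mean-relative index gives for the same data, in line with the scaling described above.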

Either of the two is a perfectly valid choice, as are any number of alternative ways of constructing such an index.