It seems that for K-means and other related algorithms, clustering is based off calculating distance between points. Is there one that works without it?

**Answer**

One example of such a method are Finite Mixture Models (e.g. here or here) used for clustering. In FMM you consider the distribution ($f$) of your variable $X$ as a mixture of $K$ distributions ($f_1,…,f_k$):

$$f(x, \vartheta) = \sum^K_{k=1} \pi_k f_k(x, \vartheta_k)$$

where $\vartheta$ is a vector of parameters $\vartheta = (\pi’, \vartheta_1′, …, \vartheta_k’)’$ and $\pi_k$ is a proportion of $k$‘th distribution in the mixture and $\vartheta_k$ is a parameter (or parameters) of $f_k$ distribution.

A specific case for discrete data is Latent Class Analysis (e.g. Vermunt and Magidson, 2003) defined as:

$$P(x, k) = P(k) P(x|k)$$

where $P(k)$ is probability of observing latent class $k$ (i.e. $\pi_k$), $P(x)$ is probability of observing an $x$ value and $P(x|k)$ is probability of $x$ being in class $k$.

Usually for both FMM and LCA EM algorithm is used for estimation, but Bayesian approach is also possible, but a little bit more demanding because of problems such as model identification and label switching (e.g. Xi’an’s blog).

So there is no distance measure but rather a statistical model defining the structure (distribution) of your data. Because of that other name of this method is “model-based clustering”.

Check the two books on FMM:

- McLachlan, G. & Peel, D. (2000).
*Finite Mixture Models.*John Wiley & Sons. - Frühwirth-Schnatter, S. (2006).
*Finite Mixture and Markov Switching Models.*Springer.

One of the most popular clustering packages that uses FMM is `mclust`

(check here or here) that is implemented in R. However, more complicated FMM’s are also possible, check for example `flexmix`

package and it’s documentation. For LCA there is an R poLCA package.

**Attribution***Source : Link , Question Author : user154510 , Answer Author : Tim*