I recently learned about HDBSCAN (a fairly new clustering method, not yet available in scikit-learn) and am really surprised at how good it is. The following picture illustrates that HDBSCAN’s predecessor, DBSCAN, is already the only algorithm that performs perfectly on a sample of different clustering tasks:
With HDBSCAN, you do not even need to set the distance parameter of DBSCAN, making it even more intuitive. I have tried it out on a few custom clustering tasks myself, and it always performed better than any other algorithm I have tried so far.
So my question is: apart from computation time, where k-means is still superior to everything else, is there any case where k-means might be the better choice? High-dimensional data, for example, or a weird combination of clusters? I honestly can’t really think of anything…
Answer

Randomization can be valuable. You can run k-means several times to get different candidate clusterings, since not every run will be good. With HDBSCAN, you will always get the same result again.
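To make the restart idea concrete, here is a minimal sketch in plain Python. The `kmeans` helper, the toy blob data, and the choice of ten seeds are all assumptions made up for illustration; it is a bare-bones Lloyd's iteration, not any particular library's implementation.

```python
import math
import random

def kmeans(points, k, seed, iters=50):
    """Plain Lloyd's algorithm; initial centers are k random data points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: math.dist(centers[i], p))
            groups[j].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its old center).
        new_centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    # Sum of squared distances to the nearest center (lower is better).
    sse = sum(min(math.dist(c, p) ** 2 for c in centers) for p in points)
    return centers, sse

# Hypothetical toy data: three Gaussian blobs.
data_rng = random.Random(42)
blobs = [(0, 0), (5, 5), (0, 5)]
points = [(data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
          for cx, cy in blobs for _ in range(50)]

# Different seeds can end in different local optima; keep the best run.
runs = [kmeans(points, 3, seed=s) for s in range(10)]
best_centers, best_sse = min(runs, key=lambda r: r[1])
for seed, (_, sse) in enumerate(runs):
    print(f"seed={seed}  SSE={sse:.1f}")
```

Each seed is reproducible on its own, but different seeds may give different clusterings, which is exactly what lets you compare alternatives and pick one.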

Classifier: k-means yields an obvious and fast nearest-center classifier to predict the label for new objects. Correctly labeling new objects in HDBSCAN isn’t obvious.
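The nearest-center rule fits in a few lines. In this sketch the centers are hypothetical values standing in for the output of an earlier k-means run, and `predict` is just an illustrative name:

```python
import math

def predict(centers, point):
    """Return the index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(centers[i], point))

# Hypothetical centers, as if produced by an earlier k-means run.
centers = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

# New objects are labeled without re-running the clustering.
new_points = [(1.0, -0.5), (9.0, 8.5), (2.0, 7.0)]
labels = [predict(centers, p) for p in new_points]
print(labels)  # [0, 1, 2]
```

The cost per new object is one distance computation per center, independent of how many points were used to fit the clusters.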

No noise. Many users don’t (want to) know how to handle noise in their data. K-means gives a very simple and easy-to-understand result: every object belongs to exactly one cluster. With HDBSCAN, an object can belong to no cluster at all (noise), and the clusters actually form a tree rather than a flat partition.

Performance and approximation. If you have a huge dataset, you can just take a random sample for k-means, and statistics says you’ll get almost the same result. For HDBSCAN, it’s not clear how to use it with only a subset of the data.
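A rough illustration of the sampling argument, under loud assumptions: the data is synthetic 1-D data with two well-separated groups, k is fixed at 2, and `kmeans_1d` is a hypothetical helper that starts from the min and max instead of a random initialization so the comparison is deterministic.

```python
import random

def kmeans_1d(values, iters=100):
    """Lloyd's algorithm for k=2 on 1-D data, started from the min and max.

    Assumes both groups stay non-empty, which holds for the data below.
    """
    lo, hi = min(values), max(values)
    for _ in range(iters):
        # Assignment step: split the values by their nearer center.
        left = [v for v in values if abs(v - lo) <= abs(v - hi)]
        right = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Update step: each center moves to the mean of its group.
        new_lo, new_hi = sum(left) / len(left), sum(right) / len(right)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return lo, hi

# Synthetic data: two Gaussian groups around 0 and 10.
rng = random.Random(0)
data = ([rng.gauss(0, 1) for _ in range(10000)]
        + [rng.gauss(10, 1) for _ in range(10000)])

# Fit once on everything and once on a 5% random sample.
sample = rng.sample(data, 1000)
full_centers = kmeans_1d(data)
sample_centers = kmeans_1d(sample)
print("full:  ", full_centers)
print("sample:", sample_centers)
```

On data like this the two pairs of centers should land close together, because each center is just a mean and sample means converge quickly. Real data with overlapping or unbalanced clusters will be less forgiving.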
But don’t get me wrong. IMHO k-means is very limited, hard to use, and often badly applied to inappropriate problems and data. I do admire the HDBSCAN algorithm (and the original DBSCAN and OPTICS). On geo data, these just work a thousand times better than k-means. K-means is totally overused (because too many classes do not teach anything except k-means), and mini-batch k-means is the worst version of k-means: it does not make sense to use it when your data fits into memory (hence it should be removed from sklearn, IMHO).
Attribution
Source: Link, Question Author: Thomas, Answer Author: Has QUIT–AnonyMousse