For an application, I want to cluster data (potentially high-dimensional) and extract the probability of each point belonging to a cluster. At the moment I am considering self-organizing maps or kernel k-means for the job. What are the pros and cons of each algorithm for this task? Am I missing other clustering algorithms that could perform well in this case?
This has the potential to be an interesting question. Clustering algorithms perform ‘well’ or ‘not well’ depending on the topology of your data and what you are looking for in that data. What do you want the clusters to represent? I attach a diagram which sadly does not include kernel k-means or SOM, but I think it is of great value for understanding the major differences between the techniques. You probably need to ask and answer this question for yourself before you dig into weighing the “pros” and “cons”.
This is the source of the image.
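To make the point about topology concrete, here is a minimal sketch in plain Python. It is not taken from the diagram: the toy data sets, the helper names (`ring`, `kmeans`, `purity`) and the choice of initial centroids are all assumptions made for illustration. It runs ordinary (non-kernel) k-means with two clusters on two well-separated blobs and on two concentric rings, then checks how well the clusters match the groups the data was built from.

```python
import math

def ring(cx, cy, r, n):
    """n points evenly spaced on a circle of radius r centred at (cx, cy)."""
    return [(cx + r * math.cos(2 * math.pi * i / n),
             cy + r * math.sin(2 * math.pi * i / n)) for i in range(n)]

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, centroids, iters=50):
    """Plain Lloyd iterations; returns one cluster label per point."""
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new = []
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                new.append((sum(x for x, _ in members) / len(members),
                            sum(y for _, y in members) / len(members)))
            else:
                new.append(centroids[k])  # keep an empty cluster where it was
        if new == centroids:
            break
        centroids = new
    return labels

def purity(labels, truth):
    """Share of points that belong to the majority true group of their cluster."""
    total = 0
    for k in set(labels):
        group = [t for l, t in zip(labels, truth) if l == k]
        total += max(group.count(g) for g in set(group))
    return total / len(labels)

truth = [0] * 20 + [1] * 20

# Two compact, well-separated groups.
blobs = ring(0, 0, 1, 20) + ring(10, 0, 1, 20)
# Two concentric rings: same centre, different radius.
rings = ring(0, 0, 1, 20) + ring(0, 0, 5, 20)

# Assumed initialisation: first and last point of each data set.
blob_purity = purity(kmeans(blobs, [blobs[0], blobs[-1]]), truth)
ring_purity = purity(kmeans(rings, [rings[0], rings[-1]]), truth)

print("blobs:", blob_purity)
print("rings:", ring_purity)
```

With two centroids, plain k-means can only split the plane along a straight line, so it recovers the blobs but has to cut across both rings. Whether that counts as ‘not well’ depends entirely on what you want a cluster to mean, which is the question to settle first.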