I’ve clustered my data, and now I want to visualise how the clusters relate to a set of features. Ideally I would create a chord diagram like the image below (source):
A chord diagram shows the relationships between the rows and columns of a matrix. E.g. in the image above, one can see that around 50% of the patients with a cough come from sub-phenotype 1 (one of the 3 clusters). The diagram is especially useful for giving a quick overview of the different clusters and how they are characterised (i.e. by which features).
However, this is currently not practical in Python, since I could not find a library that supports this kind of diagram with the numbers around the circle (see here). Are there other visualisations that convey the same information in a fundamentally different form? I’ve searched for similar visualisations but could not find anything that shows the same information.
I agree with @Nick Cox. This figure is pretty, but it doesn’t seem very good to me except as eye candy. In essence, it is a Sankey plot (a.k.a. river plot or flow diagram) with just two levels, where the ends have been bent into semicircles. If you’re married to that, I would use a Sankey plot where the ends have not been bent into semicircles, which is easier to read. You can see an example of a Sankey plot (in R) in my answer to Chart suggestions for data flow. Apparently these can be made in Python using matplotlib (it ships a `matplotlib.sankey` module).
However, I think you would do better to use a mosaic plot or a biplot from a correspondence analysis. I have an example of a mosaic plot in my answer to What’s the best way to visualize the effects of categories & their prevalence in logistic regression?, and an example of plotting the results of a correspondence analysis in my answer to Which is the best visualization for contingency tables? Both mosaic plots and correspondence analyses can also be plotted with Python.
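To make the correspondence-analysis suggestion concrete, here is a minimal sketch using only NumPy. The counts are made up for illustration (rows are hypothetical features, columns are hypothetical clusters), and the function names `correspondence_analysis` and `plot_ca` are my own, not part of any library. It takes the SVD of the standardised residuals of the contingency table and returns principal coordinates for rows and columns; treat it as one way to do this under those assumptions rather than a reference implementation.

```python
import numpy as np


def correspondence_analysis(table):
    """Simple correspondence analysis of a two-way contingency table.

    Assumes every row and column has a positive total.
    Returns (row_coords, col_coords, inertia), where the coordinates are
    principal coordinates and inertia holds the squared singular values.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                     # correspondence matrix (proportions)
    r = P.sum(axis=1)                   # row masses
    c = P.sum(axis=0)                   # column masses
    expected = np.outer(r, c)           # proportions expected under independence
    S = (P - expected) / np.sqrt(expected)  # standardised residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return row_coords, col_coords, sv ** 2


def plot_ca(row_coords, col_coords, row_labels, col_labels):
    """Scatter the first two dimensions of rows and columns together."""
    import matplotlib.pyplot as plt    # imported here so the analysis runs without it

    fig, ax = plt.subplots()
    ax.scatter(row_coords[:, 0], row_coords[:, 1], marker="o", label="features")
    ax.scatter(col_coords[:, 0], col_coords[:, 1], marker="^", label="clusters")
    for (x, y), name in zip(row_coords[:, :2], row_labels):
        ax.annotate(name, (x, y))
    for (x, y), name in zip(col_coords[:, :2], col_labels):
        ax.annotate(name, (x, y))
    ax.axhline(0, linewidth=0.5)
    ax.axvline(0, linewidth=0.5)
    ax.legend()
    return fig, ax


# Hypothetical counts: 4 features (rows) by 3 clusters (columns).
features = ["cough", "fever", "fatigue", "headache"]
clusters = ["cluster 1", "cluster 2", "cluster 3"]
counts = np.array([
    [30, 10, 20],
    [15, 25, 5],
    [10, 20, 40],
    [25, 5, 10],
])

row_coords, col_coords, inertia = correspondence_analysis(counts)
```

Calling `plot_ca(row_coords, col_coords, features, clusters)` then draws both sets of points on the same axes. Features and clusters that lie in the same direction from the origin tend to co-occur more often than independence would predict, which is roughly the information the chord diagram is trying to show.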