I have a large set of country data that is crowded (as you can see below), but I need both the labels and the outliers. I also have a lot of graphs, so it would be tedious to reset the window and add false data points for the outliers.
Is there a good alternative to a scatterplot that might be better in such a situation? I would really like to do a map, but I need both values of each ordered pair shown.
A couple techniques are demonstrated in this plot I made a few months ago.
Only label the “interesting” points, and rely on a hover label for identifying other points on demand. This requires human intervention to do well, though software can come close with heuristics such as only showing labels when they can be shown without overlap.
Transform the scale, such as with logs or quantiles. The caution here is that the scale is no longer directly aligned with our perception. The viewer has to keep the transformation in mind.
Use trellising or small multiples. That is, show a series of graphs, each with a subset of the points, such as one graph for each region for your country data.
Use linked single-variable charts, such as bars or dot plots, so that the label is in the axis. It helps if you can sort by either variable interactively.
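To make the first two ideas concrete, here is a rough sketch in plain Python. It is not taken from any particular plotting package. The function names (pick_labels, quantile_ranks), the rectangular label footprint, and the sample points are all made up for illustration. The first function is one possible version of the "only show labels that fit without overlap" heuristic, placing the points farthest from the centroid first so that outliers win. The second maps values to quantile ranks, which is one way to spread out a crowded axis.

```python
import math


def pick_labels(points, min_dx, min_dy):
    """Greedy heuristic: label outliers first, skip labels that would collide.

    points: list of (name, x, y) tuples.
    min_dx, min_dy: assumed label footprint in data units (hypothetical values
    you would tune to your font size and axis ranges).
    """
    cx = sum(x for _, x, _ in points) / len(points)
    cy = sum(y for _, _, y in points) / len(points)
    # Points farthest from the centroid are treated as the most "interesting".
    by_distance = sorted(points, key=lambda p: -math.hypot(p[1] - cx, p[2] - cy))
    kept = []
    for name, x, y in by_distance:
        # Keep the label only if it is clear of every label already kept.
        if all(abs(x - kx) >= min_dx or abs(y - ky) >= min_dy for _, kx, ky in kept):
            kept.append((name, x, y))
    return [name for name, _, _ in kept]


def quantile_ranks(values):
    """Map each value to its rank scaled to [0, 1] (ties keep input order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for rank, i in enumerate(order):
        ranks[i] = rank / (len(values) - 1) if len(values) > 1 else 0.0
    return ranks


# Made-up sample: three crowded points and one outlier.
sample = [("A", 1.0, 1.0), ("B", 1.1, 1.05), ("C", 1.2, 0.95), ("D", 9.0, 8.0)]
print(pick_labels(sample, 0.5, 0.5))
print(quantile_ranks([10, 1000, 100]))
```

The unlabeled points would still be drawn. They just rely on the hover label instead of a permanent one. Quantile ranks discard the actual distances between values, so the axis should say clearly that it shows ranks.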