7.5 Direct labelling

The Subaru plots above provide examples of “direct labelling”, in which the labels for groups of points are placed inside the plot region itself rather than in a legend. This usually makes the plot easier to read because it puts the labels closer to the data. The broader ggplot2 ecosystem contains several tools that automate this. One of them is the directlabels package, by Toby Dylan Hocking:

ggplot(mpg, aes(displ, hwy, colour = class)) + 
  geom_point(show.legend = FALSE) +
  directlabels::geom_dl(aes(label = class), method = "smart.grid")

The directlabels package provides a number of positioning methods. The smart.grid method is a reasonable place to start for scatterplots, but other methods are more useful for frequency polygons and line plots. See the directlabels website, http://directlabels.r-forge.r-project.org, for other techniques.
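For line plots, one of those other methods places each label at the end of its line. The following is a minimal sketch, assuming the "last.points" positioning method from directlabels and the economics_long data set that ships with ggplot2. The extra room on the right of the x axis is an illustrative choice, not something the package requires:

```r
library(ggplot2)

# economics_long holds several US economic time series in long format;
# value01 is each series rescaled to the range [0, 1].
p_lines <- ggplot(economics_long, aes(date, value01, colour = variable)) +
  geom_line(show.legend = FALSE) +
  # Leave space to the right of the lines so the labels are not clipped
  # (the 0.2 multiplier is an arbitrary, illustrative amount)
  scale_x_date(expand = expansion(mult = c(0.05, 0.2))) +
  # "last.points" puts each label next to the final point of its group
  directlabels::geom_dl(aes(label = variable), method = "last.points")

p_lines
```

Because the labels sit next to the lines they describe, the legend is suppressed with show.legend = FALSE, just as in the scatterplot above.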

Another take on this idea comes from the ggforce package by Thomas Lin Pedersen (https://github.com/thomasp85/ggforce). The ggforce package contains many useful tools that extend ggplot2, including functions such as geom_mark_ellipse() that overlay a plot with elliptical “highlight” marks. For example:

ggplot(mpg, aes(displ, hwy)) +
  geom_point() + 
  ggforce::geom_mark_ellipse(aes(label = cyl, group = cyl))
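
The marks do not have to cover every group. As a sketch, assuming the filter and description aesthetics of geom_mark_ellipse() behave as described in the ggforce documentation, a single class can be singled out and annotated. The description text here is made up for illustration:

```r
library(ggplot2)

p_mark <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  ggforce::geom_mark_ellipse(
    aes(
      filter = class == "2seater",           # only mark these points
      label = class,                         # title of the annotation
      description = "Two-seater sports cars" # illustrative text
    )
  )

p_mark
```

Marking one group at a time keeps the annotation from competing with the data when the groups overlap heavily.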

A third approach to direct labelling is provided by the gghighlight package by Hiroaki Yutani (https://github.com/yutannihilation/gghighlight). It is useful in many situations for highlighting points or lines (or indeed a variety of different geoms) within a plot, particularly for longitudinal data:

data(Oxboys, package = "nlme")
ggplot(Oxboys, aes(age, height, group = Subject)) + 
  geom_line() + 
  geom_point() + 
  gghighlight::gghighlight(Subject %in% 1:3)
#> Warning: Tried to calculate with group_by(), but the calculation failed.
#> Falling back to ungrouped filter operation...

#> label_key: Subject
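
The predicate can also be an aggregate computed per group, which highlights whole lines rather than individual observations. A minimal sketch, assuming gghighlight evaluates the predicate once per Subject as its documentation describes. The 170 cm threshold and the conversion to a plain data frame are illustrative choices:

```r
library(ggplot2)

data(Oxboys, package = "nlme")

# Drop the groupedData class as a precaution so that only an ordinary
# data frame is passed on to gghighlight
oxboys <- as.data.frame(Oxboys)

p_tall <- ggplot(oxboys, aes(age, height, group = Subject, colour = Subject)) +
  geom_line() +
  # max(height) is a per-group summary, so each boy's line is either
  # highlighted in full or greyed out in full
  gghighlight::gghighlight(max(height) > 170)

p_tall
```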