19.4 Introducing ggproto

ggplot2 has undergone a couple of rewrites during its long life. A few of these have introduced new class systems to the underlying code. While there is still a small amount of leftover from older class systems, the code has more or less coalesced around the ggproto class system introduced in ggplot2 v2.0.0. ggproto is a custom build class system made specifically for ggplot2 to facilitate portable extension classes. Like the more well-known R6 system it is a system using reference semantics, allowing inheritance and access to methods from parent classes. On top of the ggproto is a set of design principles that, while not enforced by ggproto, is essential to how the system is used in ggplot2.

19.4.1 ggproto syntax

A ggproto object is created using the ggproto() function, which takes a class name, a parent class and a range of fields and methods:

Person <- ggproto("Person", NULL,
  first = "",
  last = "",
  birthdate = NA,
  full_name = function(self) {
    paste(self$first, self$last)
  age = function(self) {
    days_old <- Sys.Date() - self$birthdate
    floor(as.integer(days_old) / 365.25)
  description = function(self) {
    paste(self$full_name(), "is", self$age(), "old")

As can be seen, fields and methods are not differentiated in the construction, and they are not treated differently from a user perspective. Methods can take a first argment self which gives the method access to its own fields and methods, but it won’t be part of the final method signature. One surprising quirk if you come from other reference based object systems in R is that ggproto() does not return a class contructor; it returns an object. New instances of the class is constructed by subclassing the object without giving a new class name:

Me <- ggproto(NULL, Person,
  first = "Thomas Lin",
  last = "Pedersen",
  birthdate = as.Date("1985/10/12")

#> [1] "Thomas Lin Pedersen is 35 old"

When subclassing and overwriting methods, the parent class and its methods are available through the ggproto_parent() function:

Police <- ggproto("Police", Person,
  description = function(self) {
      ggproto_parent(Person, self)$description()

John <- ggproto(NULL, Police,
  first = "John",
  last = "McClane",
  birthdate = as.Date("1955/03/19")

#> [1] "Detective John McClane is 65 old"

For reasons that we’ll discuss below, the use of ggproto_parent() is not that prevalent in the ggplot2 source code.

All in all ggproto is a minimal class system that is designed to accomodate ggplot2 and nothing else. It’s structure is heavily guided by the proto class system used in early versions of ggplot2 in order to reduce the required changes to the ggplot2 source code during the switch, and its features are those required by ggplot2 and nothing more.

19.4.2 ggproto style guide

While ggproto is flexible enough to be used in many ways, it is used in ggplot2 in a very delibarete way. As you are most likely to use ggproto in the context of extending ggplot2 you will need to understand these ways. ggproto classes are used selectively

The use of ggproto in ggplot2 is not all-encompassing. Only select functionality is based on ggproto and it is not expected, nor advised to create new ggproto classes to encapsulate logic in your extensions. This means that you, as an extension developer, will never create ggproto objects from scratch but rather subclass one of the main ggproto classes provided by ggplot2. Later chapters will go into detail on how exactly to do that. ggproto classes are stateless

Except for a few selected internal classes used to orchestrate the rendering, ggproto classes in ggplot2 are stateless. This means that after they are constructed they will not change. This breaks a common expectation for reference based classes where methods will alter the state of the object, but it is paramount that you adhere to this principle. If e.g. some of your Stat or Geom extensions changed state during rendering, plotting a saved ggplot object would affect all instances of that object as all copies would point to the same ggproto objects. State is imposed in two ways in ggplot2. At creation, which is ok because this state should be shared between all instances anyway, and through a params object managed elsewhere. As you’ll see later, most ggproto classes have a setup_params() method where data can be inspected and specific properties calculated and stored. ggproto classes have simple inheritance

Because ggproto class instances are stateless it is relatively safe to call methods from other classes inside a method, instead of inheriting directly from the class. Because of this it is relatively common to borrow functionality from other classes without creating an explicit inheritance. As an example, the setup_params() method in GeomErrorbar is defined as:

GeomErrorbar <- ggproto(
  # ...
  setup_params = function(data, params) {
    GeomLinerange$setup_params(data, params)
  # ...

While we have seen that parent methods can be called using ggproto_parent() this pattern is quite rare to find in the ggplot2 source code, as the pattern shown above is often clearer and just as safe.