15.3 Classes

To define an S4 class, call setClass() with three arguments:

  • The class name. By convention, S4 class names use UpperCamelCase.

  • A named character vector that describes the names and classes of the slots (fields). For example, a person might be represented by a character name and a numeric age: c(name = "character", age = "numeric"). The pseudo-class ANY allows a slot to accept objects of any type.

  • A prototype, a list of default values for each slot. Technically, the prototype is optional (see footnote 1), but you should always provide it.

The code below illustrates the three arguments by creating a Person class with character name and numeric age slots.

setClass("Person", 
  slots = c(
    name = "character", 
    age = "numeric"
  ), 
  prototype = list(
    name = NA_character_,
    age = NA_real_
  )
)

me <- new("Person", name = "Hadley")
str(me)
#> Formal class 'Person' [package ".GlobalEnv"] with 2 slots
#>   ..@ name: chr "Hadley"
#>   ..@ age : num NA
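
The ANY pseudo-class mentioned in the list above can be sketched in the same way. The Box class and its contents slot are hypothetical names used only for illustration; the default of NA is likewise an arbitrary choice.

```r
library(methods)

# Hypothetical class: a container whose single slot accepts any object.
setClass("Box",
  slots = c(contents = "ANY"),
  prototype = list(contents = NA)
)

# The same slot holds values of unrelated classes without complaint.
box_int <- new("Box", contents = 1:3)
box_chr <- new("Box", contents = "a string")

class(box_int@contents)
class(box_chr@contents)
```

Because ANY switches off the class check for that slot, any constraint on its contents would have to be enforced by other means.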

15.3.1 Inheritance

There is one other important argument to setClass(): contains. This specifies a class (or classes) to inherit slots and behaviour from. For example, we can create an Employee class that inherits from the Person class, adding an extra slot that describes their boss.

setClass("Employee", 
  contains = "Person", 
  slots = c(
    boss = "Person"
  ),
  prototype = list(
    boss = new("Person")
  )
)

str(new("Employee"))
#> Formal class 'Employee' [package ".GlobalEnv"] with 3 slots
#>   ..@ boss:Formal class 'Person' [package ".GlobalEnv"] with 2 slots
#>   .. .. ..@ name: chr NA
#>   .. .. ..@ age : num NA
#>   ..@ name: chr NA
#>   ..@ age : num NA

setClass() has 9 other arguments but they are either deprecated or not recommended.
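
As a rough sketch of what inheriting slots means in practice, the block below builds two Employee objects. The class definitions are repeated from the text so that the block runs on its own; the names "Grace", "Linus", and "Ada" are made-up values.

```r
library(methods)

# Person and Employee as defined in the text, repeated for self-containment.
setClass("Person",
  slots = c(name = "character", age = "numeric"),
  prototype = list(name = NA_character_, age = NA_real_)
)
setClass("Employee",
  contains = "Person",
  slots = c(boss = "Person"),
  prototype = list(boss = new("Person"))
)

ceo <- new("Employee", name = "Grace")
dev <- new("Employee", name = "Linus", boss = ceo)

dev@name        # slot inherited from Person
dev@boss@name   # an Employee is accepted where a Person is required

# A value that is not (and does not extend) Person is rejected by new().
res <- tryCatch(
  new("Employee", name = "Ada", boss = "nobody"),
  error = function(e) conditionMessage(e)
)
res
```

The boss slot is declared as "Person", so any object whose class extends Person, including another Employee, passes the slot check.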

15.3.2 Introspection

To determine what classes an object inherits from, use is():

is(new("Person"))
#> [1] "Person"
is(new("Employee"))
#> [1] "Employee" "Person"

To test if an object inherits from a specific class, use the second argument of is():

is(john, "Person")
#> [1] TRUE
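
is() works on objects. If you only have class names, the methods package also offers class-level queries; the sketch below uses extends(), slotNames(), and getSlots(), which go slightly beyond what the text covers. The class definitions are repeated so that the block runs on its own.

```r
library(methods)

setClass("Person",
  slots = c(name = "character", age = "numeric"),
  prototype = list(name = NA_character_, age = NA_real_)
)
setClass("Employee",
  contains = "Person",
  slots = c(boss = "Person"),
  prototype = list(boss = new("Person"))
)

# Does one class extend another? No object is needed.
emp_is_person <- extends("Employee", "Person")
person_is_emp <- extends("Person", "Employee")

# Slot names, including inherited ones, and the declared class of each slot.
emp_slots <- slotNames("Employee")
person_slot_classes <- getSlots("Person")

emp_is_person
person_is_emp
emp_slots
person_slot_classes
```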

15.3.3 Redefinition

In most programming languages, class definition occurs at compile time and object construction occurs later, at run time. In R, however, both definition and construction occur at run time. When you call setClass(), you are registering a class definition in a (hidden) global variable. As with all state-modifying functions, you need to use setClass() with care. It’s possible to create invalid objects if you redefine a class after you have already instantiated an object:

setClass("A", slots = c(x = "numeric"))
a <- new("A", x = 10)

setClass("A", slots = c(a_different_slot = "numeric"))
a
#> An object of class "A"
#> Slot "a_different_slot":
#> Error in slot(object, what): no slot of name "a_different_slot" for this object
#> of class "A"

This can cause confusion during interactive creation of new classes. (R6 classes have the same problem, as described in Section 14.2.2.)
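
One way to recover during interactive work is simply to re-create the object after redefining the class. The sketch below repeats the class A example and captures the error message instead of letting it stop the script; the variable name stale is an arbitrary choice.

```r
library(methods)

setClass("A", slots = c(x = "numeric"))
a <- new("A", x = 10)

# Redefine the class; `a` still carries the old layout.
setClass("A", slots = c(a_different_slot = "numeric"))

# Accessing the new slot on the stale object fails.
stale <- tryCatch(
  a@a_different_slot,
  error = function(e) conditionMessage(e)
)
stale

# Re-create the object from the current definition.
a <- new("A", a_different_slot = 10)
validObject(a)
a@a_different_slot
```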

15.3.4 Helper

new() is a low-level constructor suitable for use by you, the developer. User-facing classes should always be paired with a user-friendly helper. A helper should always:

  • Have the same name as the class, e.g. myclass().

  • Have a thoughtfully crafted user interface with carefully chosen default values and useful conversions.

  • Create carefully crafted error messages tailored towards an end-user.

  • Finish by calling methods::new().

The Person class is so simple that a helper is almost superfluous, but we can use it to clearly define the contract: age is optional but name is required. We’ll also coerce age to a double so that the helper works when passed an integer.

Person <- function(name, age = NA) {
  age <- as.double(age)
  new("Person", name = name, age = age)
}

Person("Hadley")
#> An object of class "Person"
#> Slot "name":
#> [1] "Hadley"
#> Slot "age":
#> [1] NA
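
The helper above does not yet produce error messages of its own. As a minimal sketch of that guideline, the variant below adds one explicit check; the wording of the message is an assumption made for illustration, not part of the original helper.

```r
library(methods)

setClass("Person",
  slots = c(name = "character", age = "numeric"),
  prototype = list(name = NA_character_, age = NA_real_)
)

# Variant of the helper with a user-facing error message (illustrative wording).
Person <- function(name, age = NA) {
  if (!is.character(name)) {
    stop(
      "`name` must be a character vector, not ", class(name)[1], ".",
      call. = FALSE
    )
  }
  age <- as.double(age)
  new("Person", name = name, age = age)
}

# Capture the message rather than stopping the script.
msg <- tryCatch(Person(42), error = function(e) conditionMessage(e))
msg

# An integer age is coerced to double before reaching new().
typeof(Person("Hadley", age = 30L)@age)
```

Checking in the helper lets the message talk about the argument the user supplied, rather than about slots and class definitions.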

15.3.5 Validator

The constructor automatically checks that the slots have correct classes:

Person(mtcars)
#> Error in validObject(.Object): invalid class "Person" object: invalid object
#> for slot "name" in class "Person": got class "data.frame", should be or extend
#> class "character"

You will need to implement more complicated checks (i.e. checks that involve lengths, or multiple slots) yourself. For example, we might want to make it clear that the Person class is a vector class, and can store data about multiple people. That’s not currently clear because @name and @age can be different lengths:

Person("Hadley", age = c(30, 37))
#> An object of class "Person"
#> Slot "name":
#> [1] "Hadley"
#> Slot "age":
#> [1] 30 37

To enforce these additional constraints we write a validator with setValidity(). It takes a class and a function that returns TRUE if the input is valid, and otherwise returns a character vector describing the problem(s):

setValidity("Person", function(object) {
  if (length(object@name) != length(object@age)) {
    "@name and @age must be same length"
  } else {
    TRUE
  }
})

Now we can no longer create an invalid object:

Person("Hadley", age = c(30, 37))
#> Error in validObject(.Object): invalid class "Person" object: @name and @age
#> must be same length
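
Because a validity function may return a character vector, it can report several problems at once. The sketch below accumulates messages before returning them; the rule that ages must not be negative is a hypothetical constraint added only to have a second check.

```r
library(methods)

setClass("Person",
  slots = c(name = "character", age = "numeric"),
  prototype = list(name = NA_character_, age = NA_real_)
)

setValidity("Person", function(object) {
  problems <- character()
  if (length(object@name) != length(object@age)) {
    problems <- c(problems, "@name and @age must be same length")
  }
  # Hypothetical extra rule, for illustration only.
  if (any(object@age < 0, na.rm = TRUE)) {
    problems <- c(problems, "@age must not contain negative values")
  }
  # TRUE signals a valid object; otherwise return every problem found.
  if (length(problems) > 0) problems else TRUE
})

# Both rules are violated here, so both messages appear in the error.
msg <- tryCatch(
  new("Person", name = "Hadley", age = c(-1, 37)),
  error = function(e) conditionMessage(e)
)
cat(msg, "\n")
```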

NB: The validity method is only called automatically by new(), so you can still create an invalid object by modifying it:

alex <- Person("Alex", age = 30)
alex@age <- 1:10

You can explicitly check the validity yourself by calling validObject():

validObject(alex)
#> Error in validObject(alex): invalid class "Person" object: @name and @age must
#> be same length
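
It may help to separate two checks here: the declared class of a slot is still checked when you assign to it, while the validity method is not run. The sketch below shows both behaviours and captures the error messages; the variable names type_msg and valid_msg are arbitrary.

```r
library(methods)

setClass("Person",
  slots = c(name = "character", age = "numeric"),
  prototype = list(name = NA_character_, age = NA_real_)
)
setValidity("Person", function(object) {
  if (length(object@name) != length(object@age)) {
    "@name and @age must be same length"
  } else {
    TRUE
  }
})

alex <- new("Person", name = "Alex", age = 30)

# Assigning a value of the wrong class to a slot is rejected immediately.
type_msg <- tryCatch(
  {
    alex@age <- "thirty"
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# A value of an acceptable class is stored even though it breaks validity.
alex@age <- 1:10

# Only an explicit validObject() call reports the problem.
valid_msg <- tryCatch(validObject(alex), error = function(e) conditionMessage(e))

type_msg
valid_msg
```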

In Section 15.4.4, we’ll use validObject() to create accessors that cannot create invalid objects.

15.3.6 Exercises

  1. Extend the Person class with fields to match utils::person(). Think about what slots you will need, what class each slot should have, and what you’ll need to check in your validity method.

  2. What happens if you define a new S4 class that doesn’t have any slots? (Hint: read about virtual classes in ?setClass.)

  3. Imagine you were going to reimplement factors, dates, and data frames in S4. Sketch out the setClass() calls that you would use to define the classes. Think about appropriate slots and prototype.

  1. ?setClass recommends that you avoid the prototype argument, but this is generally considered to be bad advice.