Introduction
In the following five chapters you’ll learn about object-oriented programming (OOP). OOP is a little more challenging in R than in other languages because:
There are multiple OOP systems to choose from. In this book, I’ll focus on the three that I believe are most important: S3, R6, and S4. S3 and S4 are provided by base R. R6 is provided by the R6 package, and is similar to Reference Classes (RC for short) from base R.
There is disagreement about the relative importance of the OOP systems. I think S3 is most important, followed by R6, then S4. Others believe that S4 is most important, followed by RC, and that S3 should be avoided. This means that different R communities use different systems.
S3 and S4 use generic function OOP, which is rather different from the encapsulated OOP used by most languages popular today[44]. We’ll come back to precisely what those terms mean shortly, but basically, while the underlying ideas of OOP are the same across languages, their expressions are rather different. This means that you can’t immediately transfer your existing OOP skills to R.
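To make the generic function style concrete, here is a minimal S3 sketch. The `area()` generic and the `circle` and `rectangle` classes are hypothetical names invented for illustration; only `UseMethod()` and `structure()` come from base R.

```r
# Generic function OOP: methods belong to the generic, not to the object,
# and a call looks like an ordinary function call: generic(object).
area <- function(shape, ...) UseMethod("area")

# One method per class, named generic.class.
area.circle <- function(shape, ...) pi * shape$r^2
area.rectangle <- function(shape, ...) shape$w * shape$h

# Objects are plain lists tagged with a class attribute.
circle <- structure(list(r = 1), class = "circle")
rectangle <- structure(list(w = 2, h = 3), class = "rectangle")

# The generic picks the method based on the class of its first argument.
area(circle)
area(rectangle)
```

In an encapsulated system the equivalent call would typically be written as something like `circle$area()`, with the method stored inside the object or its class.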
Generally in R, functional programming is much more important than object-oriented programming, because you typically solve complex problems by decomposing them into simple functions, not simple objects. Nevertheless, there are important reasons to learn each of the three systems:
S3 allows your functions to return rich results with user-friendly display and programmer-friendly internals. S3 is used throughout base R, so it’s important to master if you want to extend base R functions to work with new types of input.
R6 provides a standardised way to escape R’s copy-on-modify semantics. This is particularly important if you want to model objects that exist independently of R. Today, a common need for R6 is to model data that comes from a web API, where changes can come from inside or outside of R.
S4 is a rigorous system that forces you to think carefully about program design. It’s particularly well-suited for building large systems that evolve over time and will receive contributions from many programmers. This is why it is used by the Bioconductor project, so another reason to learn S4 is to equip you to contribute to that project.
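The three motivations above can be sketched in base R. This is only an illustration under stated assumptions: `new_fit_summary()`, `new_counter()`, and the `Person` class are hypothetical names, and the counter uses a plain environment (the mechanism R6 is built on) rather than the R6 package itself, so that the example has no dependencies.

```r
library(methods)

# S3: a rich result with a friendly display and plain-list internals.
new_fit_summary <- function(estimate, n) {
  structure(list(estimate = estimate, n = n), class = "fit_summary")
}
print.fit_summary <- function(x, ...) {
  cat("<fit_summary> estimate =", format(x$estimate), "from", x$n, "observations\n")
  invisible(x)
}
s <- new_fit_summary(1.5, 10)
print(s)      # friendly display
s$estimate    # the internals are still an ordinary list

# Copy-on-modify: changing a copy of a vector leaves the original alone.
v <- c(1, 2)
w <- v
w[1] <- 99    # v[1] is still 1

# Reference semantics: environments are modified in place, so every
# name bound to the same environment sees the change.
new_counter <- function() {
  self <- new.env()
  self$count <- 0
  self$add <- function() {
    self$count <- self$count + 1
    invisible(self)
  }
  self
}
counter <- new_counter()
alias <- counter
alias$add()
counter$count # 1: the change made through `alias` is visible here

# S4: a formal class definition with typed slots and a validity check.
setClass("Person",
  slots = c(name = "character", age = "numeric"),
  validity = function(object) {
    if (length(object@age) != 1 || is.na(object@age) || object@age < 0) {
      "@age must be a single non-negative number"
    } else {
      TRUE
    }
  }
)
ada <- new("Person", name = "Ada", age = 36)

# Constructing an invalid object fails; capture the message instead of stopping.
bad <- tryCatch(
  new("Person", name = "Ada", age = -1),
  error = function(e) conditionMessage(e)
)
```

The S4 class rejects the invalid object at construction time, which is the kind of strictness that helps when many programmers contribute to the same code base.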
The goal of this brief introductory chapter is to give you some important vocabulary and some tools to identify OOP systems in the wild. The following chapters then dive into the details of R’s OOP systems:
Chapter 12 details the base types, which form the foundation underlying all other OO systems.
Chapter 13 introduces S3, the simplest and most commonly used OO system.
Chapter 14 discusses R6, an encapsulated OO system built on top of environments.
Chapter 15 introduces S4, which is similar to S3 but more formal and more strict.
Chapter 16 compares these three main OO systems. By understanding the trade-offs of each system, you can appreciate when to use each one.
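As a first taste of identifying OOP systems in the wild, the following rough sketch classifies an object using only base R and the methods package. The helper name `oo_kind()` is hypothetical, and the R6 check relies on the assumption that R6 objects carry `"R6"` in their class vector.

```r
library(methods)

# Rough classification of an object by the OO system it belongs to.
oo_kind <- function(x) {
  if (!is.object(x)) {
    "base"                        # no class attribute: a plain base type
  } else if (!isS4(x)) {
    if (inherits(x, "R6")) "R6" else "S3"
  } else if (is(x, "envRefClass")) {
    "RC"                          # Reference Classes are built on S4
  } else {
    "S4"
  }
}

oo_kind(1:10)    # an integer vector has no class attribute
oo_kind(mtcars)  # a data frame is an S3 object

# A small S4 class, defined here only for the demonstration.
setClass("Interval", slots = c(lo = "numeric", hi = "numeric"))
oo_kind(new("Interval", lo = 0, hi = 1))
```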
This book focusses on the mechanics of OOP, not its effective use, and it may be challenging to fully understand if you have not done object-oriented programming before. You might wonder why I chose not to provide more immediately useful coverage. I have focussed on mechanics here because they need to be well described somewhere (writing these chapters required a considerable amount of reading, exploration, and synthesis on my part), and using OOP effectively is sufficiently complex to require a book-length treatment; there’s simply not enough room in Advanced R to cover it in the depth required.
[44] The exception is Julia, which also uses generic function OOP. Compared to R, Julia’s implementation is fully developed and extremely performant.