Different people use OOP terms in different ways, so this section provides a quick overview of important vocabulary. The explanations are necessarily compressed, but we will come back to these ideas multiple times.
The main reason to use OOP is polymorphism (literally: many shapes). Polymorphism means that a developer can consider a function’s interface separately from its implementation, making it possible to use the same function form for different types of input. This is closely related to the idea of encapsulation: the user doesn’t need to worry about details of an object because they are encapsulated behind a standard interface.
To be concrete, polymorphism is what allows summary() to produce different outputs for numeric and factor variables:
    diamonds <- ggplot2::diamonds

    summary(diamonds$carat)
    #>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    #>    0.20    0.40    0.70    0.80    1.04    5.01

    summary(diamonds$cut)
    #>      Fair      Good Very Good   Premium     Ideal
    #>      1610      4906     12082     13791     21551
You could imagine summary() containing a series of if-else statements, but that would mean only the original author could add new implementations. An OOP system makes it possible for any developer to extend the interface with implementations for new types of input.
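As a rough illustration of that extensibility, the sketch below adds a summary() implementation for a new type without touching summary() itself. The class name "temperature" and the constructor new_temperature() are hypothetical, invented only for this example; the mechanism shown is base R's S3 system, which is one of several ways to do this.

```r
# Hypothetical class "temperature": a numeric vector tagged with a class attribute.
new_temperature <- function(x) {
  stopifnot(is.numeric(x))
  structure(x, class = "temperature")
}

# A summary() implementation for the new class.
# The existing summary() function is not modified in any way.
summary.temperature <- function(object, ...) {
  x <- unclass(object)
  c(coldest = min(x), warmest = max(x))
}

temps <- new_temperature(c(12.5, 18.0, 21.5))

# The same call form as before, now with a new type of input.
summary(temps)
```

Here summary(temps) returns a named vector holding the coldest and warmest values, while summary() continues to behave as before for numeric and factor inputs.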
To be more precise, OO systems call the type of an object its class, and an implementation for a specific class is called a method. Roughly speaking, a class defines what an object is and methods describe what that object can do. The class defines the fields, the data possessed by every instance of that class. Classes are organised in a hierarchy so that if a method does not exist for one class, its parent’s method is used, and the child is said to inherit behaviour. For example, in R, an ordered factor inherits from a regular factor, and a generalised linear model inherits from a linear model. The process of finding the correct method given a class is called method dispatch.
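The vocabulary above can be seen at work in a small sketch using base R's S3 system. The generic describe() and its methods are hypothetical names introduced only for illustration; class(), inherits(), factor(), and UseMethod() are base R functions.

```r
# An ordered factor carries two classes: its own, then its parent's.
f <- factor(c("low", "high", "low"), levels = c("low", "high"), ordered = TRUE)
class(f)
#> [1] "ordered" "factor"
inherits(f, "factor")
#> [1] TRUE

# A hypothetical generic with a method for factors and a fallback.
describe <- function(x, ...) UseMethod("describe")
describe.factor <- function(x, ...) "a factor"
describe.default <- function(x, ...) "something else"

# There is no describe.ordered(), so method dispatch moves on to the
# parent class and uses describe.factor(): the behaviour is inherited.
describe(f)
#> [1] "a factor"

# No class-specific method matches an integer vector, so the default is used.
describe(1:3)
#> [1] "something else"
```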
There are two main paradigms of object-oriented programming which differ in how methods and classes are related. In this book, we’ll borrow the terminology of Extending R (Chambers 2016) and call these paradigms encapsulated and functional:
In encapsulated OOP, methods belong to objects or classes, and method calls typically look like object.method(arg1, arg2). This is called encapsulated because the object encapsulates both data (with fields) and behaviour (with methods), and is the paradigm found in most popular languages.
In functional OOP, methods belong to generic functions, and method calls look like ordinary function calls: generic(object, arg2, arg3). This is called functional because from the outside it looks like a regular function call, and internally the components are also functions.
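To make the contrast concrete, here is a minimal sketch of the same computation written in both styles. It assumes base R's S3 system as a stand-in for functional OOP and reference classes (setRefClass() from the methods package) as a stand-in for encapsulated OOP. The names area, circle, and Circle are hypothetical. Note that R uses $ where many languages use a dot.

```r
library(methods)  # provides setRefClass()

# Functional style: the method belongs to the generic area(),
# and the call looks like an ordinary function call.
area <- function(shape, ...) UseMethod("area")
area.circle <- function(shape, ...) pi * shape$r^2

circ <- structure(list(r = 1), class = "circle")
area(circ)

# Encapsulated style: the method belongs to the class,
# and is called through the object itself.
Circle <- setRefClass(
  "Circle",
  fields = list(r = "numeric"),
  methods = list(
    area = function() pi * r^2
  )
)

circ2 <- Circle$new(r = 1)
circ2$area()
```

Both calls compute the same value; what differs is where the method lives and therefore how it is looked up.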
With this terminology in hand, we can now talk precisely about the different OO systems available in R.
Chambers, John M. 2016. Extending R. CRC Press.