6.3 Function composition
Base R provides two ways to compose multiple function calls. For example, imagine you want to compute the population standard deviation using sqrt() and mean() as building blocks:
square <- function(x) x^2
deviation <- function(x) x - mean(x)
You either nest the function calls:
x <- runif(100)

sqrt(mean(square(deviation(x))))
#> [1] 0.274
Or you save the intermediate results as variables:
out <- deviation(x)
out <- square(out)
out <- mean(out)
out <- sqrt(out)
out
#> [1] 0.274
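As a quick sanity check on the two base R styles, the sketch below confirms that they agree with each other and with the built-in sd() once it is rescaled. The fixed seed and the names nested, stepwise, and n are assumptions added for illustration; they are not part of the original example.

```r
# Building blocks from the text, repeated so the block runs on its own
square <- function(x) x^2
deviation <- function(x) x - mean(x)

set.seed(1)  # assumption: fixed seed, only to make the run repeatable
x <- runif(100)
n <- length(x)

# Style 1: nested calls
nested <- sqrt(mean(square(deviation(x))))

# Style 2: intermediate results
out <- deviation(x)
out <- square(out)
out <- mean(out)
stepwise <- sqrt(out)

# Both styles compute the same value
all.equal(nested, stepwise)

# sd() divides by n - 1, so rescale it to the population (divide by n) form
all.equal(nested, sd(x) * sqrt((n - 1) / n))
```

Both comparisons should return TRUE. The second one makes the word "population" concrete: the result differs from sd(x) only by the factor sqrt((n - 1) / n).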
The magrittr package (Bache and Wickham 2014) provides a third option: the binary operator %>%, which is called the pipe and is pronounced as “and then”.
library(magrittr)

x %>%
  deviation() %>%
  square() %>%
  mean() %>%
  sqrt()
#> [1] 0.274
x %>% f() is equivalent to f(x); x %>% f(y) is equivalent to f(x, y). The pipe allows you to focus on the high-level composition of functions rather than the low-level flow of data; the focus is on what’s being done (the verbs), rather than on what’s being modified (the nouns). This style is common in Haskell and F#, the main inspiration for magrittr, and is the default style in stack-based programming languages like Forth and Factor.
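The two equivalences can be checked directly. This is a minimal sketch that assumes magrittr is installed; the input vectors and the choice of sqrt() and round() are illustrative only.

```r
library(magrittr)

x <- c(1, 4, 9, 16)

# x %>% f() is f(x)
piped_one <- x %>% sqrt()
direct_one <- sqrt(x)
identical(piped_one, direct_one)

# x %>% f(y) is f(x, y): the left-hand side becomes the first argument
y <- c(1.234, 5.678)
piped_two <- y %>% round(1)
direct_two <- round(y, 1)
identical(piped_two, direct_two)
```

Both calls to identical() should return TRUE, since the pipe only changes how the call is written, not what is evaluated.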
Each of the three options has its own strengths and weaknesses:
Nesting, f(g(x)), is concise and well suited for short sequences. But longer sequences are hard to read because they are read inside out and right to left. As a result, arguments can get spread out over long distances, creating the Dagwood sandwich problem.
Intermediate objects, y <- f(x); g(y), requires you to name intermediate objects. This is a strength when objects are important, but a weakness when values are truly intermediate.
Piping, x %>% f() %>% g(), allows you to read code in a straightforward left-to-right fashion and doesn’t require you to name intermediate objects. But you can only use it with linear sequences of transformations of a single object. It also requires an additional third-party package and assumes that the reader understands piping.
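To make the trade-offs concrete, here is the same small computation written in all three styles. It is only a sketch: the data, the trimmed mean, and the names magnitudes and centre are hypothetical choices made for illustration, and magrittr is assumed to be installed.

```r
library(magrittr)

x <- c(-3.14159, 2.71828, -1.41421, 1.73205)

# Nesting: round() and its digits argument end up far apart,
# with the other calls sandwiched in between
nested <- round(mean(abs(x), trim = 0.25), digits = 2)

# Intermediate objects: every step needs a name
magnitudes <- abs(x)
centre <- mean(magnitudes, trim = 0.25)
stepwise <- round(centre, digits = 2)

# Piping: reads left to right, and each argument stays next to its function
piped <- x %>%
  abs() %>%
  mean(trim = 0.25) %>%
  round(digits = 2)

identical(nested, stepwise) && identical(nested, piped)
```

All three produce the same value; what differs is how far the reader's eye has to travel to match each function with its arguments.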
Most code will use a combination of all three styles. Piping is more common in data analysis code, as much of an analysis consists of a sequence of transformations of an object (like a data frame or plot). I tend to use piping infrequently in packages; not because it is a bad idea, but because it’s often a less natural fit.
Bache, Stefan Milton, and Hadley Wickham. 2014. Magrittr: A Forward-Pipe Operator for R. http://magrittr.tidyverse.org/.