17.4 Code can generate code

As well as seeing the tree from code typed by a human, you can also use code to create new trees. There are two main tools: call2() and unquoting.

rlang::call2() constructs a function call from its components: the function to call, and the arguments to call it with.

call2("f", 1, 2, 3)
#> f(1, 2, 3)
call2("+", 1, call2("*", 2, 3))
#> 1 + 2 * 3

call2() is often convenient to program with, but is a bit clunky for interactive use. An alternative technique is to build complex code trees by combining simpler code trees with a template. expr() and enexpr() have built-in support for this idea via !! (pronounced bang-bang), the unquote operator.

The precise details are the topic of Section 19.4, but basically !!x inserts the code tree stored in x into the expression. This makes it easy to build complex trees from simple fragments:

xx <- expr(x + x)
yy <- expr(y + y)

expr(!!xx / !!yy)
#> (x + x)/(y + y)

Notice that the output preserves the operator precedence so we get (x + x) / (y + y) not x + x / y + y (i.e. x + (x / y) + y). This is important, particularly if you’ve been wondering if it wouldn’t be easier to just paste strings together.

Unquoting gets even more useful when you wrap it up into a function, first using enexpr() to capture the user’s expression, then expr() and !! to create a new expression using a template. The example below shows how you can generate an expression that computes the coefficient of variation:

cv <- function(var) {
  var <- enexpr(var)
  expr(sd(!!var) / mean(!!var))
}

cv(x)
#> sd(x)/mean(x)
cv(x + y)
#> sd(x + y)/mean(x + y)

(This isn’t very useful here, but being able to create this sort of building block is very useful when solving more complex problems.)

Importantly, this works even when given weird variable names:

cv(`)`)
#> sd(`)`)/mean(`)`)

Dealing with weird names58 is another good reason to avoid paste() when generating R code. You might think this is an esoteric concern, but not worrying about it when generating SQL code in web applications led to SQL injection attacks that have collectively cost billions of dollars.


  1. More technically, these are called non-syntactic names and are the topic of Section 2.2.1.↩︎