6.5 Lazy evaluation

In R, function arguments are lazily evaluated: they’re only evaluated if accessed. For example, this code doesn’t generate an error because x is never used:

h01 <- function(x) {
  10
}
h01(stop("This is an error!"))
#> [1] 10

This is an important feature because it lets you put potentially expensive computations in function arguments, knowing that they will be evaluated only if needed.
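As a minimal sketch of that idea, the following uses two hypothetical functions, `slow_summary()` (a stand-in for an expensive computation) and `report()`; neither comes from base R. The counter `calls` is there only to make it visible whether the argument was ever evaluated.

```r
# Counts how often the "expensive" computation actually runs.
calls <- 0

# Hypothetical stand-in for an expensive computation.
slow_summary <- function() {
  calls <<- calls + 1
  "summary text"
}

# `detail` is accessed only when `verbose` is TRUE.
report <- function(label, detail, verbose = FALSE) {
  if (verbose) {
    paste0(label, ": ", detail)
  } else {
    label
  }
}

quiet <- report("run 1", slow_summary())
quiet
#> [1] "run 1"
calls
#> [1] 0

loud <- report("run 1", slow_summary(), verbose = TRUE)
loud
#> [1] "run 1: summary text"
calls
#> [1] 1
```

In the first call the promise for `detail` is never forced, so `slow_summary()` does not run at all.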

6.5.1 Promises

Lazy evaluation is powered by a data structure called a promise, or (less commonly) a thunk. It’s one of the features that make R such an interesting programming language (we’ll return to promises again in Section 20.3).

A promise has three components:

  • An expression, like x + y, which gives rise to the delayed computation.

  • An environment where the expression should be evaluated, i.e. the environment where the function is called. This makes sure that the following function returns 11, not 101:

    y <- 10
    h02 <- function(x) {
      y <- 100
      x + 1
    }
    
    h02(y)
    #> [1] 11

    This also means that when you perform an assignment inside a function call, the variable is bound outside the function, not inside it.

    h02(y <- 1000)
    #> [1] 1001
    y
    #> [1] 1000

  • A value, which is computed and cached the first time the promise is accessed, at which point the expression is evaluated in the specified environment. This ensures that the promise is evaluated at most once, and is why you only see “Calculating…” printed once in the following example.

    double <- function(x) { 
      message("Calculating...")
      x * 2
    }
    
    h03 <- function(x) {
      c(x, x)
    }
    
    h03(double(20))
    #> Calculating...
    #> [1] 40 40

You cannot manipulate promises with R code. Promises are like a quantum state: any attempt to inspect them with R code will force an immediate evaluation, making the promise disappear. Later, in Section 20.3, you’ll learn about quosures, which convert a promise into an R object whose expression and environment you can easily inspect.

6.5.2 Default arguments

Thanks to lazy evaluation, default values can be defined in terms of other arguments, or even in terms of variables defined later in the function:

h04 <- function(x = 1, y = x * 2, z = a + b) {
  a <- 10
  b <- 100
  
  c(x, y, z)
}

h04()
#> [1]   1   2 110

Many base R functions use this technique, but I don’t recommend it. It makes the code harder to understand: to predict what will be returned, you need to know the exact order in which default arguments are evaluated.
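To make the ordering issue concrete, here is a small sketch with a hypothetical function, `h_order()`, whose defaults announce when they are evaluated. The defaults run in the order in which the body first accesses them, not in the order in which they appear in the signature.

```r
# Hypothetical function: each default prints a message when it is evaluated.
h_order <- function(x = {message("evaluating x"); 1},
                    y = {message("evaluating y"); 2}) {
  y  # accessed first, so y's default is evaluated first
  x
  c(x, y)  # both promises are already cached; no further messages
}

res <- h_order()
#> evaluating y
#> evaluating x
res
#> [1] 1 2
```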

The evaluation environment is slightly different for default and user-supplied arguments, as default arguments are evaluated inside the function. This means that seemingly identical calls can yield different results. It’s easiest to see this with an extreme example:

h05 <- function(x = ls()) {
  a <- 1
  x
}

# ls() evaluated inside h05:
h05()
#> [1] "a" "x"

# ls() evaluated in global environment:
h05(ls())
#> [1] "h05"
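The same difference shows up with an ordinary variable. In this sketch, `origin` and `h_env()` are hypothetical names chosen for illustration: the default `x = origin` is looked up inside the function, while a user-supplied `origin` is looked up where the function is called.

```r
origin <- "calling environment"

h_env <- function(x = origin) {
  origin <- "function environment"
  x
}

# Default argument: `origin` is found inside h_env
from_default <- h_env()
from_default
#> [1] "function environment"

# User-supplied argument: `origin` is found where h_env is called
from_user <- h_env(origin)
from_user
#> [1] "calling environment"
```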

6.5.3 Missing arguments

To determine if an argument’s value comes from the user or from a default, you can use missing():

h06 <- function(x = 10) {
  list(missing(x), x)
}
str(h06())
#> List of 2
#>  $ : logi TRUE
#>  $ : num 10
str(h06(10))
#> List of 2
#>  $ : logi FALSE
#>  $ : num 10

missing() is best used sparingly, however. Take sample(), for example. How many arguments are required?

args(sample)
#> function (x, size, replace = FALSE, prob = NULL) 
#> NULL

It looks like both x and size are required, but if size is not supplied, sample() uses missing() to provide a default. If I were to rewrite sample(), I’d use an explicit NULL to indicate that size is not required but can be supplied:

sample <- function(x, size = NULL, replace = FALSE, prob = NULL) {
  if (is.null(size)) {
    size <- length(x)
  }
  
  x[sample.int(length(x), size, replace = replace, prob = prob)]
}

The %||% infix function, which returns its left side if it’s not NULL and its right side otherwise, lets us simplify sample() further:

`%||%` <- function(lhs, rhs) {
  if (!is.null(lhs)) {
    lhs
  } else {
    rhs
  }
}

sample <- function(x, size = NULL, replace = FALSE, prob = NULL) {
  size <- size %||% length(x)
  x[sample.int(length(x), size, replace = replace, prob = prob)]
}

Because of lazy evaluation, you don’t need to worry about unnecessary computation: the right side of %||% will only be evaluated if the left side is NULL.
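A quick way to check this is to put a call to stop() on the right side. The sketch below repeats the definition of %||% in compact form so that it runs on its own; `a` and `b` are just illustrative names.

```r
`%||%` <- function(lhs, rhs) {
  if (!is.null(lhs)) lhs else rhs
}

# The left side is not NULL, so the right side is never evaluated
# and stop() does not signal an error.
a <- 5 %||% stop("never evaluated")
a
#> [1] 5

# The left side is NULL, so the right side is evaluated and returned.
b <- NULL %||% 10
b
#> [1] 10
```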

6.5.4 Exercises

  1. What important property of && makes x_ok() work?

    x_ok <- function(x) {
      !is.null(x) && length(x) == 1 && x > 0
    }
    
    x_ok(NULL)
    #> [1] FALSE
    x_ok(1)
    #> [1] TRUE
    x_ok(1:3)
    #> [1] FALSE

    What is different about this code? Why is this behaviour undesirable here?

    x_ok <- function(x) {
      !is.null(x) & length(x) == 1 & x > 0
    }
    
    x_ok(NULL)
    #> logical(0)
    x_ok(1)
    #> [1] TRUE
    x_ok(1:3)
    #> [1] FALSE FALSE FALSE

  2. What does this function return? Why? Which principle does it illustrate?

    f2 <- function(x = z) {
      z <- 100
      x
    }
    f2()

  3. What does this function return? Why? Which principle does it illustrate?

    y <- 10
    f1 <- function(x = {y <- 1; 2}, y = 0) {
      c(x, y)
    }
    f1()
    y

  4. In hist(), the default value of xlim is range(breaks), the default value for breaks is "Sturges", and

    range("Sturges")
    #> [1] "Sturges" "Sturges"

    Explain how hist() works to get a correct xlim value.

  5. Explain why this function works. Why is it confusing?

    show_time <- function(x = stop("Error!")) {
      stop <- function(...) Sys.time()
      print(x)
    }
    show_time()
    #> [1] "2020-09-23 02:49:43 UTC"

  6. How many arguments are required when calling library()?