6.5 Lazy evaluation
In R, function arguments are lazily evaluated: they’re only evaluated if accessed. For example, this code doesn’t generate an error because x is never used:
h01 <- function(x) {
  10
}
h01(stop("This is an error!"))
#> [1] 10
This is an important feature because it allows you to do things like include potentially expensive computations in function arguments that will only be evaluated if needed.
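As a rough illustration of this point, the sketch below passes a call to a stand-in "expensive" helper as an argument and counts how often that helper actually runs. The names `expensive_fallback()`, `value_or()`, and `calls` are hypothetical, invented for this example.

```r
calls <- 0  # counts how often the "expensive" helper actually runs

# Hypothetical stand-in for a costly computation
expensive_fallback <- function() {
  calls <<- calls + 1
  "fallback"
}

# `fallback` is only accessed, and so only evaluated, when `value` is NULL
value_or <- function(value, fallback) {
  if (is.null(value)) fallback else value
}

first <- value_or("supplied", expensive_fallback())
calls_after_first <- calls
calls_after_first
#> [1] 0

second <- value_or(NULL, expensive_fallback())
calls_after_second <- calls
calls_after_second
#> [1] 1
```

The first call never touches `fallback`, so the helper is not run at all; the second call accesses it, which triggers the evaluation.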
6.5.1 Promises
Lazy evaluation is powered by a data structure called a promise, or (less commonly) a thunk. It’s one of the features that makes R such an interesting programming language (we’ll return to promises again in Section 20.3).
A promise has three components:
An expression, like x + y, which gives rise to the delayed computation.
An environment where the expression should be evaluated, i.e. the environment where the function is called. This makes sure that the following function returns 11, not 101:
y <- 10
h02 <- function(x) {
  y <- 100
  x + 1
}
h02(y)
#> [1] 11
This also means that when you do assignment inside a call to a function, the variable is bound outside of the function, not inside of it.
h02(y <- 1000)
#> [1] 1001
y
#> [1] 1000
A value, which is computed and cached the first time a promise is accessed when the expression is evaluated in the specified environment. This ensures that the promise is evaluated at most once, and is why you only see “Calculating…” printed once in the following example.
double <- function(x) {
  message("Calculating...")
  x * 2
}
h03 <- function(x) {
  c(x, x)
}
h03(double(20))
#> Calculating...
#> [1] 40 40
You cannot manipulate promises with R code. Promises are like a quantum state: any attempt to inspect them with R code will force an immediate evaluation, making the promise disappear. Later, in Section 20.3, you’ll learn about quosures, which convert promises into an R object where you can easily inspect the expression and the environment.
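Although the promise itself can't be inspected, its caching behaviour can be observed indirectly through a side effect. The following is a minimal sketch; `tick()`, `use_thrice()`, and `evaluations` are hypothetical names made up for this illustration.

```r
evaluations <- 0  # incremented every time tick() is actually run

tick <- function() {
  evaluations <<- evaluations + 1
  evaluations
}

# Accesses its argument three times
use_thrice <- function(x) {
  c(x, x, x)
}

result <- use_thrice(tick())
result
#> [1] 1 1 1
evaluations
#> [1] 1
```

Even though `x` is accessed three times, `tick()` runs only once: the first access evaluates the expression and the later accesses reuse the cached value.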
6.5.2 Default arguments
Thanks to lazy evaluation, default values can be defined in terms of other arguments, or even in terms of variables defined later in the function:
h04 <- function(x = 1, y = x * 2, z = a + b) {
  a <- 10
  b <- 100

  c(x, y, z)
}
h04()
#> [1]   1   2 110
Many base R functions use this technique, but I don’t recommend it. It makes the code harder to understand: to predict what will be returned, you need to know the exact order in which default arguments are evaluated.
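One possible alternative, offered only as a sketch and not as a pattern prescribed by the text, is to give the dependent arguments a NULL default and compute them explicitly in the body, so that the order of evaluation is visible to the reader. The name `h04_explicit()` is hypothetical.

```r
h04_explicit <- function(x = 1, y = NULL, z = NULL) {
  a <- 10
  b <- 100

  # Derived defaults are filled in here, in a fixed and visible order
  if (is.null(y)) y <- x * 2
  if (is.null(z)) z <- a + b

  c(x, y, z)
}

h04_explicit()
#> [1]   1   2 110
h04_explicit(x = 5)
#> [1]   5  10 110
```

The trade-off is that the defaults no longer appear in the function signature, so they need to be documented separately.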
The evaluation environment is slightly different for default and user-supplied arguments, as default arguments are evaluated inside the function. This means that seemingly identical calls can yield different results. It’s easiest to see this with an extreme example:
h05 <- function(x = ls()) {
  a <- 1
  x
}
# ls() evaluated inside h05:
h05()
#> [1] "a" "x"
# ls() evaluated in global environment:
h05(ls())
#> [1] "h05"
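The same difference can be checked more directly by comparing environments rather than listing their contents. This is a small sketch; `same_env()` is a hypothetical name used only here.

```r
same_env <- function(env = environment()) {
  # In the body, environment() is the function's own execution environment
  identical(env, environment())
}

# Default argument: environment() is evaluated inside same_env()
default_result <- same_env()
default_result
#> [1] TRUE

# User-supplied argument: environment() is evaluated where the call is made
supplied_result <- same_env(environment())
supplied_result
#> [1] FALSE
```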
6.5.3 Missing arguments
To determine if an argument’s value comes from the user or from a default, you can use missing():
h06 <- function(x = 10) {
  list(missing(x), x)
}
str(h06())
#> List of 2
#> $ : logi TRUE
#> $ : num 10
str(h06(10))
#> List of 2
#> $ : logi FALSE
#> $ : num 10
missing() is best used sparingly, however. Take sample(), for example. How many arguments are required?
args(sample)
#> function (x, size, replace = FALSE, prob = NULL)
#> NULL
It looks like both x and size are required, but if size is not supplied, sample() uses missing() to provide a default. If I were to rewrite sample(), I’d use an explicit NULL to indicate that size is not required but can be supplied:
sample <- function(x, size = NULL, replace = FALSE, prob = NULL) {
  if (is.null(size)) {
    size <- length(x)
  }

  x[sample.int(length(x), size, replace = replace, prob = prob)]
}
With the binary pattern created by the %||% infix function, which uses the left side if it’s not NULL and the right side otherwise, we can further simplify sample():
`%||%` <- function(lhs, rhs) {
  if (!is.null(lhs)) {
    lhs
  } else {
    rhs
  }
}
sample <- function(x, size = NULL, replace = FALSE, prob = NULL) {
  size <- size %||% length(x)
  x[sample.int(length(x), size, replace = replace, prob = prob)]
}
Because of lazy evaluation, you don’t need to worry about unnecessary computation: the right side of %||% will only be evaluated if the left side is NULL.
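A quick way to convince yourself of this is to put an expression on the right side that would fail if it were ever evaluated. The sketch below repeats the `%||%` definition from above so that it runs on its own; `kept` and `fallback_used` are hypothetical variable names.

```r
`%||%` <- function(lhs, rhs) {
  if (!is.null(lhs)) {
    lhs
  } else {
    rhs
  }
}

# The left side is not NULL, so stop() is never evaluated and no error occurs
kept <- 5 %||% stop("never evaluated")
kept
#> [1] 5

# The left side is NULL, so the right side is evaluated
fallback_used <- NULL %||% length(letters)
fallback_used
#> [1] 26
```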
6.5.4 Exercises
What important property of && makes x_ok() work?
x_ok <- function(x) {
  !is.null(x) && length(x) == 1 && x > 0
}
x_ok(NULL)
#> [1] FALSE
x_ok(1)
#> [1] TRUE
x_ok(1:3)
#> [1] FALSE
What is different with this code? Why is this behaviour undesirable here?
x_ok <- function(x) {
  !is.null(x) & length(x) == 1 & x > 0
}
x_ok(NULL)
#> logical(0)
x_ok(1)
#> [1] TRUE
x_ok(1:3)
#> [1] FALSE FALSE FALSE
What does this function return? Why? Which principle does it illustrate?
f2 <- function(x = z) {
  z <- 100
  x
}
f2()
What does this function return? Why? Which principle does it illustrate?
y <- 10
f1 <- function(x = {y <- 1; 2}, y = 0) {
  c(x, y)
}
f1()
y
In hist(), the default value of xlim is range(breaks), the default value for breaks is "Sturges", and
range("Sturges")
#> [1] "Sturges" "Sturges"
Explain how hist() works to get a correct xlim value.
Explain why this function works. Why is it confusing?
show_time <- function(x = stop("Error!")) {
  stop <- function(...) Sys.time()
  print(x)
}
show_time()
#> [1] "2020-09-23 02:49:43 UTC"
How many arguments are required when calling library()?