6.8 Function forms

To understand computations in R, two slogans are helpful:

  • Everything that exists is an object.
  • Everything that happens is a function call.

— John Chambers

While everything that happens in R is a result of a function call, not all calls look the same. Function calls come in four varieties:

  • prefix: the function name comes before its arguments, like foofy(a, b, c). These constitute of the majority of function calls in R.

  • infix: the function name comes in between its arguments, like x + y. Infix forms are used for many mathematical operators, and for user-defined functions that begin and end with %.

  • replacement: functions that replace values by assignment, like names(df) <- c("a", "b", "c"). They actually look like prefix functions.

  • special: functions like [[, if, and for. While they don’t have a consistent structure, they play important roles in R’s syntax.

While there are four forms, you actually only need one because any call can be written in prefix form. I’ll demonstrate this property, and then you’ll learn about each of the forms in turn.

6.8.1 Rewriting to prefix form

An interesting property of R is that every infix, replacement, or special form can be rewritten in prefix form. Doing so is useful because it helps you better understand the structure of the language, it gives you the real name of every function, and it allows you to modify those functions for fun and profit.

The following example shows three pairs of equivalent calls, rewriting an infix form, replacement form, and a special form into prefix form.

x + y
`+`(x, y)

names(df) <- c("x", "y", "z")
`names<-`(df, c("x", "y", "z"))

for(i in 1:10) print(i)
`for`(i, 1:10, print(i))

Suprisingly, in R, for can be called like a regular function! The same is true for basically every operation in R, which means that knowing the function name of a non-prefix function allows you to override its behaviour. For example, if you’re ever feeling particularly evil, run the following code while a friend is away from their computer. It will introduce a fun bug: 10% of the time, it will add 1 to any numeric calculation inside the parentheses.

`(` <- function(e1) {
  if (is.numeric(e1) && runif(1) < 0.1) {
    e1 + 1
  } else {
    e1
  }
}
replicate(50, (1 + 2))
#>  [1] 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
#> [39] 4 3 4 3 3 3 3 4 3 3 3 3
rm("(")

Of course, overriding built-in functions like this is a bad idea, but, as you’ll learn in Section 21.2.5, it’s possible to apply it only to selected code blocks. This provides a clean and elegant approach to writing domain specific languages and translators to other languages.

A more useful application comes up when using functional programming tools. For example, you could use lapply() to add 3 to every element of a list by first defining a function add():

add <- function(x, y) x + y
lapply(list(1:3, 4:5), add, 3)
#> [[1]]
#> [1] 4 5 6
#> 
#> [[2]]
#> [1] 7 8

But we can also get the same result simply by relying on the existing + function:

lapply(list(1:3, 4:5), `+`, 3)
#> [[1]]
#> [1] 4 5 6
#> 
#> [[2]]
#> [1] 7 8

We’ll explore this idea in detail in Section 9.

6.8.2 Prefix form

The prefix form is the most common form in R code, and indeed in the majority of programming languages. Prefix calls in R are a little special because you can specify arguments in three ways:

  • By position, like help(mean).
  • Using partial matching, like help(top = mean).
  • By name, like help(topic = mean).

As illustrated by the following chunk, arguments are matched by exact name, then with unique prefixes, and finally by position.

k01 <- function(abcdef, bcde1, bcde2) {
  list(a = abcdef, b1 = bcde1, b2 = bcde2)
}
str(k01(1, 2, 3))
#> List of 3
#>  $ a : num 1
#>  $ b1: num 2
#>  $ b2: num 3
str(k01(2, 3, abcdef = 1))
#> List of 3
#>  $ a : num 1
#>  $ b1: num 2
#>  $ b2: num 3

# Can abbreviate long argument names:
str(k01(2, 3, a = 1))
#> List of 3
#>  $ a : num 1
#>  $ b1: num 2
#>  $ b2: num 3

# But this doesn't work because abbreviation is ambiguous
str(k01(1, 3, b = 1))
#> Error in k01(1, 3, b = 1): argument 3 matches multiple formal arguments

In general, use positional matching only for the first one or two arguments; they will be the most commonly used, and most readers will know what they are. Avoid using positional matching for less commonly used arguments, and never use partial matching. Unfortunately you can’t disable partial matching, but you can turn it into a warning with the warnPartialMatchArgs option:

options(warnPartialMatchArgs = TRUE)
x <- k01(a = 1, 2, 3)
#> Warning in k01(a = 1, 2, 3): partial argument match of 'a' to 'abcdef'

6.8.3 Infix functions

Infix functions get their name from the fact the function name comes inbetween its arguments, and hence have two arguments. R comes with a number of built-in infix operators: :, ::, :::, $, @, ^, *, /, +, -, >, >=, <, <=, ==, !=, !, &, &&, |, ||, ~, <-, and <<-. You can also create your own infix functions that start and end with %. Base R uses this pattern to define %%, %*%, %/%, %in%, %o%, and %x%.

Defining your own infix function is simple. You create a two argument function and bind it to a name that starts and ends with %:

`%+%` <- function(a, b) paste0(a, b)
"new " %+% "string"
#> [1] "new string"

The names of infix functions are more flexible than regular R functions: they can contain any sequence of characters except for %. You will need to escape any special characters in the string used to define the function, but not when you call it:

`% %` <- function(a, b) paste(a, b)
`%/\\%` <- function(a, b) paste(a, b)

"a" % % "b"
#> [1] "a b"
"a" %/\% "b"
#> [1] "a b"

R’s default precedence rules mean that infix operators are composed left to right:

`%-%` <- function(a, b) paste0("(", a, " %-% ", b, ")")
"a" %-% "b" %-% "c"
#> [1] "((a %-% b) %-% c)"

There are two special infix functions that can be called with a single argument: + and -.

-1
#> [1] -1
+10
#> [1] 10

6.8.4 Replacement functions

Replacement functions act like they modify their arguments in place, and have the special name xxx<-. They must have arguments named x and value, and must return the modified object. For example, the following function modifies the second element of a vector:

`second<-` <- function(x, value) {
  x[2] <- value
  x
}

Replacement functions are used by placing the function call on the left side of <-:

x <- 1:10
second(x) <- 5L
x
#>  [1]  1  5  3  4  5  6  7  8  9 10

I say they act like they modify their arguments in place, because, as explained in Section 2.5, they actually create a modified copy. We can see that by using tracemem():

x <- 1:10
tracemem(x)
#> <0x7ffae71bd880>

second(x) <- 6L
#> tracemem[0x7ffae71bd880 -> 0x7ffae61b5480]: 
#> tracemem[0x7ffae61b5480 -> 0x7ffae73f0408]: second<- 

If your replacement function needs additional arguments, place them between x and value, and call the replacement function with additional arguments on the left:

`modify<-` <- function(x, position, value) {
  x[position] <- value
  x
}
modify(x, 1) <- 10
x
#>  [1] 10  5  3  4  5  6  7  8  9 10

When you write modify(x, 1) <- 10, behind the scenes R turns it into:

x <- `modify<-`(x, 1, 10)

Combining replacement with other functions requires more complex translation. For example:

x <- c(a = 1, b = 2, c = 3)
names(x)
#> [1] "a" "b" "c"

names(x)[2] <- "two"
names(x)
#> [1] "a"   "two" "c"

is translated into:

`*tmp*` <- x
x <- `names<-`(`*tmp*`, `[<-`(names(`*tmp*`), 2, "two"))
rm(`*tmp*`)

(Yes, it really does create a local variable named *tmp*, which is removed afterwards.)

6.8.5 Special forms

Finally, there are a bunch of language features that are usually written in special ways, but also have prefix forms. These include parentheses:

  • (x) (`(`(x))
  • {x} (`{`(x)).

The subsetting operators:

  • x[i] (`[`(x, i))
  • x[[i]] (`[[`(x, i))

And the tools of control flow:

  • if (cond) true (`if`(cond, true))
  • if (cond) true else false (`if`(cond, true, false))
  • for(var in seq) action (`for`(var, seq, action))
  • while(cond) action (`while`(cond, action))
  • repeat expr (`repeat`(expr))
  • next (`next`())
  • break (`break`())

Finally, the most complex is the function function:

  • function(arg1, arg2) {body} (`function`(alist(arg1, arg2), body, env))

Knowing the name of the function that underlies a special form is useful for getting documentation: ?( is a syntax error; ?`(` will give you the documentation for parentheses.

All special forms are implemented as primitive functions (i.e. in C); this means printing these functions is not informative:

`for`
#> .Primitive("for")

6.8.6 Exercises

  1. Rewrite the following code snippets into prefix form:

    1 + 2 + 3
    
    1 + (2 + 3)
    
    if (length(x) <= 5) x[[5]] else x[[n]]
  2. Clarify the following list of odd function calls:

    x <- sample(replace = TRUE, 20, x = c(1:10, NA))
    y <- runif(min = 0, max = 1, 20)
    cor(m = "k", y = y, u = "p", x = x)
  3. Explain why the following code fails:

    modify(get("x"), 1) <- 10
    #> Error: target of assignment expands to non-language object
  4. Create a replacement function that modifies a random location in a vector.

  5. Write your own version of + that pastes its inputs together if they are character vectors but behaves as usual otherwise. In other words, make this code work:

    1 + 2
    #> [1] 3
    
    "a" + "b"
    #> [1] "ab"
  6. Create a list of all the replacement functions found in the base package. Which ones are primitive functions? (Hint: use apropos().)

  7. What are valid names for user-created infix functions?

  8. Create an infix xor() operator.

  9. Create infix versions of the set functions intersect(), union(), and setdiff(). You might call them %n%, %u%, and %/% to match conventions from mathematics.