10 Required args shouldn’t have defaults
10.1 What’s the problem?
The absence of a default value should imply that an argument is required; the presence of a default should imply that an argument is optional.
When reading a function, it’s important to be able to tell at a glance which arguments must be supplied and which are optional. Otherwise you need to rely on the user having carefully read the documentation.
10.2 What are some examples?
In sample(), neither x nor size has a default value, suggesting that both are required, and that the function would error if you didn’t supply them. But size is optional, determined by a complex conditional.
sample(1:4)
#> [1] 2 1 4 3
sample(4)
#> [1] 1 4 2 3
rt() (draw random numbers from the t-distribution) looks like it requires the ncp parameter but it doesn’t.
download.file() looks like it requires the method argument but actually consults a global option (download.file.method) if it’s not supplied.
lm() does not have defaults for formula, data, subset, weights, na.action, or offset. Only formula is actually required, but even its absence fails to generate a clear error message:
lm()
#> Error in terms.formula(formula, data = data): argument is not a valid model
help() and vignette() have no default for their first argument, suggesting that they’re required. But they’re not: calling help() or vignette() without any arguments lists all help topics and vignettes respectively.
In diag(), the argument x has a default of 1, but it’s required: if you don’t supply it you get an error:
diag()
#> Error in diag(): argument "nrow" is missing, with no default
diag(x = 1)
#>      [,1]
#> [1,]    1
Conversely, nrow and ncol don’t have defaults but aren’t required.
In ggplot2::geom_abline(), slope and intercept don’t have defaults but are not required. If you don’t supply them they default to slope = 1 and intercept = 0, or are taken from aes() if they’re provided there.
A common warning sign is the use of missing() inside the function.
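To make the warning sign concrete, here is a minimal sketch built around two hypothetical functions. draw_hidden() has a signature that suggests size is required, while missing() quietly makes it optional; draw_explicit() states the same default in the signature. Both names are invented for illustration, and neither is how sample() is actually implemented.

```r
# Hypothetical example: `size` looks required, but missing() makes it optional.
draw_hidden <- function(x, size) {
  if (missing(size)) {
    size <- length(x)
  }
  x[sample.int(length(x), size)]
}

# The same behaviour, with the default stated in the signature.
draw_explicit <- function(x, size = length(x)) {
  x[sample.int(length(x), size)]
}

# Both calls succeed, but only the second signature tells you why.
length(draw_hidden(letters[1:4]))
#> [1] 4
length(draw_explicit(letters[1:4]))
#> [1] 4
```

With the second form, a reader can tell from the signature alone that size is optional and what it falls back to.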
10.3 What are the exceptions?
There are two exceptions to this rule:
A pair of arguments that provide an alternative specification for the same underlying concept. It is only ever possible to supply one argument.
When you can either supply one complex object, or a handful of simpler objects.
In both cases, I believe the benefits outweigh the costs of violating a standard pattern.
10.3.1 Pair of mutually exclusive arguments
A number of functions allow you to supply exactly one of two possible arguments:
read.table() allows you to supply data either with a path to a file, or inline as text.
rvest::html_node() allows you to select HTML nodes either with a css selector or an xpath expression.
forcats::fct_other() allows you to either keep or drop specified factor values.
modelr::seq_range() allows you to create a sequence over the range of x by either specifying the length of the sequence (with n) or the distance between values (with by).
If you use this technique, use xor() and missing() to check that exactly one argument is supplied:
if (!xor(missing(keep), missing(drop))) {
  stop("Must supply exactly one of `keep` and `drop`", call. = FALSE)
}
And in the documentation, make it clear that only one of the pair can be supplied:
#' @param keep,drop Pick one of `keep` and `drop`:
#'   * `keep` will preserve listed levels, replacing all others with
#'     `other_level`.
#'   * `drop` will replace listed levels with `other_level`, keeping all
#'     others as is.
This technique should only be used when there are exactly two possible arguments. If there are more than two, that is generally a sign you should create more functions. See case studies in Chapter 11 and Section 8.4.1 for examples.
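The xor() and missing() check can be sketched inside a complete function. lump_levels() below is a hypothetical, simplified stand-in written for illustration; it is not the forcats::fct_other() implementation, and it makes no attempt to preserve the original level order.

```r
# Hypothetical helper: collapse factor levels into `other_level`, using
# exactly one of `keep` or `drop` to say which levels are affected.
lump_levels <- function(x, keep, drop, other_level = "Other") {
  if (!xor(missing(keep), missing(drop))) {
    stop("Must supply exactly one of `keep` and `drop`", call. = FALSE)
  }

  x <- as.character(x)
  if (!missing(keep)) {
    # Everything not listed in `keep` is replaced.
    x[!x %in% keep] <- other_level
  } else {
    # Everything listed in `drop` is replaced.
    x[x %in% drop] <- other_level
  }
  factor(x)
}

f <- c("a", "b", "c", "a")
as.character(lump_levels(f, keep = "a"))
#> [1] "a"     "Other" "Other" "a"
as.character(lump_levels(f, drop = "a"))
#> [1] "Other" "b"     "c"     "Other"

# Supplying neither argument (or both) triggers the error from the check.
try(lump_levels(f))
```

Because xor() is TRUE only when exactly one of its inputs is TRUE, the single condition covers both failure modes: no argument supplied and both arguments supplied.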
10.3.2 One compound argument vs multiple simple arguments
A related, if less generally useful, form is to allow the user to supply either a single complex argument or several smaller arguments. For example:
stringr::str_sub(x, cbind(start, end)) is equivalent to str_sub(x, start, end).
stringr::str_replace_all(x, c(pattern = replacement)) is equivalent to str_replace_all(x, pattern, replacement).
rgb(cbind(r, g, b)) is equivalent to rgb(r, g, b) (see Chapter 17 for more details).
options(list(a = 1, b = 2)) is equivalent to options(a = 1, b = 2).
The most compelling reason to provide this sort of interface is when another function might return a complex output that you want to use as an input. For example, it seems reasonable that you should be able to feed the output of str_locate() directly into str_sub():
library(stringr)
x <- c("aaaaab", "aaab", "ccccb")
loc <- str_locate(x, "a+b")
str_sub(x, loc)
#> [1] "aaaaab" "aaab" NA
But equally, it would be weird to have to provide a matrix when subsetting with known positions:
str_sub("Hadley", cbind(2, 4))
#> [1] "adl"
So str_sub() allows either individual vectors supplied to start and end, or a two-column matrix supplied to start.
To implement this in your own functions, you should branch on the type of the first argument of the group (start, in this case):
(Why? Why not branch if the other arguments are missing? Or some combination?)
# stri_sub() comes from the stringi package
str_sub <- function(string, start = 1L, end = -1L) {
  if (is.matrix(start)) {
    if (!missing(end)) {
      stop("`end` must be missing when `start` is a matrix", call. = FALSE)
    }
    if (ncol(start) != 2) {
      stop("Matrix `start` must have exactly two columns", call. = FALSE)
    }
    stri_sub(string, from = start[, 1], to = start[, 2])
  } else {
    stri_sub(string, from = start, to = end)
  }
}
And make it clear in the documentation:
#' @param start,end Integer vectors giving the `start` (default: first)
#'   and `end` (default: last) positions, inclusively. Alternatively, you
#'   can pass a two-column matrix to `start`, i.e. `str_sub(x, start, end)`
#'   is equivalent to `str_sub(x, cbind(start, end))`.
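The same branching can be tried without stringi. The sketch below uses a hypothetical function, sub_str(), built on base R's substring(); the name is invented for illustration, and unlike str_sub() it assumes positive positions only.

```r
# Hypothetical base-R analogue: `start` is either a vector of start
# positions or a two-column matrix of (start, end) positions.
sub_str <- function(string, start = 1L, end = nchar(string)) {
  if (is.matrix(start)) {
    if (!missing(end)) {
      stop("`end` must be missing when `start` is a matrix", call. = FALSE)
    }
    if (ncol(start) != 2) {
      stop("Matrix `start` must have exactly two columns", call. = FALSE)
    }
    # Unpack the compound argument into the two simple ones.
    end <- start[, 2]
    start <- start[, 1]
  }
  substring(string, start, end)
}

sub_str("Hadley", 2, 4)
#> [1] "adl"
sub_str("Hadley", cbind(2, 4))
#> [1] "adl"
```

Unpacking the matrix at the top of the function means the rest of the body only ever deals with the simple start and end form.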