2 Syntax
2.1 Object names
“There are only two hard things in Computer Science: cache invalidation and naming things.”
— Phil Karlton
Variable and function names should use only lowercase letters, numbers, and _
.
Use underscores (_
) (so called snake case) to separate words within a name.
# Good
day_one1
day_
# Bad
DayOne dayone
Base R uses dots in function names (contrib.url()
) and class names
(data.frame
), but it’s better to reserve dots exclusively for the S3 object
system. In S3, methods are given the name function.class
; if you also use
.
in function and class names, you end up with confusing methods like
as.data.frame.data.frame()
.
If you find yourself attempting to cram data into variable names (e.g. model_2018
, model_2019
, model_2020
), consider using a list or data frame instead.
Generally, variable names should be nouns and function names should be verbs. Strive for names that are concise and meaningful (this is not easy!).
# Good
day_one
# Bad
first_day_of_the_month djm1
Where possible, avoid re-using names of common functions and variables. This will cause confusion for the readers of your code.
# Bad
FALSE
T <- 10
c <- function(x) sum(x) mean <-
2.2 Spacing
2.2.1 Commas
Always put a space after a comma, never before, just like in regular English.
# Good
1]
x[,
# Bad
1]
x[,1]
x[ ,1] x[ ,
2.2.2 Parentheses
Do not put spaces inside or outside parentheses for regular function calls.
# Good
mean(x, na.rm = TRUE)
# Bad
mean (x, na.rm = TRUE)
mean( x, na.rm = TRUE )
Place a space before and after ()
when used with if
, for
, or while
.
# Good
if (debug) {
show(x)
}
# Bad
if(debug){
show(x)
}
Place a space after ()
used for function arguments:
# Good
function(x) {}
# Bad
function (x) {}
function(x){}
2.2.3 Embracing
The embracing operator, {{ }}
, should always have inner spaces to help emphasise its special behaviour:
# Good
function(data, var, by) {
max_by <-%>%
data group_by({{ by }}) %>%
summarise(maximum = max({{ var }}, na.rm = TRUE))
}
# Bad
function(data, var, by) {
max_by <-%>%
data group_by({{by}}) %>%
summarise(maximum = max({{var}}, na.rm = TRUE))
}
2.2.4 Infix operators
Most infix operators (==
, +
, -
, <-
, etc.) should always be surrounded by
spaces:
# Good
(feet * 12) + inches
height <-mean(x, na.rm = TRUE)
# Bad
*12+inches
height<-feetmean(x, na.rm=TRUE)
There are a few exceptions, which should never be surrounded by spaces:
The operators with high precedence:
::
,:::
,$
,@
,[
,[[
,^
, unary-
, unary+
, and:
.# Good sqrt(x^2 + y^2) $z df 1:10 x <- # Bad sqrt(x ^ 2 + y ^ 2) $ z df 1 : 10 x <-
Single-sided formulas when the right-hand side is a single identifier:
# Good ~foo tribble( ~col1, ~col2, "a", "b" ) # Bad ~ foo tribble( ~ col1, ~ col2, "a", "b" )
Note that single-sided formulas with a complex right-hand side do need a space:
# Good ~ .x + .y # Bad ~.x + .y
When used in tidy evaluation
!!
(bang-bang) and!!!
(bang-bang-bang) (because have precedence equivalent to unary-
/+
)# Good call(!!xyz) # Bad call(!! xyz) call( !! xyz) call(! !xyz)
The help operator
# Good package?stats ?mean # Bad package ? stats ? mean
2.2.5 Extra spaces
Adding extra spaces ok if it improves alignment of =
or <-
.
# Good
list(
total = a + b + c,
mean = (a + b + c) / n
)
# Also fine
list(
total = a + b + c,
mean = (a + b + c) / n
)
Do not add extra spaces to places where space is not usually allowed.
2.3 Function calls
2.3.1 Named arguments
A function’s arguments typically fall into two broad categories: one supplies the data to compute on; the other controls the details of computation. When you call a function, you typically omit the names of data arguments, because they are used so commonly. If you override the default value of an argument, use the full name:
# Good
mean(1:10, na.rm = TRUE)
# Bad
mean(x = 1:10, , FALSE)
mean(, TRUE, x = c(1:10, NA))
Avoid partial matching.
2.3.2 Assignment
Avoid assignment in function calls:
# Good
complicated_function()
x <-if (nzchar(x) < 1) {
# do something
}
# Bad
if (nzchar(x <- complicated_function()) < 1) {
# do something
}
The only exception is in functions that capture side-effects:
capture.output(x <- f()) output <-
2.4 Control flow
2.4.1 Code blocks
Curly braces, {}
, define the most important hierarchy of R code. To make this
hierarchy easy to see:
{
should be the last character on the line. Related code (e.g., anif
clause, a function declaration, a trailing comma, …) must be on the same line as the opening brace.The contents should be indented by two spaces.
}
should be the first character on the line.
# Good
if (y < 0 && debug) {
message("y is negative")
}
if (y == 0) {
if (x > 0) {
log(x)
else {
} message("x is negative or zero")
}else {
} ^x
y
}
test_that("call1 returns an ordered factor", {
expect_s3_class(call1(x, y), c("factor", "ordered"))
})
tryCatch(
{ scan()
x <-cat("Total: ", sum(x), "\n", sep = "")
},interrupt = function(e) {
message("Aborted by user")
}
)
# Bad
if (y < 0 && debug) {
message("Y is negative")
}
if (y == 0)
{if (x > 0) {
log(x)
else {
} message("x is negative or zero")
}else { y ^ x } }
2.4.2 Inline statements
It’s ok to drop the curly braces for very simple statements that fit on one line, as long as they don’t have side-effects.
# Good
10
y <- if (y < 20) "Too low" else "Too high" x <-
Function calls that affect control flow (like return()
, stop()
or continue
) should always go in their own {}
block:
# Good
if (y < 0) {
stop("Y is negative")
}
function(x) {
find_abs <-if (x > 0) {
return(x)
}* -1
x
}
# Bad
if (y < 0) stop("Y is negative")
if (y < 0)
stop("Y is negative")
function(x) {
find_abs <-if (x > 0) return(x)
* -1
x }
2.4.3 Implicit type coercion
Avoid implicit type coercion (e.g. from numeric to logical) in if
statements:
# Good
if (length(x) > 0) {
# do something
}
# Bad
if (length(x)) {
# do something
}
2.4.4 Switch statements
- Avoid position-based
switch()
statements (i.e. prefer names). - Each element should go on its own line.
- Elements that fall through to the following element should have a space after
=
. - Provide a fall-through error, unless you have previously validated the input.
# Good
switch(x
a = ,
b = 1,
c = 2,
stop("Unknown `x`", call. = FALSE)
)
# Bad
switch(x, a = , b = 1, c = 2)
switch(x, a =, b = 1, c = 2)
switch(y, 1, 2, 3)
2.5 Long lines
Strive to limit your code to 80 characters per line. This fits comfortably on a printed page with a reasonably sized font. If you find yourself running out of room, this is a good indication that you should encapsulate some of the work in a separate function.
If a function call is too long to fit on a single line, use one line each for
the function name, each argument, and the closing )
.
This makes the code easier to read and to change later.
# Good
do_something_very_complicated(
something = "that",
requires = many,
arguments = "some of which may be long"
)
# Bad
do_something_very_complicated("that", requires, many, arguments,
"some of which may be long"
)
As described under [Argument names], you can omit the argument names for very common arguments (i.e. for arguments that are used in almost every invocation of the function). Short unnamed arguments can also go on the same line as the function name, even if the whole function call spans multiple lines.
map(x, f,
extra_argument_a = 10,
extra_argument_b = c(1, 43, 390, 210209)
)
You may also place several arguments on the same line if they are closely
related to each other, e.g., strings in calls to paste()
or stop()
. When
building strings, where possible match one line of code to one line of output.
# Good
paste0(
"Requirement: ", requires, "\n",
"Result: ", result, "\n"
)
# Bad
paste0(
"Requirement: ", requires,
"\n", "Result: ",
"\n") result,
2.6 Semicolons
Don’t put ;
at the end of a line, and don’t use ;
to put multiple commands
on one line.
2.7 Assignment
Use <-
, not =
, for assignment.
# Good
5
x <-
# Bad
5 x =
2.8 Data
2.8.1 Character vectors
Use "
, not '
, for quoting text. The only exception is when the text already
contains double quotes and no single quotes.
# Good
"Text"
'Text with "quotes"'
'<a href="http://style.tidyverse.org">A link</a>'
# Bad
'Text'
'Text with "double" and \'single\' quotes'
2.8.2 Logical vectors
Prefer TRUE
and FALSE
over T
and F
.
2.9 Comments
Each line of a comment should begin with the comment symbol and a single space:
#
In data analysis code, use comments to record important findings and analysis decisions. If you need comments to explain what your code is doing, consider rewriting your code to be clearer. If you discover that you have more comments than code, consider switching to R Markdown.