22.5 Non-interactive debugging

Debugging is most challenging when you can’t run code interactively, typically because it’s part of some pipeline run automatically (possibly on another computer), or because the error doesn’t occur when you run same code interactively. This can be extremely frustrating!

This section will give you some useful tools, but don’t forget the general strategy in Section 22.2. When you can’t explore interactively, it’s particularly important to spend some time making the problem as small as possible so you can iterate quickly. Sometimes callr::r(f, list(1, 2)) can be useful; this calls f(1, 2) in a fresh session, and can help to reproduce the problem.

You might also want to double check for these common issues:

Is the global environment different? Have you loaded different packages? Are objects left from previous sessions causing differences?
Is the working directory different?
Is the PATH environment variable, which determines where external commands (like git) are found, different?
Is the R_LIBS environment variable, which determines where library() looks for packages, different?

22.5.1 `dump.frames()`

dump.frames() is the equivalent to recover() for non-interactive code; it saves a last.dump.rda file in the working directory. Later, an interactive session, you can load("last.dump.rda"); debugger() to enter an interactive debugger with the same interface as recover(). This lets you “cheat”, interactively debugging code that was run non-interactively.

# In batch R process ----
dump_and_quit <- function() {
  # Save debugging info to file last.dump.rda
  dump.frames(to.file = TRUE)
  # Quit R with error status
  q(status = 1)
}
options(error = dump_and_quit)

# In a later interactive session ----
load("last.dump.rda")
debugger()

22.5.2 Print debugging

If dump.frames() doesn’t help, a good fallback is print debugging, where you insert numerous print statements to precisely locate the problem, and see the values of important variables. Print debugging is slow and primitive, but it always works, so it’s particularly useful if you can’t get a good traceback. Start by inserting coarse-grained markers, and then make them progressively more fine-grained as you determine exactly where the problem is.

f <- function(a) {
  cat("f()\n")
  g(a)
}
g <- function(b) {
  cat("g()\n")
  cat("b =", b, "\n")
  h(b)
}
h <- function(c) {
  cat("i()\n")
  i(c)
}

f(10)
#> f()
#> g()
#> b = 10 
#> i()
#> [1] 20

Print debugging is particularly useful for compiled code because it’s not uncommon for the compiler to modify your code to such an extent you can’t figure out the root problem even when inside an interactive debugger.

22.5.3 RMarkdown

Debugging code inside RMarkdown files requires some special tools. First, if you’re knitting the file using RStudio, switch to calling rmarkdown::render("path/to/file.Rmd") instead. This runs the code in the current session, which makes it easier to debug. If doing this makes the problem go away, you’ll need to figure out what makes the environments different.

If the problem persists, you’ll need to use your interactive debugging skills. Whatever method you use, you’ll need an extra step: in the error handler, you’ll need to call sink(). This removes the default sink that knitr uses to capture all output, and ensures that you can see the results in the console. For example, to use recover() with RMarkdown, you’d put the following code in your setup block:

options(error = function() {
  sink()
  recover()
})

This will generate a “no sink to remove” warning when knitr completes; you can safely ignore this warning.

If you simply want a traceback, the easiest option is to use rlang::trace_back(), taking advantage of the rlang_trace_top_env option. This ensures that you only see the traceback from your code, instead of all the functions called by RMarkdown and knitr.

options(rlang_trace_top_env = rlang::current_env())
options(error = function() {
  sink()
  print(rlang::trace_back(bottom = sys.frame(-1)), simplify = "none")
})