2.7 Other language engines

A less well-known fact about R Markdown is that many other languages are also supported, such as Python, Julia, C++, and SQL. The support comes from the knitr package, which provides a large number of language engines. Language engines are essentially functions registered in the object knitr::knit_engines. You can list the names of all available engines via:

names(knitr::knit_engines$get())
##  [1] "awk"         "bash"        "coffee"     
##  [4] "gawk"        "groovy"      "haskell"    
##  [7] "lein"        "mysql"       "node"       
## [10] "octave"      "perl"        "psql"       
## [13] "Rscript"     "ruby"        "sas"        
## [16] "scala"       "sed"         "sh"         
## [19] "stata"       "zsh"         "highlight"  
## [22] "Rcpp"        "tikz"        "dot"        
## [25] "c"           "cc"          "fortran"    
## [28] "fortran95"   "asy"         "cat"        
## [31] "asis"        "stan"        "block"      
## [34] "block2"      "js"          "css"        
## [37] "sql"         "go"          "python"     
## [40] "julia"       "sass"        "scss"       
## [43] "theorem"     "lemma"       "corollary"  
## [46] "proposition" "conjecture"  "definition" 
## [49] "example"     "exercise"    "proof"      
## [52] "remark"      "solution"
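
Since an engine is just an R function stored in knitr::knit_engines, you can also register one of your own. The following is a minimal sketch rather than a recommended implementation: the engine name upper is hypothetical (it is not shipped with knitr), and all it does is return the chunk body in upper case.

```r
# A hypothetical engine named "upper" (an assumption for illustration only).
# It calls no external program; it just upper-cases the chunk body.
knitr::knit_engines$set(upper = function(options) {
  # options$code holds the chunk body as a character vector, one element per line
  code <- paste(options$code, collapse = "\n")
  out <- if (options$eval) toupper(code) else ""
  # engine_output() formats the source and the output according to chunk options
  knitr::engine_output(options, code, out)
})

# The new name should now be listed among the registered engines
registered <- "upper" %in% names(knitr::knit_engines$get())
print(registered)
```

After this call, a chunk whose header is {upper} would be handled by the function above.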

Most engines have been documented in Chapter 11 of Xie (2015). The engines from theorem to solution are only available when you use the bookdown package, and the rest are shipped with the knitr package. To use a different language engine, you can change the language name in the chunk header from r to the engine name, e.g.,

```{python}
x = 'hello, python world!'
print(x.split(' '))
```

For engines that rely on external interpreters such as python, perl, and ruby, the default interpreters are obtained from Sys.which(), i.e., using the interpreter found via the environment variable PATH of the system. If you want to use an alternative interpreter, you may specify its path in the chunk option engine.path. For example, you may want to use Python 3 instead of the default Python 2, and we assume Python 3 is at /usr/bin/python3 (may not be true for your system):

```{python, engine.path = '/usr/bin/python3'}
import sys
print(sys.version)
```
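
Before overriding an interpreter, it may help to check which executable would be picked up by default. The sketch below simply calls Sys.which() for three of the engines mentioned above; the variable name found is arbitrary, and the result depends entirely on your system.

```r
# Look up the executables found on PATH for three engines.
# An empty string means no executable of that name was found.
found <- Sys.which(c("python", "perl", "ruby"))
print(found)
```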

You can also change the engine interpreters globally for multiple engines, e.g.,

knitr::opts_chunk$set(engine.path = list(
  python = '~/anaconda/bin/python',
  ruby = '/usr/local/bin/ruby'
))

Note that you can use a named list to specify the paths for different engines.

Most engines execute each code chunk in a separate new session (via a system2() call in R), which means objects created in memory in a previous code chunk are not directly available to later code chunks. For example, if you create a variable in a bash code chunk, you will not be able to use it in the next bash code chunk. Currently the only exceptions are r, python, and julia: only these engines execute code in the same session throughout the document. To clarify, all r code chunks are executed in the same R session, all python code chunks are executed in the same Python session, and so on, but the R session and the Python session are independent.4
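
The lack of a shared session can be reproduced outside knitr. The sketch below assumes a Unix-like system with bash on PATH, and demo_var is a made-up variable name; each call starts its own bash process, roughly as two separate bash chunks would.

```r
# Assumes bash is available on PATH; demo_var is a hypothetical shell variable.
# The first process sets the variable and prints it.
first <- system2("bash", c("-c", shQuote('demo_var=hello; echo "[$demo_var]"')),
                 stdout = TRUE)
# The second process is brand new, so demo_var was never defined in it.
second <- system2("bash", c("-c", shQuote('echo "[$demo_var]"')),
                  stdout = TRUE)

print(first)   # the variable is visible in the process that set it
print(second)  # empty brackets: nothing carried over
```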

I will introduce some specific features and examples for a subset of language engines in knitr below. Note that most chunk options should work for both R and other languages, such as eval and echo, so these options will not be mentioned again.

2.7.1 Python

The python engine is based on the reticulate package (Ushey, Allaire, and Tang 2020), which makes it possible to execute all Python code chunks in the same Python session. If you actually want to execute a certain code chunk in a new Python session, you may use the chunk option python.reticulate = FALSE. This requires knitr 1.18 or later; if you are using a lower version, you should update your R packages.

Below is a relatively simple example that shows how you can create/modify variables, and draw graphics in Python code chunks. Values can be passed to or retrieved from the Python session. To pass a value to Python, assign to py$name, where name is the variable name you want to use in the Python session; to retrieve a value from Python, also use py$name.

---
title: "Python code chunks in R Markdown"
date: 2018-02-22
---

## A normal R code chunk

```{r}
library(reticulate)
x = 42
print(x)
```

## Modify an R variable

In the following chunk, the value of `x` on the right hand side
is `r x`, which was defined in the previous chunk.

```{r}
x = x + 12
print(x)
```

## A Python chunk

This works fine and as expected. 

```{python}
x = 42 * 2
print(x) 
```

The value of `x` in the Python session is `r py$x`.
It is not the same `x` as the one in R.

## Modify a Python variable

```{python}
x = x + 18 
print(x)
```

Retrieve the value of `x` from the Python session again:

```{r}
py$x
```

Assign to a variable in the Python session from R:

```{r}
py$y = 1:5
```

See the value of `y` in the Python session:

```{python}
print(y)
```

## Python graphics

You can draw plots using the **matplotlib** package in Python.

```{python}
import matplotlib.pyplot as plt
plt.plot([0, 2, 1, 4])
plt.show()
```

You may learn more about the reticulate package from https://rstudio.github.io/reticulate/.

2.7.2 Shell scripts

You can also write Shell scripts in R Markdown, if your system can run them (the executable bash or sh should exist). Usually this is not a problem for Linux or macOS users. It is not impossible for Windows users to run Shell scripts, but you will have to install additional software (such as Cygwin or the Windows Subsystem for Linux).

```{bash}
echo "Hello Bash!"
cat flights1.csv flights2.csv flights3.csv > flights.csv
```

Shell scripts are executed via the system2() function in R. Basically knitr passes a code chunk to the command bash -c to run it.

2.7.3 SQL

The sql engine uses the DBI package to execute SQL queries, print their results, and optionally assign the results to a data frame.

To use the sql engine, you first need to establish a DBI connection to a database (typically via the DBI::dbConnect() function). You can make use of this connection in a sql chunk via the connection option. For example:

```{r}
library(DBI)
db = dbConnect(RSQLite::SQLite(), dbname = "sql.sqlite")
```

```{sql, connection=db}
SELECT * FROM trials
```

By default, SELECT queries will display the first 10 records of their results within the document. The number of records displayed is controlled by the max.print option, which is in turn derived from the global knitr option sql.max.print (e.g., knitr::opts_knit$set(sql.max.print = 10); N.B. it is opts_knit instead of opts_chunk). For example, the following code chunk displays the first 20 records:

```{sql, connection=db, max.print = 20}
SELECT * FROM trials
```

You can specify no limit on the records to be displayed via max.print = -1 or max.print = NA.

By default, the sql engine includes a caption that indicates the total number of records displayed. You can override this caption using the tab.cap chunk option. For example:

```{sql, connection=db, tab.cap = "My Caption"}
SELECT * FROM trials
```

You can specify that you want no caption at all via tab.cap = NA.

If you want to assign the results of the SQL query to an R object as a data frame, you can do this using the output.var option, e.g.,

```{sql, connection=db, output.var="trials"}
SELECT * FROM trials
```

When the results of a SQL query are assigned to a data frame, no records will be printed within the document (if desired, you can manually print the data frame in a subsequent R chunk).

If you need to bind the values of R variables into SQL queries, you can do so by prefacing R variable references with a ?. For example:

```{r}
subjects = 10
```

```{sql, connection=db, output.var="trials"}
SELECT * FROM trials WHERE subjects >= ?subjects
```
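
To see what this kind of binding amounts to, here is a rough standalone equivalent that uses DBI::sqlInterpolate(). It is only a sketch: the in-memory database and its rows are made up for illustration, and whether knitr performs exactly these steps internally is an assumption.

```r
library(DBI)

# Assumption: a throwaway in-memory SQLite database with made-up rows,
# standing in for the sql.sqlite file used in the text.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "trials", data.frame(id = 1:4, subjects = c(5, 10, 15, 20)))

subjects <- 10

# sqlInterpolate() replaces ?subjects with the safely quoted value of `subjects`
query <- sqlInterpolate(
  con, "SELECT * FROM trials WHERE subjects >= ?subjects",
  subjects = subjects
)
res <- dbGetQuery(con, query)
dbDisconnect(con)

print(res)
```

Letting the database interface quote the value, rather than pasting it into the query string yourself, also guards against SQL injection.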

If you have many SQL chunks, it may be helpful to set a default for the connection chunk option in the setup chunk, so that it is not necessary to specify the connection on each individual chunk. You can do this as follows:

```{r setup}
library(DBI)
db = dbConnect(RSQLite::SQLite(), dbname = "sql.sqlite")
knitr::opts_chunk$set(connection = "db")
```

Note that the connection option should be a string naming the connection object (not the object itself). Once set, you can execute SQL chunks without specifying an explicit connection:

```{sql}
SELECT * FROM trials
```

2.7.4 Rcpp

The Rcpp engine enables compilation of C++ into R functions via the Rcpp sourceCpp() function. For example:

```{Rcpp}
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector timesTwo(NumericVector x) {
  return x * 2;
}
```

Executing this chunk will compile the code and make the C++ function timesTwo() available to R.
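
Outside an R Markdown document, roughly the same effect can be obtained by calling Rcpp::sourceCpp() directly. The sketch below passes the same C++ source through the code argument and then calls the exported function; the variable name doubled is arbitrary, and a working C++ toolchain is assumed.

```r
# Compile the C++ source and export timesTwo() into the current R session.
# Assumes the Rcpp package and a C++ compiler are installed.
Rcpp::sourceCpp(code = '
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector timesTwo(NumericVector x) {
  return x * 2;
}
')

# The compiled function can now be called like any other R function
doubled <- timesTwo(c(1, 2, 3))
print(doubled)
```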

You can cache the compilation of C++ code chunks using standard knitr caching, i.e., add the cache = TRUE option to the chunk:

```{Rcpp, cache=TRUE}
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector timesTwo(NumericVector x) {
  return x * 2;
}
```

In some cases, it is desirable to combine all of the Rcpp code chunks in a document into a single compilation unit. This is especially useful when you want to intersperse narrative between pieces of C++ code (e.g., for a tutorial or user guide). It also reduces total compilation time for the document (since there is only a single invocation of the C++ compiler rather than multiple).

To combine all Rcpp chunks into a single compilation unit, you use the ref.label chunk option along with the knitr::all_rcpp_labels() function to collect all of the Rcpp chunks in the document. Here is a simple example:

All C++ code chunks will be combined to the chunk below:

```{Rcpp, ref.label=knitr::all_rcpp_labels(), include=FALSE}
```

First we include the header `Rcpp.h`:

```{Rcpp, eval=FALSE}
#include <Rcpp.h>
```

Then we define a function:

```{Rcpp, eval=FALSE}
// [[Rcpp::export]]
int timesTwo(int x) {
  return x * 2;
}
```

The two Rcpp chunks that include code will be collected and compiled together in the first Rcpp chunk via the ref.label chunk option. Note that we set the eval = FALSE option on the Rcpp chunks with code in them to prevent them from being compiled again.

2.7.5 Stan

The stan engine enables embedding of the Stan probabilistic programming language within R Markdown documents.

The Stan model within the code chunk is compiled into a stanmodel object, and is assigned to a variable with the name given by the output.var option. For example:

```{stan, output.var="ex1"}
parameters {
  real y[2];
}
model {
  y[1] ~ normal(0, 1);
  y[2] ~ double_exponential(0, 2);
}
```

```{r}
library(rstan)
fit = sampling(ex1)
print(fit)
```

2.7.6 JavaScript and CSS

If you are using an R Markdown format that targets HTML output (e.g., html_document or ioslides_presentation), you can include JavaScript to be executed within the HTML page using the JavaScript engine named js.

For example, the following chunk uses jQuery (which is included in most R Markdown HTML formats) to change the color of the document title to red:

```{js, echo=FALSE}
$('.title').css('color', 'red')
```

Similarly, you can embed CSS rules in the output document. For example, the following code chunk turns text within the document body red:

```{css, echo=FALSE}
body {
  color: red;
}
```

Without the chunk option echo = FALSE, the JavaScript/CSS code will be displayed verbatim in the output document, which is probably not what you want.

2.7.7 Julia

The Julia language is supported through the JuliaCall package (Li 2019). Similar to the python engine, the julia engine runs all Julia code chunks in the same Julia session. Below is a minimal example:

```{julia}
a = sqrt(2);  # the semicolon inhibits printing
```

2.7.8 C and Fortran

For code chunks that use C or Fortran, knitr uses R CMD SHLIB to compile the code and loads the resulting shared object (a *.so file on Unix or a *.dll file on Windows). You can then use .C() / .Fortran() to call the C / Fortran functions, e.g.,

```{c, test-c, results='hide'}
void square(double *x) {
  *x = *x * *x;
}
```

Test the `square()` function:

```{r}
.C('square', 9)
.C('square', 123)
```

You can find more examples on different language engines in the GitHub repository https://github.com/yihui/knitr-examples (look for filenames that contain the word “engine”).

References

Li, Changcheng. 2019. JuliaCall: Seamless Integration Between R and Julia. https://github.com/Non-Contradiction/JuliaCall.

Ushey, Kevin, JJ Allaire, and Yuan Tang. 2020. Reticulate: Interface to Python. https://github.com/rstudio/reticulate.

Xie, Yihui. 2015. Dynamic Documents with R and Knitr. 2nd ed. Boca Raton, Florida: Chapman and Hall/CRC. https://yihui.name/knitr/.


  4. This is not strictly true, since the Python session is actually launched from R. What I mean here is that you should not expect to use R variables and Python variables interchangeably without explicitly importing/exporting variables between the two sessions.