2.6 Unbinding and the garbage collector

Consider this code:

x <- 1:3

x <- 2:4

rm(x)

We created two objects, but by the time the code finishes, neither object is bound to a name. How do these objects get deleted? That’s the job of the garbage collector, or GC for short. The GC frees up memory by deleting R objects that are no longer used, and it requests more memory from the operating system if needed.

R uses a tracing GC. This means it traces every object that’s reachable from the global environment¹, and all objects that are, in turn, reachable from those objects (i.e. the references in lists and environments are searched recursively). The garbage collector does not use the modify-in-place reference count described above. While these two ideas are closely related, the internal data structures are optimised for different use cases.
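To make reachability concrete, here is a minimal sketch that uses base R’s reg.finalizer() to observe when an environment is actually collected. The names `holder`, `collected`, `collected_while_reachable` and `collected_after_unbinding` are made up for illustration, and the explicit gc() calls are there only to make the demonstration deterministic.

```r
collected <- FALSE

# An environment with a finalizer that records when the GC reclaims it.
e <- new.env()
reg.finalizer(e, function(env) collected <<- TRUE)

# Keep a second reference inside a list, then remove the name `e`.
holder <- list(e)
rm(e)
invisible(gc())
collected_while_reachable <- collected  # environment still reachable via `holder`

# Remove the last reference; the environment can no longer be reached.
rm(holder)
invisible(gc())
collected_after_unbinding <- collected
```

Run at the top level of a fresh session, collected_while_reachable should be FALSE and collected_after_unbinding should be TRUE: the environment survives as long as something reachable refers to it, whether or not it has a name of its own.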

The GC runs automatically whenever R needs more memory to create a new object. Looking from the outside, it’s basically impossible to predict when the GC will run. In fact, you shouldn’t even try. If you want to find out when the GC runs, call gcinfo(TRUE) and the GC will print a message to the console every time it runs.
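As a rough illustration, the following sketch switches the reporting on, allocates a number of short-lived vectors, and then restores the previous setting. The loop and the name `tmp` are arbitrary; how many messages appear, and when, depends on the state of your session.

```r
# Turn on GC reporting; gcinfo() returns the previous setting.
old <- gcinfo(TRUE)

# Allocate many short-lived objects. This will usually trigger the GC
# several times, and each run prints a line to the console.
for (i in 1:100) {
  tmp <- runif(1e5)
}

# Restore the previous setting so later code is not affected.
gcinfo(old)
```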

You can force garbage collection by calling gc(). But despite what you might have read elsewhere, there’s never any need to call gc() yourself. The only reasons you might want to call gc() are to ask R to return memory to your operating system so other programs can use it, or for the side effect that tells you how much memory is currently being used:

gc() 
#>           used (Mb) gc trigger  (Mb) max used  (Mb)
#> Ncells  757832 40.5    1424680  76.1  1424680  76.1
#> Vcells 4815947 36.8   16932186 129.2 16928218 129.2
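Because gc() returns this table as a matrix, you can also pull out individual entries. The sketch below uses that to watch the memory of an unbound object being reclaimed; the helper `vcells_used()` and the variable names are invented for this example, and the exact counts will differ between sessions.

```r
# Helper (hypothetical name): number of vector cells currently in use.
vcells_used <- function() gc()["Vcells", "used"]

baseline <- vcells_used()
x <- numeric(1e7)            # needs roughly 1e7 vector cells (about 80 MB)
with_x <- vcells_used()
rm(x)                        # unbind the only name pointing to the vector
without_x <- vcells_used()

with_x - baseline            # roughly 1e7
with_x - without_x           # also roughly 1e7: the memory was reclaimed
```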

lobstr::mem_used() is a wrapper around gc() that prints the total number of bytes used:

mem_used()
#> 80,969,112 B
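To see roughly where such a number comes from, you can convert the cell counts reported by gc() into bytes yourself. This is only a sketch and not necessarily how lobstr computes its result: it assumes the cell sizes given in ?Memory (56 bytes per Ncell on a 64-bit build, 28 on a 32-bit build, and 8 bytes per Vcell), and `node_size`, `cells` and `bytes_used` are names chosen for illustration.

```r
# Assumed cell sizes, taken from ?Memory.
node_size <- if (.Machine$sizeof.pointer == 8L) 56 else 28

# First "used" column of the gc() matrix: counts of Ncells and Vcells.
cells <- gc()[, "used"]

# Total bytes occupied by R objects under the assumptions above.
bytes_used <- sum(cells * c(node_size, 8))
format(bytes_used, big.mark = ",")
```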

This number won’t agree with the amount of memory reported by your operating system. There are three reasons:

  1. It counts objects created by R, but not the memory used by the R interpreter itself.

  2. Both R and the operating system are lazy: they won’t reclaim memory until it’s actually needed. R might be holding on to memory because the OS hasn’t yet asked for it back.

  3. R counts the memory occupied by objects but there may be empty gaps due to deleted objects. This problem is known as memory fragmentation.


  1. And every environment in the current call stack.