7.6 As data structures

As well as powering scoping, environments are also useful data structures in their own right because they have reference semantics. There are three common problems that they can help solve:

  • Avoiding copies of large data. Since environments have reference semantics, you’ll never accidentally create a copy. But bare environments are painful to work with, so instead I recommend using R6 objects, which are built on top of environments. Learn more in Chapter 14.

  • Managing state within a package. Explicit environments are useful in packages because they allow you to maintain state across function calls. Normally, objects in a package are locked, so you can’t modify them directly. Instead, you can do something like this:

    my_env <- new.env(parent = emptyenv())
    my_env$a <- 1
    
    get_a <- function() {
      my_env$a
    }
    set_a <- function(value) {
      old <- my_env$a
      my_env$a <- value
      invisible(old)
    }

    Returning the old value from setter functions is a good pattern because it makes it easier to reset the previous value in conjunction with on.exit() (Section 6.7.4).

  • As a hashmap. A hashmap is a data structure that takes constant, O(1), time to find an object based on its name. Environments provide this behaviour by default, so can be used to simulate a hashmap. See the hash package (Brown 2013) for a complete development of this idea.

References

Brown, Christopher. 2013. Hash: Full Feature Implementation of Hash/Associated Arrays/Dictionaries. https://CRAN.R-project.org/package=hash.