4.1 Introduction

R’s subsetting operators are fast and powerful. Mastering them allows you to succinctly perform complex operations in a way that few other languages can match. Subsetting in R is easy to learn but hard to master because you need to internalise a number of interrelated concepts:

There are six ways to subset atomic vectors.
There are three subsetting operators: [, [[, and $.
Subsetting operators interact differently with different vector types (e.g., atomic vectors, lists, factors, matrices, and data frames).
Subsetting can be combined with assignment.
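
As a rough preview of these concepts, the minimal sketch below applies the three operators to a small list, subsets a matrix, and then combines subsetting with assignment. The objects x, m, and v are hypothetical and exist only for illustration; later sections cover the behaviour in detail.

```r
# A small list used purely for illustration (hypothetical data).
x <- list(a = 1:3, b = "text", c = c(TRUE, FALSE))

# `[` returns a smaller list: the container is kept.
sub_single <- x["a"]

# `[[` and `$` extract the element itself: here, an integer vector.
sub_double <- x[["a"]]
sub_dollar <- x$a

# The same `[` operator behaves differently on a matrix:
# selecting one row yields a plain vector by default.
m <- matrix(1:6, nrow = 2)
first_row <- m[1, ]

# Subsetting combined with assignment modifies part of an object.
v <- c(10, 20, 30)
v[2] <- 99
```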

Subsetting is a natural complement to str(). While str() shows you all the pieces of any object (its structure), subsetting allows you to pull out the pieces that you’re interested in. For large, complex objects, I highly recommend using the interactive RStudio Viewer, which you can activate with View(my_object).
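
The pairing of str() and subsetting might look like the sketch below: first inspect the structure, then pull out the piece you want. The nested list my_object and its field names are assumptions made up for this example.

```r
# Hypothetical nested object, for illustration only.
my_object <- list(
  id = 42L,
  scores = c(math = 90, art = 75),
  meta = list(source = "survey", year = 2020)
)

# str() prints a compact summary of every component.
str(my_object)

# Subsetting then pulls out just the pieces of interest.
art_score <- my_object$scores[["art"]]
survey_year <- my_object$meta$year
```

In an interactive RStudio session, View(my_object) offers the same exploration in a point-and-click form.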

Quiz

Take this short quiz to determine if you need to read this chapter. If the answers quickly come to mind, you can comfortably skip this chapter. Check your answers in Section 4.6.

What is the result of subsetting a vector with positive integers, negative integers, a logical vector, or a character vector?
What’s the difference between [, [[, and $ when applied to a list?
When should you use drop = FALSE?
If x is a matrix, what does x[] <- 0 do? How is it different from x <- 0?
How can you use a named vector to relabel categorical variables?

Outline

Section 4.2 starts by teaching you about [. You’ll learn the six ways to subset atomic vectors. You’ll then learn how those six ways act when used to subset lists, matrices, and data frames.
Section 4.3 expands your knowledge of subsetting operators to include [[ and $ and focuses on the important principles of simplifying versus preserving.
In Section 4.4 you’ll learn the art of subassignment, which combines subsetting and assignment to modify parts of an object.
Section 4.5 leads you through eight important, but not obvious, applications of subsetting to solve problems that you often encounter in data analysis.