1.2 What is the grammar of graphics?
Wilkinson created the grammar of graphics to describe the fundamental features that underlie all statistical graphics. The grammar of graphics answers the question: what is a statistical graphic? ggplot2 [2] builds on Wilkinson’s grammar by focussing on the primacy of layers and adapting it for use in R. In brief, the grammar tells us that a graphic maps the data to the aesthetic attributes (colour, shape, size) of geometric objects (points, lines, bars). The plot may also include statistical transformations of the data and information about the plot’s coordinate system. Faceting can be used to draw the same plot for different subsets of the data. The combination of these independent components is what makes up a graphic.
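As a minimal sketch of that central idea, the snippet below maps three variables to the position and colour of points. It assumes the ggplot2 package is installed and uses its bundled mpg dataset; the choice of variables (displ, hwy, class) is ours, for illustration only.

```r
library(ggplot2)

# Data: mpg, a data frame that ships with ggplot2.
# Mapping: displ -> x position, hwy -> y position, class -> colour.
# Geometric object: points.
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point()

# Printing the object draws the plot on the active graphics device.
print(p)
```

Swapping geom_point() for another geom changes what is drawn without changing the mapping between data and aesthetics.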
As the book progresses, the formal grammar will be explained in greater detail. The first description of the components follows below. It introduces some of the terminology that will be used throughout the book and outlines the basic function of each component. Don’t worry if it doesn’t make sense right away: you’ll have many more opportunities to learn about the components and how they work together.
All plots are composed of the data, the information you want to visualise, and a mapping, the description of how the data’s variables are mapped to aesthetic attributes. There are five mapping components:
A layer is a collection of geometric elements and statistical transformations. Geometric elements, geoms for short, represent what you actually see in the plot: points, lines, polygons, etc. Statistical transformations, stats for short, summarise the data: for example, binning and counting observations to create a histogram, or fitting a linear model.
Scales map values in the data space to values in the aesthetic space. This includes the use of colour, shape or size. Scales also draw the legend and axes, which make it possible to read the original data values from the plot (an inverse mapping).
A coord, or coordinate system, describes how data coordinates are mapped to the plane of the graphic. It also provides axes and gridlines to help read the graph. We normally use the Cartesian coordinate system, but a number of others are available, including polar coordinates and map projections.
A facet specifies how to break up and display subsets of data as small multiples. This is also known as conditioning or latticing/trellising.
A theme controls the finer points of display, like the font size and background colour. While the defaults in ggplot2 have been chosen with care, you may need to consult other references to create an attractive plot. A good starting place is Tufte’s early works [3].
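The components described above can be sketched in a single plot specification, one line per component. This is an illustrative example rather than a recommended recipe: it assumes ggplot2 is installed, uses the bundled mpg dataset, and the particular variables, palette and theme are arbitrary choices of ours.

```r
library(ggplot2)

# Data and mapping: engine displacement on x, highway mileage on y.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  # Layer 1: point geom with the identity stat (data drawn as-is).
  geom_point(aes(colour = class)) +
  # Layer 2: line geom whose stat fits a linear model within each panel.
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Scale: how class values become colours, plus the legend title.
  scale_colour_brewer(palette = "Dark2", name = "Vehicle class") +
  # Coord: Cartesian is the default; written out only to make it visible.
  coord_cartesian() +
  # Facet: one panel (small multiple) per drive train type.
  facet_wrap(~drv) +
  # Theme: non-data display details such as the base font size.
  theme_minimal(base_size = 11)

print(p)
```

Because the components are independent, any one of these lines can be removed or replaced (for example, coord_polar() in place of coord_cartesian()) without rewriting the others.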
It’s also important to note what the grammar doesn’t do:
It doesn’t suggest which graphics to use. While this book endeavours to promote a sensible process for producing plots, the focus is on how to produce the plots you want, not on which plot to produce. For more advice on choosing or creating plots to answer the question you’re interested in, you may want to consult Naomi Robbins [4], William Cleveland [5], John Chambers et al. [6] and John W. Tukey [7].
It doesn’t describe interactive graphics, only static ones. There is essentially no difference between displaying ggplot2 graphs on a computer screen and printing them on a piece of paper. For dynamic and interactive graphics, you’ll have to look elsewhere (perhaps at ggvis, described below). Dianne Cook and Deborah F. Swayne [8] provide an excellent introduction to the interactive graphics package GGobi. GGobi can be connected to R with the rggobi package [9].
[2] Hadley Wickham, “A Layered Grammar of Graphics,” Journal of Computational and Graphical Statistics, 2009.
[3] Edward R. Tufte, Envisioning Information (Graphics Press, 1990); Edward R. Tufte, Visual Explanations (Graphics Press, 1997); Edward R. Tufte, The Visual Display of Quantitative Information, 2nd ed. (Graphics Press, 2001).
[4] Creating More Effective Graphs (Chart House, 2013).
[5] Visualizing Data (Hobart Press, 1993).
[6] Graphical Methods for Data Analysis (Wadsworth, 1983).
[7] Exploratory Data Analysis (Addison–Wesley, 1977).
[8] Interactive and Dynamic Graphics for Data Analysis: With Examples Using R and GGobi (Springer, 2007).
[9] Hadley Wickham et al., “An Introduction to Rggobi,” R-News 8, no. 2 (2008): 3–7, http://CRAN.R-project.org/doc/Rnews/Rnews_2008-2.pdf.