The most common continuous position scales are the default scale_x_continuous() and scale_y_continuous() functions. In the simplest case they map linearly from the data value to a location on the plot. There are several other position scales for continuous variables (scale_x_reverse(), etc.), most of which are convenience functions used to provide easy access to common transformations:
base <- ggplot(mpg, aes(displ, hwy)) + geom_point()

base
base + scale_x_reverse()
base + scale_y_reverse()
For more information on scale transformations see Section 14.2.
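As a rough illustration of what these convenience scales wrap, the sketch below applies two transformation objects from the scales package directly to a numeric vector. It assumes that scale_x_reverse() corresponds to reverse_trans() and that a log scale corresponds to log10_trans(); recent releases of scales also provide these under transform_*() names.

```r
library(scales)

# Transformation objects (assumed here to be the ones behind the
# reverse and log10 convenience scales).
rev_tr <- reverse_trans()
log_tr <- log10_trans()

x <- c(1, 10, 100)

rev_x <- rev_tr$transform(x)  # negates each value, which flips the axis direction
log_x <- log_tr$transform(x)  # base-10 logarithm of each value

# Each transformation also carries its inverse, which maps positions
# back to data units.
back <- log_tr$inverse(log_x)
```

The scale does its positioning on the transformed values, while the inverse lets the axis be labelled in the original units.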
9.1.2 Out of bounds values
By default, ggplot2 converts data outside the scale limits to NA. This means that changing the limits of a scale is not precisely the same as visually zooming in to a region of the plot. If your goal is to zoom in on part of the plot, it is better to use the xlim and ylim arguments of coord_cartesian():
base <- ggplot(mpg, aes(drv, hwy)) +
  geom_hline(yintercept = 28, colour = "red") +
  geom_boxplot()

base
base + coord_cartesian(ylim = c(10, 35)) # zoom only
base + ylim(10, 35) # alters the boxplot
#> Warning: Removed 6 rows containing non-finite values (stat_boxplot).
The only difference between the left and middle plots is that the latter is zoomed in. Some of the outlier points are not shown due to the restriction of the range, but the boxplots themselves remain identical. In contrast, in the plot on the right one of the boxplots has changed. When ylim() is used to set the scale limits, all observations with highway mileage greater than 35 are converted to NA before the stat (in this case the boxplot) is computed. This has the effect of shifting the sample median downward. You can learn more about coordinate systems in Section 15.1.
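The effect on the summary statistics can be checked outside of a plot. The following is a minimal sketch that mimics the censoring step by calling scales::censor() directly on the hwy column and then compares the per-group medians before and after; the variable names are illustrative only.

```r
library(ggplot2)  # provides the mpg data

# Replace values outside [10, 35] with NA, as the scale limits do
# before the boxplot statistics are computed.
hwy_censored <- scales::censor(mpg$hwy, range = c(10, 35))

# Number of observations that were converted to NA.
n_removed <- sum(is.na(hwy_censored))

# Median highway mileage per drive train, before and after censoring.
med_all  <- tapply(mpg$hwy, mpg$drv, median)
med_kept <- tapply(hwy_censored, mpg$drv, median, na.rm = TRUE)
```

Because only large values are dropped here, a group median can stay the same or move down, but it cannot move up.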
Although the default behaviour is to convert the out of bounds values to NA, you can override this by setting the oob argument of the scale, a function that is applied to all observations outside the scale limits. The default is scales::censor(), which replaces any value outside the limits with NA. Another option is scales::squish(), which squishes all values into the range. An example using a fill scale is shown below:
df <- data.frame(x = 1:6, y = 8:13)
base <- ggplot(df, aes(x, y)) +
  geom_col(aes(fill = x)) + # bar chart
  geom_vline(xintercept = 3.5, colour = "red") # for visual clarity only

base
base + scale_fill_gradient(limits = c(1, 3))
base + scale_fill_gradient(limits = c(1, 3), oob = scales::squish)
On the left the default fill colours are shown, ranging from dark blue to light blue. In the middle panel the scale limits for the fill aesthetic are reduced so that the values for the three rightmost bars are replaced with NA and are mapped to a grey shade. In some cases this is desired behaviour but often it is not: the right panel addresses this by modifying the oob function appropriately.
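To see what the two out of bounds functions do to the underlying values, they can be called directly on a vector. This small sketch uses the same x values and limits as the fill example:

```r
x <- 1:6

# Values outside the range become NA.
censored <- scales::censor(x, range = c(1, 3))

# Values outside the range are moved to the nearest limit.
squished <- scales::squish(x, range = c(1, 3))
```

With censoring the last three values are missing, which is why the corresponding bars are drawn in grey; with squishing they all take the value 3 and so share the colour at the top of the gradient.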
9.1.3 Visual range expansion
If you have eagle eyes, you’ll have noticed that the visual range of the axes actually extends a little bit past the numeric limits that I have specified in the various examples. This ensures that the data does not overlap the axes, which is usually (but not always) desirable.
You can eliminate this space with expand = c(0, 0). One scenario where it is usually preferable to remove this space is when using geom_raster():
ggplot(faithfuld, aes(waiting, eruptions)) +
  geom_raster(aes(fill = density)) +
  theme(legend.position = "none")

ggplot(faithfuld, aes(waiting, eruptions)) +
  geom_raster(aes(fill = density)) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  theme(legend.position = "none")
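The size of the extra space can be approximated by hand. The sketch below assumes the documented default for continuous position scales, a multiplicative expansion of 5% of the range on each side, and uses scales::expand_range() to apply it to the limits from the earlier boxplot example.

```r
limits <- c(10, 35)

# Assumed default for continuous scales: pad by 5% of the range on each side.
padded <- scales::expand_range(limits, mul = 0.05)

# No multiplicative or additive padding, analogous to expand = c(0, 0).
unpadded <- scales::expand_range(limits, mul = 0, add = 0)
```

Here the range is 25 units wide, so the padded range extends 1.25 units beyond each limit, while the unpadded range is identical to the limits.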
The following code creates two plots of the mpg dataset. Modify the code so that the legend and axes match, without using faceting!
fwd <- subset(mpg, drv == "f")
rwd <- subset(mpg, drv == "r")

ggplot(fwd, aes(displ, hwy, colour = class)) + geom_point()
ggplot(rwd, aes(displ, hwy, colour = class)) + geom_point()
What happens if you add two xlim() calls to the same plot? Why?
What does scale_x_continuous(limits = c(NA, NA)) do?
What does expand_limits() do and how does it work? Read the source code.
In the examples above, I specified breaks manually, but ggplot2 also allows you to pass a function to breaks. This function should have one argument that specifies the limits of the scale (a numeric vector of length two), and it should return a numeric vector of breaks. You can write your own break function, but in many cases there is no need, thanks to the scales package. It provides several tools that are useful for this purpose:
scales::breaks_extended() creates automatic breaks for numeric axes.
scales::breaks_log() creates breaks appropriate for log axes.
scales::breaks_pretty() creates “pretty” breaks for date/times.
scales::breaks_width() creates equally spaced breaks.
The breaks_extended() function is the standard method used in ggplot2, and accordingly the first two plots below are the same. I can alter the desired number of breaks by setting n = 2, as illustrated in the third plot. Note that n is treated as a suggestion rather than a strict constraint. If you need to specify exact breaks it is better to do so manually.
toy <- data.frame(
  const = 1,
  up = 1:4,
  txt = letters[1:4],
  big = (1:4) * 1000,
  log = c(2, 5, 10, 2000)
)
toy
#>   const up txt  big  log
#> 1     1  1   a 1000    2
#> 2     1  2   b 2000    5
#> 3     1  3   c 3000   10
#> 4     1  4   d 4000 2000

axs <- ggplot(toy, aes(big, const)) +
  geom_point() +
  labs(x = NULL, y = NULL)

axs
axs + scale_x_continuous(breaks = scales::breaks_extended())
axs + scale_x_continuous(breaks = scales::breaks_extended(n = 2))
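Returning to the idea of writing your own break function: the sketch below defines a hypothetical helper, breaks_even(), that is not part of ggplot2 or scales. It takes the scale limits as its single argument and returns four evenly spaced breaks running from the lower to the upper limit.

```r
library(ggplot2)

# Hypothetical break function: one argument (the limits, a numeric vector
# of length two), returns a numeric vector of break positions.
breaks_even <- function(limits) {
  seq(limits[1], limits[2], length.out = 4)
}

# Calling it directly shows the breaks it would produce for given limits.
b <- breaks_even(c(1000, 4000))

# Passing the function (not its result) to the scale; a cut-down copy of
# the toy data keeps this example self-contained.
toy_big <- data.frame(big = (1:4) * 1000, const = 1)
p <- ggplot(toy_big, aes(big, const)) +
  geom_point() +
  scale_x_continuous(breaks = breaks_even)
```

A function like this adapts to whatever limits the scale ends up with, which a fixed vector of breaks cannot do.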
Another approach that is sometimes useful is specifying a fixed width that defines the spacing between breaks. The breaks_width() function is used for this. The first example below shows how to fix the width at a specific value; the second example illustrates the use of the offset argument that shifts all the breaks by a specified amount:
axs + scale_x_continuous(breaks = scales::breaks_width(800))
axs + scale_x_continuous(breaks = scales::breaks_width(800, offset = 200))
axs + scale_x_continuous(breaks = scales::breaks_width(800, offset = -200))
Notice the difference between setting an offset of 200 and -200.
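Because breaks_width() returns a function, you can also call that function on a pair of limits to inspect the breaks it generates. This is a minimal sketch with limits chosen to match the range of the big variable; the exact break positions are left for you to print, since they depend on how the limits are rounded to multiples of the width.

```r
limits <- c(1000, 4000)

# Each call to breaks_width() returns a function of the scale limits.
b_plain  <- scales::breaks_width(800)(limits)
b_offset <- scales::breaks_width(800, offset = 200)(limits)

# In both cases the spacing between consecutive breaks is the requested width.
spacing_plain  <- diff(b_plain)
spacing_offset <- diff(b_offset)
```

Printing b_plain and b_offset side by side is a quick way to see how the offset moves the whole sequence without changing the spacing.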
You can suppress the breaks entirely by setting them to NULL:
axs + scale_x_continuous(breaks = NULL)
9.1.6 Minor breaks
You can adjust the minor breaks (the unlabelled faint grid lines that appear between the major grid lines) by supplying a numeric vector of positions to the minor_breaks argument.
Minor breaks are particularly useful for log scales because they give a clear visual indicator that the scale is non-linear. To show them off, I’ll first create a vector of minor break values (on the transformed scale), using %o% to quickly generate a multiplication table and as.numeric() to flatten the table to a vector.
mb <- unique(as.numeric(1:10 %o% 10 ^ (0:3)))
mb
#>  [1]     1     2     3     4     5     6     7     8     9    10    20    30
#> [13]    40    50    60    70    80    90   100   200   300   400   500   600
#> [25]   700   800   900  1000  2000  3000  4000  5000  6000  7000  8000  9000
#> [37] 10000
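If the %o% operator is unfamiliar, a smaller version of the same construction may help. This sketch uses base R only and shows each intermediate step, along with the reason unique() is needed in the full version.

```r
# %o% is the outer product: every element on the left multiplied by
# every element on the right, arranged as a matrix.
small <- 1:3 %o% 10^(0:1)

# as.numeric() drops the dimensions and reads the matrix column by column.
flat <- as.numeric(small)

# In the full table some products occur more than once (for example
# 10 * 1 and 1 * 10), so unique() is used to drop the repeats.
full <- as.numeric(1:10 %o% 10^(0:3))
n_all    <- length(full)
n_unique <- length(unique(full))
```

The small table has three rows and two columns, and flattening it gives 1, 2, 3, 10, 20, 30.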
The following plots illustrate the effect of setting the minor breaks:
log_base <- ggplot(toy, aes(log, const)) + geom_point()

log_base + scale_x_log10()
log_base + scale_x_log10(minor_breaks = mb)
As with breaks, you can also supply a function to minor_breaks, such as scales::minor_breaks_width(), which can be helpful in controlling the minor breaks.
Every break is associated with a label and these can be changed by setting the labels argument to the scale function:
axs + scale_x_continuous(breaks = c(2000, 4000), labels = c("2k", "4k"))
In the examples above I specified the vector of labels manually, but ggplot2 also allows you to pass a labelling function. A function passed to labels should accept a numeric vector of breaks as input and return a character vector of labels (the same length as the input). The scales package provides a number of tools that will automatically construct label functions for you. Some of the more useful examples for numeric data include:
scales::label_bytes() formats numbers as kilobytes, megabytes etc.
scales::label_comma() formats numbers as decimals with commas added.
scales::label_dollar() formats numbers as currency.
scales::label_ordinal() formats numbers in rank order: 1st, 2nd, 3rd etc.
scales::label_percent() formats numbers as percentages.
scales::label_pvalue() formats numbers as p-values: <.05, <.01, .34, etc.
A few examples are shown below to illustrate how these functions are used:
axs + scale_y_continuous(labels = scales::label_percent())
axs + scale_y_continuous(labels = scales::label_dollar(prefix = "", suffix = "€"))
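Since a labelling function is just a function from breaks to strings, you can try one out on a plain numeric vector before using it in a plot. The sketch below calls two of the scales helpers directly and also defines a hypothetical helper, label_k(), which is not part of scales and simply reproduces the "2k" style labels used earlier.

```r
# The label_*() functions return a function; call the result on the breaks.
comma_labels   <- scales::label_comma()(c(1000, 2000000))
ordinal_labels <- scales::label_ordinal()(1:3)

# Hypothetical helper: numeric breaks in, character labels of the same
# length out.
label_k <- function(x) {
  paste0(x / 1000, "k")
}
k_labels <- label_k(c(2000, 4000))
```

A function such as label_k can be passed to the scale in the same way as the built-in helpers, for example labels = label_k.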
You can suppress labels with labels = NULL. This will remove the labels from the axis or legend while leaving its other properties unchanged:
axs + scale_x_continuous(labels = NULL)
Recreate the following graphic:
Adjust the y axis label so that the parentheses are the right size.
List the three different types of object you can supply to the breaks argument. How do
What label function allows you to create mathematical expressions? What label function converts 1 to 1st, 2 to 2nd, and so on?