9.6 Predicate functionals
A predicate is a function that returns a single TRUE or FALSE, like is.character(), is.null(), or all(), and we say a predicate matches a vector if it returns TRUE.
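As a quick illustration of the definition, the following base R sketch contrasts calls that return a single logical value with one that does not. The variable names are only for illustration.

```r
# Predicates: each call returns exactly one TRUE or FALSE,
# no matter how long the input is.
chr_check  <- is.character(c("a", "b"))  # TRUE
null_check <- is.null(NULL)              # TRUE
all_check  <- all(c(1, 2, 3) > 0)        # TRUE

# Not a predicate: the comparison returns one value per element.
elementwise <- c(1, 2, 3) > 2            # FALSE FALSE TRUE
length(elementwise)                      # 3
```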
9.6.1 Basics
A predicate functional applies a predicate to each element of a vector. purrr provides six useful functions which come in three pairs:
some(.x, .p) returns TRUE if any element matches; every(.x, .p) returns TRUE if all elements match. These are similar to any(map_lgl(.x, .p)) and all(map_lgl(.x, .p)), but they terminate early: some() returns TRUE when it sees the first TRUE, and every() returns FALSE when it sees the first FALSE.
detect(.x, .p) returns the value of the first match; detect_index(.x, .p) returns the location of the first match.
keep(.x, .p) keeps all matching elements; discard(.x, .p) drops all matching elements.
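A minimal sketch of the three pairs on a small list, assuming purrr is installed. The list x and the counting wrapper counted_is_chr() are hypothetical names introduced here; the counter is just one way to observe the early termination described above.

```r
library(purrr)

x <- list(1, "a", 3, "b")

any_chr   <- some(x, is.character)          # TRUE: at least one string
all_chr   <- every(x, is.character)         # FALSE: 1 and 3 are numbers
first_chr <- detect(x, is.character)        # "a"
first_pos <- detect_index(x, is.character)  # 2
strings   <- keep(x, is.character)          # list("a", "b")
others    <- discard(x, is.character)       # list(1, 3)

# Count how often the predicate runs (hypothetical helper).
calls <- 0
counted_is_chr <- function(el) {
  calls <<- calls + 1
  is.character(el)
}

found <- some(x, counted_is_chr)
calls  # 2: some() stops at "a" and never inspects 3 or "b"
```

By contrast, any(map_lgl(x, counted_is_chr)) would call the predicate once for every element before any() sees the results.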
The following example shows how you might use these functionals with a data frame:
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
detect(df, is.factor)
#> NULL
detect_index(df, is.factor)
#> [1] 0
str(keep(df, is.factor))
#> 'data.frame': 3 obs. of 0 variables
str(discard(df, is.factor))
#> 'data.frame': 3 obs. of 2 variables:
#> $ x: int 1 2 3
#> $ y: chr "a" "b" "c"
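For contrast, here is a sketch of the same calls when the predicate does find a match. It assumes a variation of the data frame, here called df2, in which y is explicitly created as a factor.

```r
library(purrr)

# Assumed variation: y is explicitly a factor this time.
df2 <- data.frame(x = 1:3, y = factor(c("a", "b", "c")))

first_factor <- detect(df2, is.factor)        # the y column itself
factor_pos   <- detect_index(df2, is.factor)  # 2
kept         <- names(keep(df2, is.factor))   # "y"
dropped      <- names(discard(df2, is.factor))  # "x"
```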
9.6.2 Map variants
map() and modify() come in variants that also take predicate functions, transforming only the elements of .x where .p is TRUE.
df <- data.frame(
  num1 = c(0, 10, 20),
  num2 = c(5, 6, 7),
  chr1 = c("a", "b", "c"),
  stringsAsFactors = FALSE
)
str(map_if(df, is.numeric, mean))
#> List of 3
#> $ num1: num 10
#> $ num2: num 6
#> $ chr1: chr [1:3] "a" "b" "c"
str(modify_if(df, is.numeric, mean))
#> 'data.frame': 3 obs. of 3 variables:
#> $ num1: num 10 10 10
#> $ num2: num 6 6 6
#> $ chr1: chr "a" "b" "c"
str(map(keep(df, is.numeric), mean))
#> List of 2
#> $ num1: num 10
#> $ num2: num 6
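To make the idea behind the predicate variants concrete, here is a rough base R sketch of what modify_if() does. The helper modify_if_sketch() is hypothetical and is not purrr's implementation; it only shows the select-then-overwrite pattern.

```r
# Hypothetical helper illustrating the idea behind modify_if();
# this is not purrr's implementation.
modify_if_sketch <- function(.x, .p, .f) {
  # Apply the predicate to every element to find the matches.
  sel <- vapply(.x, .p, logical(1))
  # Overwrite only the matching elements, keeping the container type.
  .x[sel] <- lapply(.x[sel], .f)
  .x
}

df <- data.frame(
  num1 = c(0, 10, 20),
  num2 = c(5, 6, 7),
  chr1 = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

out <- modify_if_sketch(df, is.numeric, mean)
str(out)
```

Because the result is assigned back into .x, the output stays a data frame and each column mean is recycled to the number of rows, which should mirror the modify_if() result above.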
9.6.3 Exercises
Why isn’t is.na() a predicate function? What base R function is closest to being a predicate version of is.na()?
simple_reduce() has a problem when x is length 0 or length 1. Describe the source of the problem and how you might go about fixing it.
simple_reduce <- function(x, f) {
  out <- x[[1]]
  for (i in seq(2, length(x))) {
    out <- f(out, x[[i]])
  }
  out
}
Implement the span() function from Haskell: given a list x and a predicate function f, span(x, f) returns the location of the longest sequential run of elements where the predicate is true. (Hint: you might find rle() helpful.)
Implement arg_max(). It should take a function and a vector of inputs, and return the elements of the input where the function returns the highest value. For example, arg_max(-10:5, function(x) x ^ 2) should return -10. arg_max(-5:5, function(x) x ^ 2) should return c(-5, 5). Also implement the matching arg_min() function.
The function below scales a vector so it falls in the range [0, 1]. How would you apply it to every column of a data frame? How would you apply it to every numeric column in a data frame?
scale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}