purrr
Functions for Lesson 6
map, pluck, keep, discard, compact
Packages for Lesson 6
dplyr, purrr
Use the purrr package to apply functions to lists and vectors.
Cheat sheet for the purrr package.
Recreate the plot below using the smaller NYC Airbnb dataset. The curve is a loess smoother. To change the legend title, add the (unintuitive) colour = "your legend title" argument to the labs() function.
# smaller csv file (16 cols)
url <- "http://data.insideairbnb.com/united-states/ny/new-york-city/2021-04-07/data/listings.csv.gz"
nyc <- readr::read_csv(url)
nyc <- nyc[nyc$id < 1e+06, ] # get a smaller subset of the data
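The legend-title trick can be sketched on a dataset that ships with R. This is only an illustration under stated assumptions: mtcars stands in for the Airbnb data, and the variables and the legend title "Cylinders" are made up for the example, not taken from the target plot.

```r
library(ggplot2)

# mtcars is a stand-in dataset (assumption); swap in nyc and its columns
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(colour = factor(cyl))) +                   # colour is mapped here
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) + # loess curve
  labs(colour = "Cylinders")                                 # legend title is set via the aesthetic name
```

Typing p at the console draws the plot. The legend belongs to the colour aesthetic, which is why its title is set with colour = inside labs().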
First generate some random data
s1 <- sample(10) # random number sample
s2 <- rnorm(10, 500) # sample 10 normally distributed random numbers around a mean of 500
s3 <- runif(10) # random uniform distribution
s1
[1] 2 4 1 8 10 5 3 6 7 9
s2
[1] 501.6309 498.4540 499.8142 501.0258 499.9049 499.9875 499.9702 502.5944 501.3363 498.4851
s3
[1] 0.8834566 0.1607566 0.8669604 0.1298742 0.4734379 0.9133809 0.2103161 0.3419445 0.2215991
[10] 0.3468892
Now combine these into a list using list()
ls1 <- list(s1, s2, s3) # create a list of these data
ls1
[[1]]
[1] 2 4 1 8 10 5 3 6 7 9
[[2]]
[1] 501.6309 498.4540 499.8142 501.0258 499.9049 499.9875 499.9702 502.5944 501.3363 498.4851
[[3]]
[1] 0.8834566 0.1607566 0.8669604 0.1298742 0.4734379 0.9133809 0.2103161 0.3419445 0.2215991
[10] 0.3468892
ls1 %>% str
List of 3
$ : int [1:10] 2 4 1 8 10 5 3 6 7 9
$ : num [1:10] 502 498 500 501 500 ...
$ : num [1:10] 0.883 0.161 0.867 0.13 0.473 ...
Print the ls1 list object and take note of the index and elements
ls1
# index
ls1[1]
ls1[2]
ls1[3]
# elements
ls1[[1]][[1]]
ls1[[1]][[3]]
ls1[[2]][[10]]
ls1[[3]][[11]] # ??
ls1[[3]] %>% length
# what's the difference?
ls1[1]
ls1[[1]]
The purrr package provides the following apply-style functions, which apply a function iteratively to each element of a list or vector.
map
Apply a function to each element of a list
library(purrr)
set.seed(12) # set a seed so that random data generated after this point are reproducible
map(ls1, mean) # get the mean
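As a small sketch of what map() returns, the example below uses a hypothetical list called demo with fixed values (not the random ls1) so the results are predictable. It also shows the typed variants map_dbl() and map_int(), which return atomic vectors instead of lists.

```r
library(purrr)

# Hypothetical stand-in for ls1, with fixed values
demo <- list(a = 1:10, b = c(2, 4, 6), c = c(0.5, 1.5))

means_list <- map(demo, mean)       # a list with one mean per element
means_dbl  <- map_dbl(demo, mean)   # named double vector: a = 5.5, b = 4, c = 1
sizes_int  <- map_int(demo, length) # named integer vector: a = 10, b = 3, c = 2
```

The typed variants are convenient when every result is a single value of a known type, because the output can be used directly in arithmetic or plotting.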
Apply summary stats to the ls1 list data
* sum
* summary
* max
* sqrt
* length and lengths
What happens when you run the following and why?
mean(ls1)
sum(ls1)
pluck
Select an element by name or index
keep
Select elements that pass a logical test
discard
Select elements that do not pass a logical test
compact
Drop empty elements
pluck(ls1, 3) # advantage: returns the numeric vector itself, not a one-element list
ls1[3][[1]] # same result as above
func <- function(x) mean(x) > 10 # a predicate function: returns TRUE or FALSE for one element
keep(ls1, func)
discard(ls1, func)
ls2 <- list(1, NA, NULL, integer(0), list()) # list of empty and null things
compact(ls2)
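The four selection functions above can be tried together on a small named list. This is a minimal sketch: demo and is_big are hypothetical names introduced for illustration.

```r
library(purrr)

# Hypothetical list with two filled and two empty elements
demo <- list(small = 1:3, big = c(100, 200), none = NULL, empty = integer(0))

big_values   <- pluck(demo, "big")  # the vector itself: 100 200
second_small <- pluck(demo, 1, 2)   # second value of the first element: 2

filled <- compact(demo)             # drops NULL and zero-length elements

is_big <- function(x) mean(x) > 10  # predicate: TRUE or FALSE for one element
kept    <- keep(filled, is_big)     # only "big" passes the test
dropped <- discard(filled, is_big)  # only "small" fails the test
```

Running compact() first avoids calling mean() on empty elements, where the comparison would return NA instead of TRUE or FALSE.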
Store plots in lists for easy retrieval. Create two plots of the ls1 data (called ls1p and ls2p) and store them in a list called plot_list.
First turn the list into a data frame so ggplot understands it.
ls1_df <- ls1 %>% data.frame
names(ls1_df) <- c("A", "B", "C")
plot_list <- list(ls1p, ls2p)
Draw one of your plots from the plot_list object
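The pattern of storing and retrieving plots can be sketched as follows. The data frame df and the plot names are hypothetical and deliberately differ from the exercise objects.

```r
library(ggplot2)
library(purrr)

# Hypothetical data frame standing in for ls1_df
df <- data.frame(A = 1:10, B = (1:10)^2, C = 10:1)

p_ab <- ggplot(df, aes(A, B)) + geom_point()
p_ac <- ggplot(df, aes(A, C)) + geom_line()

plots <- list(ab = p_ab, ac = p_ac)  # named list of plot objects

first_plot <- plots[["ab"]]          # retrieve by name with [[ ]]
second_plot <- pluck(plots, "ac")    # or with pluck()
```

Printing first_plot or second_plot draws it. Naming the list elements makes retrieval by name possible, which is easier to read than a numeric index.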
every
Do all elements pass a test?
some
Do some elements pass a test?
has_element
Does a list contain an element?
detect
Find first element to pass
detect_index
Find index of first element to pass
vec_depth
Return depth (number of levels of indexes)
ls1 %>% every(is.character)
ls1 %>% some(is.character)
ls1 %>% has_element("foo")
ls1 %>% detect(is.character)
ls1 %>% detect_index(is.character)
ls1 %>% vec_depth # deprecated since purrr 1.0.0; pluck_depth() is the replacement
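Because ls1 contains only numbers, most of the tests above come back empty. A hypothetical list with mixed types (called mixed here) makes the behaviour of each function easier to see.

```r
library(purrr)

# Hypothetical list: integers, a string, and doubles
mixed <- list(1:3, "foo", c(2.5, 3.5))

all_numeric <- every(mixed, is.numeric)         # FALSE: "foo" is not numeric
any_char    <- some(mixed, is.character)        # TRUE
has_foo     <- has_element(mixed, "foo")        # TRUE: one element is identical to "foo"
first_char  <- detect(mixed, is.character)      # "foo"
first_dbl   <- detect_index(mixed, is.double)   # 3
no_match    <- detect_index(mixed, is.logical)  # 0 when nothing passes the test
```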
modify
Apply function to each element
modify_at
Apply function to elements by name or index
modify_if
Apply function to elements that pass a test
modify_depth
Apply function to each element at a given level of a list
ls1_repeat <- list(list(list(ls1))) # create list of lists
ls1_repeat %>% map(mean) # returns NA with a warning: map() only sees the outer list, the data are nested deeper
ls1_repeat %>% modify_depth(4, mean) # apply mean at the fourth level, where the vectors are
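The modify family returns an object of the same type as its input, with selected elements changed. The sketch below uses hypothetical lists (demo and nested) with fixed values.

```r
library(purrr)

demo <- list(a = 1:3, b = c(1.5, 2.5), c = "text")

# modify_if: change only the elements that pass a test
doubled <- modify_if(demo, is.numeric, function(x) x * 2)

# modify_at: change elements selected by name (or position)
upper <- modify_at(demo, "c", toupper)

# modify_depth: change elements at a given nesting level
nested <- list(list(1:4, 5:8))
summed <- modify_depth(nested, 2, sum)  # sums are 10 and 26
```

In doubled, the character element c is left untouched because it fails the is.numeric test.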
purrr functions
pmap
Apply a function to groups of elements from lists of lists
lmap
Apply function to each list-element of a list or vector
imap
Apply function to each element of a list or vector and its index
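A minimal sketch of the three functions above, using hypothetical inputs (args and scores). The typed variants pmap_dbl() and imap_chr() are used so the results are plain vectors.

```r
library(purrr)

# pmap: one call per position across several parallel arguments
args <- list(x = c(1, 2, 3), y = c(10, 20, 30))
sums <- pmap_dbl(args, function(x, y) x + y)  # 11 22 33

# imap: the function receives each element and its name (or position)
scores <- list(a = 1:3, b = 4:6)
labels <- imap_chr(scores, function(x, name) paste0(name, ": ", sum(x)))
# "a: 6" "b: 15"

# lmap: the function receives a length-one list and must return a list,
# which may have a different length
expanded <- lmap(scores, function(el) list(el[[1]], el[[1]] * 10))
# four elements: each original vector followed by its values times 10
```

The difference between map() and lmap() is what the function sees: map() passes the element itself, while lmap() passes a one-element list, so names and attributes of the list entry stay available.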