Practice Examples 3 — SOLUTIONS
Task 0
Load dplyr and the flights data into R.
# install.packages("nycflights13")  # run once if the package is not installed yet
library(dplyr)
library(nycflights13)
data(flights)
Look at the description of the flights data in the flights section of the package documentation (in R, ?flights opens the same description).
Task 1
Get the tail numbers (tailnum) of the three planes that arrived with the largest
overall delay (departure delay plus arrival delay) relative to the distance flown.
Try to accomplish this with a single chain.
flights %>%
  mutate(
    delay = dep_delay + arr_delay,
    relative_delay = delay / distance
  ) %>%
  arrange(desc(relative_delay)) %>%
  select(tailnum, relative_delay) %>%
  head(3)
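To check the logic of this chain without the full data set, it can be run on a few made-up rows. This is only a sketch: the data frame toy and its values are invented for illustration and are not taken from nycflights13.

```r
library(dplyr)

# Made-up flights, for illustration only (not real nycflights13 rows)
toy <- data.frame(
  tailnum   = c("N1", "N2", "N3", "N4"),
  dep_delay = c(10, 60, NA, 5),
  arr_delay = c(20, 40, 30, 5),
  distance  = c(100, 1000, 500, 10),
  stringsAsFactors = FALSE
)

toy_top <- toy %>%
  mutate(relative_delay = (dep_delay + arr_delay) / distance) %>%
  arrange(desc(relative_delay)) %>%  # rows with a missing delay are sorted last
  select(tailnum, relative_delay) %>%
  head(3)

toy_top
```

Because arrange() places missing values at the end, the flight with an unknown departure delay (N3) cannot end up among the top three here.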
Task 2
From which airport do the most planes fly to Boston?
flights %>%
  filter(dest == "BOS") %>%
  group_by(origin) %>%
  summarise(n = n()) %>%
  arrange(desc(n))
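As a possible shortcut, dplyr's count() with sort = TRUE combines the group_by(), summarise(n = n()) and arrange(desc(n)) steps. The sketch below uses a made-up data frame toy (invented values, not real flights) to show the idea.

```r
library(dplyr)

# Made-up origin/destination pairs, for illustration only
toy <- data.frame(
  origin = c("JFK", "LGA", "JFK", "EWR", "JFK", "LGA"),
  dest   = c("BOS", "BOS", "BOS", "ORD", "BOS", "BOS"),
  stringsAsFactors = FALSE
)

by_origin <- toy %>%
  filter(dest == "BOS") %>%
  count(origin, sort = TRUE)  # one row per origin, largest count first

by_origin
```

The first row of the result is the origin with the most flights to Boston in the toy data.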
Task 3
Which of the airports is associated with the largest departure delays on average? (Note: be careful with missing values; it is better to exclude them when calculating an average.) Is this ordering the same for each month?
flights %>%
  group_by(origin) %>%
  summarise(delay = mean(dep_delay, na.rm = TRUE)) %>%
  arrange(desc(delay))
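The reason for na.rm = TRUE can be seen on a small made-up example: a single missing value turns the whole group mean into NA unless it is excluded. The data frame toy and its values are invented for illustration.

```r
library(dplyr)

# Made-up departure delays with one missing value, for illustration only
toy <- data.frame(
  origin    = c("EWR", "EWR", "JFK", "JFK"),
  dep_delay = c(10, NA, 4, 6),
  stringsAsFactors = FALSE
)

avg <- toy %>%
  group_by(origin) %>%
  summarise(
    with_na    = mean(dep_delay),                # NA as soon as one value is missing
    without_na = mean(dep_delay, na.rm = TRUE)   # missing values are dropped first
  )

avg
```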
flights %>%
  group_by(origin, month) %>%
  summarise(delay = mean(dep_delay, na.rm = TRUE)) %>%
  group_by(month) %>%
  mutate(max_delay = max(delay)) %>%
  filter(delay == max_delay) %>%
  select(month, origin) %>%
  arrange(month)
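An alternative way to pick the worst airport per month, assuming dplyr 1.0.0 or later is installed, is slice_max(), which replaces the mutate() and filter() pair. The sketch below runs on a made-up data frame toy (invented values) rather than the real flights data.

```r
library(dplyr)

# Made-up delays for two airports over two months, for illustration only
toy <- data.frame(
  origin    = c("EWR", "EWR", "JFK", "JFK", "EWR", "EWR", "JFK", "JFK"),
  month     = c(1, 1, 1, 1, 2, 2, 2, 2),
  dep_delay = c(10, 20, 5, NA, 2, 4, 8, 10),
  stringsAsFactors = FALSE
)

worst <- toy %>%
  group_by(origin, month) %>%
  summarise(delay = mean(dep_delay, na.rm = TRUE), .groups = "drop") %>%
  group_by(month) %>%
  slice_max(delay, n = 1) %>%  # keep the row(s) with the largest mean delay per month
  ungroup() %>%
  select(month, origin) %>%
  arrange(month)

worst
```

In the toy data the worst airport differs between the two months, which is the kind of change in ordering the second part of the task asks about.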