Practice Examples 3 — SOLUTIONS

Task 0

Load dplyr and the flights data into R.

library(dplyr)
# install.packages("nycflights13")  # run once if the package is not installed yet
library(nycflights13)
data(flights)

Read the description of the flights data in the flights section of the package documentation (also available in R via ?flights).

Task 1

Get the tail numbers (tailnum) of the three planes with the largest overall delay (departure delay plus arrival delay) relative to the distance flown. Try to accomplish this with one chain.

flights %>%
    mutate(
        delay = dep_delay + arr_delay,
        relative_delay = delay / distance
    ) %>%
    arrange(desc(relative_delay)) %>%
    select(tailnum, relative_delay) %>%
    head(3)
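
As an alternative to arrange() followed by head(), the same "top n" step can be sketched with slice_max(). This is only an illustration: it assumes dplyr 1.0.0 or later, and the toy data frame below is made up so that the result is easy to check by hand. Note that slice_max() keeps ties by default, so it can return more than n rows.

```r
library(dplyr)

# Hypothetical stand-in for flights; all values are invented for illustration
toy <- tibble(
    tailnum   = c("N1", "N2", "N3", "N4"),
    dep_delay = c(10, 120, NA, 30),
    arr_delay = c(20, 100, 15, 30),
    distance  = c(100, 200, 300, 50)
)

top <- toy %>%
    mutate(relative_delay = (dep_delay + arr_delay) / distance) %>%
    filter(!is.na(relative_delay)) %>%     # drop flights with a missing delay
    slice_max(relative_delay, n = 2) %>%   # two largest values, largest first
    select(tailnum, relative_delay)

top
```

Here N4 (60 / 50 = 1.2) comes first and N2 (220 / 200 = 1.1) second; N3 is dropped because its departure delay is missing.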

Task 2

Which airport do the most flights to Boston (BOS) depart from?

flights %>%
    filter(dest == "BOS") %>%
    group_by(origin) %>%
    summarise(n=n()) %>%
    arrange(desc(n))
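
The group_by() / summarise(n = n()) / arrange(desc(n)) pattern can also be written more compactly with count(). The sketch below uses a small made-up data frame instead of flights, so the origins and counts are illustrative only.

```r
library(dplyr)

# Hypothetical stand-in for flights; rows are invented for illustration
toy <- tibble(
    origin = c("JFK", "LGA", "JFK", "EWR", "JFK", "LGA"),
    dest   = c("BOS", "BOS", "BOS", "BOS", "ORD", "ORD")
)

by_origin <- toy %>%
    filter(dest == "BOS") %>%
    count(origin, sort = TRUE)   # one row per origin, largest count first

by_origin
```

In this toy data JFK has two flights to BOS and the other two airports have one each, so JFK is listed first.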

Task 3

Which airport has the largest average departure delay? (Note: be careful with missing values; it is better to exclude them when calculating an average.) Is the ordering the same in every month?
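
To see why the note about missing values matters, here is a minimal base R sketch with made-up numbers: a single NA makes the whole mean NA unless na.rm = TRUE is set.

```r
# Invented departure delays in minutes; NA stands in for a missing value
delays <- c(5, NA, 25)

mean_with_na    <- mean(delays)                 # NA: the missing value propagates
mean_without_na <- mean(delays, na.rm = TRUE)   # 15: mean of 5 and 25

mean_with_na
mean_without_na
```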

flights %>%
    group_by(origin) %>%
    summarise(delay = mean(dep_delay, na.rm=TRUE)) %>%
    arrange(desc(delay))
# Repeat the comparison within each month and keep the airport with the largest average delay
flights %>%
    group_by(origin, month) %>%
    summarise(delay = mean(dep_delay, na.rm=TRUE)) %>%
    group_by(month) %>%
    mutate(max_delay = max(delay)) %>%
    filter(delay == max_delay) %>%
    select(month, origin) %>%
    arrange(month)