Practice Examples 3 — SOLUTIONS
Task 0
Load dplyr and the flights data into R.
library(dplyr)
# install.packages("nycflights13")  # run once if the package is not installed
library(nycflights13)
data(flights)
Look at the description of the flights data here in the flights section.
Task 1
Get the tail number (tailnum) of the three planes that arrived with the largest
overall delay (departure delay plus arrival delay) relative to the distance flown.
Try to accomplish this with one chain.
flights %>%
  mutate(
    delay = dep_delay + arr_delay,
    relative_delay = delay / distance
  ) %>%
  arrange(desc(relative_delay)) %>%
  select(tailnum, relative_delay) %>%
  head(3)
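To check the logic of this chain by hand, it can help to run it on a tiny table first. The sketch below uses a made-up tibble called `toy` (hypothetical data, not part of nycflights13) with the same column names. It also illustrates that `arrange(desc(...))` places rows with a missing delay last, so they do not end up among the top three.

```r
library(dplyr)

# Hypothetical toy data, small enough to verify by hand
toy <- tibble(
  tailnum   = c("A1", "B2", "C3", "D4"),
  dep_delay = c(10, 60, NA, 5),
  arr_delay = c(20, 40, 30, 5),
  distance  = c(100, 1000, 200, 10)
)

top <- toy %>%
  mutate(
    delay = dep_delay + arr_delay,       # NA if either delay is missing
    relative_delay = delay / distance    # delay in minutes per mile
  ) %>%
  arrange(desc(relative_delay)) %>%      # missing values are sorted last
  select(tailnum, relative_delay) %>%
  head(3)

top
```

Here D4 has a relative delay of 10 / 10 = 1, A1 has 30 / 100 = 0.3, and B2 has 100 / 1000 = 0.1, while C3 is missing and drops out of the top three.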
Task 2
From which airport do the most planes fly to Boston?
flights %>%
  filter(dest == "BOS") %>%
  group_by(origin) %>%
  summarise(n = n()) %>%
  arrange(desc(n))
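As a possible shorthand, dplyr's `count()` with `sort = TRUE` combines the grouping, counting, and descending sort into one step. The sketch below shows this on a made-up tibble `toy` (hypothetical data); it is an alternative spelling of the same idea, not a different method.

```r
library(dplyr)

# Hypothetical toy data: seven flights, six of them to Boston
toy <- tibble(
  origin = c("JFK", "LGA", "JFK", "EWR", "JFK", "LGA", "JFK"),
  dest   = c("BOS", "BOS", "BOS", "BOS", "ORD", "BOS", "BOS")
)

by_origin <- toy %>%
  filter(dest == "BOS") %>%
  count(origin, sort = TRUE)   # one row per origin, largest n first

by_origin
```

The first row of the result is the airport with the most flights to Boston in the toy data (JFK, with 3).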
Task 3
Which of the airports is associated with the largest departure delays on average? (Note: be careful with missing values; it is better to exclude them when calculating an average.) Is this ordering the same for each month?
flights %>%
  group_by(origin) %>%
  summarise(delay = mean(dep_delay, na.rm = TRUE)) %>%
  arrange(desc(delay))
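The reason for `na.rm = TRUE` can be seen on a three-element vector. The values below are made up for illustration: a single missing value makes the plain mean missing as well, while `na.rm = TRUE` averages only the observed values.

```r
# Hypothetical departure delays, one of them missing
dep_delay <- c(10, NA, 20)

plain_mean   <- mean(dep_delay)                # NA: the missing value propagates
cleaned_mean <- mean(dep_delay, na.rm = TRUE)  # 15: average of 10 and 20

plain_mean
cleaned_mean
```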
flights %>%
  group_by(origin, month) %>%
  summarise(delay = mean(dep_delay, na.rm = TRUE)) %>%
  group_by(month) %>%
  mutate(max_delay = max(delay)) %>%
  filter(delay == max_delay) %>%
  select(month, origin) %>%
  arrange(month)
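The chain above reports only the airport with the largest average delay in each month. To compare the full ordering across months, one option is to rank the airports within each month. The sketch below does this with `min_rank()` on a made-up tibble `toy` (hypothetical data with two months); the column name `rank` is an assumption introduced here for illustration.

```r
library(dplyr)

# Hypothetical toy data: one flight per airport and month
toy <- tibble(
  origin    = rep(c("EWR", "JFK", "LGA"), times = 2),
  month     = rep(c(1, 2), each = 3),
  dep_delay = c(15, 10, 5, 8, 12, 4)
)

ranking <- toy %>%
  group_by(origin, month) %>%
  summarise(delay = mean(dep_delay, na.rm = TRUE), .groups = "drop") %>%
  group_by(month) %>%
  mutate(rank = min_rank(desc(delay))) %>%   # 1 = largest average delay in that month
  ungroup() %>%
  arrange(month, rank)

ranking
```

In the toy data the ordering is EWR, JFK, LGA in month 1 but JFK, EWR, LGA in month 2, so the ordering differs between months. Applying the same chain to flights would show whether that is also the case for the real data.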