gr5069-homework-3-lisayokocarle created by GitHub Classroom

Primary Language: Python

Homework #3

Instructions

Answer the following questions using the F1 data stored on AWS S3, working in Databricks. You may use Pandas, R, or PySpark.

  1. [10 pts] What was the average time each driver spent at the pit stop for each race?
  2. [20 pts] Rank the average time spent at the pit stop in order of who won each race.
  3. [20 pts] Insert the missing code (e.g., ALO for Alonso) for drivers based on the 'drivers' dataset.
  4. [20 pts] Who is the youngest and oldest driver for each race? Create a new column called “Age”
  5. [20 pts] For a given race, which driver has the most wins and losses?
  6. [10 pts] Continue exploring the data by answering your own question.
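
As a starting point for question 1, here is a minimal pandas sketch of a per-race, per-driver average. It runs on a small made-up table rather than the real S3 data, and the column names `raceId`, `driverId`, and `milliseconds` are assumptions about the pit stop dataset's schema; adjust them to match what you actually load.

```python
import pandas as pd

# Toy rows standing in for the pit stop data. The values are invented for
# illustration, and the column names are assumptions about the schema.
pit_stops = pd.DataFrame(
    {
        "raceId": [1, 1, 1, 2, 2],
        "driverId": [10, 10, 20, 10, 20],
        "milliseconds": [22000, 24000, 30000, 21000, 25000],
    }
)

# Group by race and driver, then average the pit stop duration in each group.
avg_pit = (
    pit_stops.groupby(["raceId", "driverId"], as_index=False)["milliseconds"]
    .mean()
    .rename(columns={"milliseconds": "avg_pit_ms"})
)

print(avg_pit)
```

The same grouping idea carries over to PySpark with `groupBy(...).agg(...)` if you are going for the extra points.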

Commit your assignment to your individual GitHub Classroom repo. Your code and git commits should follow the basic principles we have discussed so far.

Extra points for using PySpark!