Flight Data Analysis

Overview

The Flight Data Analysis project queries flight and passenger datasets to extract insights from them. The operations used include aggregation, filtering, mapping, and reducing. The project aims to provide a better understanding of the datasets and follows a functional programming style, with the goal of reducing computational load and improving efficiency.
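
As a rough illustration of these operations, the sketch below applies filtering, aggregation, and map-then-reduce to a small in-memory list using plain Scala 2.13 collections. The `Flight` case class, its fields, the sample rows, and the function names are all assumptions made up for this example and are not taken from the project. Spark's `Dataset` API offers similarly named operations (`filter`, `map`, `reduce`), so the same style should carry over, although the project's actual queries may look different.

```scala
// Hypothetical record type: the real datasets' columns may differ.
final case class Flight(passengerId: Int, flightId: Int, from: String, to: String, month: Int)

object FlightSketch {
  // Invented sample rows, used only to exercise the functions below.
  val flights: List[Flight] = List(
    Flight(1, 100, "uk", "fr", 1),
    Flight(2, 100, "uk", "fr", 1),
    Flight(1, 101, "fr", "us", 2),
    Flight(3, 102, "us", "uk", 2),
    Flight(1, 102, "us", "uk", 2)
  )

  // Filtering: keep the records that depart from the given country.
  def departingFrom(country: String, fs: List[Flight]): List[Flight] =
    fs.filter(_.from == country)

  // Aggregation: number of distinct flights in each month.
  def flightsPerMonth(fs: List[Flight]): Map[Int, Int] =
    fs.groupBy(_.month).map { case (month, rows) =>
      month -> rows.map(_.flightId).distinct.size
    }

  // Mapping then reducing: count the flight records of each passenger.
  def flightsPerPassenger(fs: List[Flight]): Map[Int, Int] =
    fs.map(f => f.passengerId -> 1).groupMapReduce(_._1)(_._2)(_ + _)
}
```

Each function is pure: it takes a list and returns a new value without mutating anything, which is the property that makes this style straightforward to test and to move onto a distributed collection.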

Tools

This is an sbt project written in Scala. The Spark API is used heavily for data processing and manipulation.

  • Scala version: 2.13.8
  • Spark version: 3.2.1

More details on the dependencies used can be found in the build.sbt file.
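
For orientation, a build.sbt matching the versions listed above might look roughly like the sketch below. The project name and the choice of Spark modules (`spark-core` and `spark-sql`) are assumptions; the project's own build.sbt remains the authoritative source.

```scala
// build.sbt (sketch only; the project's actual file is authoritative)
ThisBuild / scalaVersion := "2.13.8"

lazy val root = (project in file("."))
  .settings(
    name := "flight-data-analysis", // hypothetical project name
    libraryDependencies ++= Seq(
      // Assumed modules; the project may depend on others as well.
      "org.apache.spark" %% "spark-core" % "3.2.1",
      "org.apache.spark" %% "spark-sql"  % "3.2.1"
    )
  )
```

The `%%` operator appends the Scala binary version (here `_2.13`) to the artifact name, so the Spark artifacts resolved will match the Scala version set at the top of the file.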

Instructions

  1. Install a JDK.
  2. Install Scala and Spark (for example, with Homebrew).
  3. In IntelliJ (or any other IDE), open the project as an sbt project. The build should start automatically.
  4. Run the main class, which contains the main function.
  5. Run the tests in the MainTest.scala file.