This project aims to de-congest the national highways by analyzing road traffic data from different toll plazas. Each highway is run by a different toll operator, and each operator's IT setup uses a different file format. The task is to collect the data available in these formats and consolidate it into a single file.
In this assignment, we will author an Apache Airflow DAG that will:
- Extract data from a CSV file
- Extract data from a TSV file
- Extract data from a fixed width file
- Transform the data
- Load the transformed data into the staging area
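
The steps above can be sketched in plain Python as follows. This is only a minimal illustration, not the project's actual code: the function names, column indices, character offsets, and the uppercase transformation are all assumptions made for the example. In a real DAG, each function would typically be wrapped in its own Airflow task.

```python
import csv


def extract_csv(path, columns):
    """Keep the given column indices from each row of a comma-separated file."""
    with open(path, newline="") as f:
        return [[row[i] for i in columns] for row in csv.reader(f) if row]


def extract_tsv(path, columns):
    """Keep the given column indices from each row of a tab-separated file."""
    with open(path, newline="") as f:
        return [[row[i] for i in columns]
                for row in csv.reader(f, delimiter="\t") if row]


def extract_fixed_width(path, slices):
    """Cut each line at the given (start, end) character offsets."""
    with open(path) as f:
        return [[line[s:e].strip() for s, e in slices]
                for line in f if line.strip()]


def consolidate(*parts):
    """Join the extracted pieces side by side, one output row per input row."""
    return [[value for piece in pieces for value in piece]
            for pieces in zip(*parts)]


def transform(rows, upper_index):
    """Hypothetical transformation: uppercase one column of every row."""
    return [[v.upper() if i == upper_index else v for i, v in enumerate(row)]
            for row in rows]


def load(rows, out_path):
    """Write the transformed rows to a CSV file in the staging area."""
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
```

Note that `consolidate` relies on `zip`, so it assumes all three source files list the same records in the same order and silently stops at the shortest input.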
This project is described in detail in a Medium article, which provides a comprehensive explanation of the code.
Read the article on Medium: