Creating-Streaming-Data-Pipelines-using-Airflow

Scenario

This project aims to decongest the national highways by analyzing road traffic data from different toll plazas. Each highway is run by a different toll operator whose IT setup uses a different file format. The task is to collect the data available in these formats and consolidate it into a single file.

Objectives

In this assignment we will author an Apache Airflow DAG that will:

  • Extract data from a CSV file
  • Extract data from a TSV file
  • Extract data from a fixed-width file
  • Transform the data
  • Load the transformed data into the staging area
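
The steps listed above can be sketched in plain Python, independent of Airflow. This is only an illustration, not the project's actual code: the sample records, column positions, character spans, and the uppercase transformation are all assumptions made for the example, since this README does not describe the real file layouts. In a DAG, each function could serve as the callable of one task.

```python
import csv
import io
from itertools import chain

# Hypothetical sample records; the real toll data layout is not given here.
CSV_TEXT = "1,2021-08-01 10:00,car\n2,2021-08-01 10:05,truck\n"
TSV_TEXT = "1\t4\tPLZ01\n2\t6\tPLZ02\n"
FIXED_TEXT = "1 CASH VC001\n2 CARD VC002\n"


def extract_csv(text, columns):
    """Keep the chosen column indexes from comma-separated text."""
    return [[row[i] for i in columns] for row in csv.reader(io.StringIO(text))]


def extract_tsv(text, columns):
    """Keep the chosen column indexes from tab-separated text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [[row[i] for i in columns] for row in reader]


def extract_fixed_width(text, spans):
    """Slice each non-empty line at the given (start, end) character spans."""
    return [
        [line[start:end].strip() for start, end in spans]
        for line in text.splitlines()
        if line.strip()
    ]


def consolidate(*sources):
    """Join row-aligned sources side by side into single wide rows."""
    return [list(chain.from_iterable(parts)) for parts in zip(*sources)]


def transform(rows, column):
    """Assumed transformation: uppercase one field in every row."""
    result = []
    for row in rows:
        row = list(row)  # copy so the input rows are left unchanged
        row[column] = row[column].upper()
        result.append(row)
    return result


def load(rows, target):
    """Write the rows as CSV to a file-like staging target."""
    csv.writer(target, lineterminator="\n").writerows(rows)


if __name__ == "__main__":
    combined = consolidate(
        extract_csv(CSV_TEXT, [0, 1, 2]),
        extract_tsv(TSV_TEXT, [1, 2]),
        extract_fixed_width(FIXED_TEXT, [(2, 6), (7, 12)]),
    )
    staging = io.StringIO()  # stands in for a file in the staging area
    load(transform(combined, 2), staging)
    print(staging.getvalue(), end="")
```

Keeping each step as a small function that takes and returns plain rows makes the steps easy to test on their own before they are wired into a scheduler.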

Description

This project is described in detail in a Medium article, which provides a comprehensive explanation of the code.

Read the article on Medium: