Flight price evolution

This is a pet project that gathers flight prices from airline APIs and stores the minimum price per route within a time window.

The project has two main modules. The following diagram shows the workflow:

Components diagram

  • flights-scraper: gathers the prices from different airlines and sends them to Kafka in a common format.
  • flights-streams: reads the prices from Kafka and keeps the minimum price per route for a given time window.
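
The two modules can be pictured with a minimal Python sketch. The `Price` record, its field names, and the `to_message` and `min_price_per_window` helpers are illustrative assumptions, not the project's actual message schema or streaming code (flights-streams itself is written in Scala). The sketch only shows a common price format serialized as JSON and a tumbling-window minimum per route.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Price:
    # Hypothetical common format; the real field names may differ.
    airline: str
    origin: str
    destination: str
    amount: float
    timestamp: int  # seconds since the epoch


def to_message(price: Price) -> bytes:
    """Serialize a price as a JSON payload a scraper could send to Kafka."""
    return json.dumps(asdict(price)).encode("utf-8")


def min_price_per_window(prices, window_seconds):
    """Return the cheapest price per (window start, origin, destination)."""
    cheapest = {}
    for p in prices:
        # Tumbling window: align the timestamp to the start of its window.
        window_start = p.timestamp - p.timestamp % window_seconds
        key = (window_start, p.origin, p.destination)
        if key not in cheapest or p.amount < cheapest[key].amount:
            cheapest[key] = p
    return cheapest


prices = [
    Price("airline-a", "PMI", "BCN", 45.0, 30),
    Price("airline-b", "PMI", "BCN", 39.5, 200),
    Price("airline-a", "PMI", "BCN", 52.0, 650),
]
result = min_price_per_window(prices, window_seconds=600)
```

With a 600-second window, the first two prices fall into the same window and only the cheaper one is kept, while the third price starts a new window.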

Download this repository

git clone https://github.com/d1eg0/flights-price-evolution.git
cd flights-price-evolution

Install flights-scraper module

Install the flights-scraper Python package:

cd flights-scraper
python setup.py install

Docker

docker-compose up --build

Troubleshooting

Clean services and remove MongoDB data:

docker-compose rm -svf
rm -rf .db

and run docker-compose again.

Local

Install Kafka and MongoDB

Configure Kafka and MongoDB

Once the server is up, create the Kafka topic flights:

./kafka-topics.sh --create --bootstrap-server 0.0.0.0:9092 --replication-factor 1 --partitions 1 --topic flights

Create the prices collection in the flights database in MongoDB, along with the index that makes updates faster:

cd scripts
mongo < flights.js
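
One plausible reason an index speeds up updates is that each incoming price is written as an upsert keyed by route and window. The sketch below only builds the filter and update documents as plain Python dictionaries. The field names (`origin`, `destination`, `window_start`, `price`) and the use of MongoDB's `$min` operator are assumptions for illustration; the actual layout is defined by scripts/flights.js and the flights-streams code.

```python
def min_price_upsert(origin, destination, window_start, amount):
    """Build the filter and update documents for a keep-the-minimum upsert."""
    # Hypothetical document layout: one document per route and window.
    query = {
        "origin": origin,
        "destination": destination,
        "window_start": window_start,
    }
    # $min only overwrites the stored value when the new amount is lower.
    update = {"$min": {"price": amount}}
    return query, update


# Index keys matching the filter above, in pymongo's (field, direction) form.
index_keys = [("origin", 1), ("destination", 1), ("window_start", 1)]

query, update = min_price_upsert("PMI", "BCN", 0, 39.5)
# With pymongo installed, these could be applied roughly as:
#   collection.create_index(index_keys)
#   collection.update_one(query, update, upsert=True)
```

Because the filter fields match the index keys, MongoDB could locate the target document without scanning the whole collection.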

Run scraper and stream processes

Run flights-streams to process incoming prices in 10-minute windows:

cd flights-streams
sbt run

Run with a different time window:

sbt 'run --window-duration "4 hours"'
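
To illustrate how a value such as "4 hours" maps to a window length, here is a minimal Python sketch. The `parse_window_duration` helper and its unit table are hypothetical; flights-streams does its own parsing in Scala and may accept different units or formats.

```python
import re
from datetime import timedelta

# Hypothetical unit table; the units accepted by flights-streams may differ.
_UNITS = {
    "second": "seconds", "seconds": "seconds",
    "minute": "minutes", "minutes": "minutes",
    "hour": "hours", "hours": "hours",
    "day": "days", "days": "days",
}


def parse_window_duration(text: str) -> timedelta:
    """Parse strings such as "10 minutes" or "4 hours" into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s+([a-z]+)\s*", text.lower())
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unsupported window duration: {text!r}")
    amount, unit = int(match.group(1)), _UNITS[match.group(2)]
    return timedelta(**{unit: amount})


window = parse_window_duration("4 hours")
```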

Run flights-scraper to gather prices:

python -m scraper.run --interval 3600 --origins PMI --destinations BCN MAD VLC
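
The flags in the command above can be read as follows. This is a sketch with argparse, not the scraper's actual argument handling: the `parse_args` helper is hypothetical, the interval is assumed to be in seconds, and combining every origin with every destination into routes is an assumption about how the scraper uses the two lists.

```python
import argparse
from itertools import product


def parse_args(argv):
    """Parse the flags shown in the README command."""
    parser = argparse.ArgumentParser(prog="scraper.run")
    parser.add_argument("--interval", type=int,
                        help="pause between scraping rounds (assumed seconds)")
    parser.add_argument("--origins", nargs="+",
                        help="origin airport codes")
    parser.add_argument("--destinations", nargs="+",
                        help="destination airport codes")
    return parser.parse_args(argv)


args = parse_args([
    "--interval", "3600",
    "--origins", "PMI",
    "--destinations", "BCN", "MAD", "VLC",
])
# Assumed behavior: every origin is paired with every destination.
routes = list(product(args.origins, args.destinations))
```

Under these assumptions, the example command would cover three routes from PMI and repeat every 3600 seconds.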

Run tests

flights-scraper:

cd flights-scraper
pytest

flights-streams:

cd flights-streams
sbt test