The Weather ETL (Extract, Transform, Load) project collects weather data for Cairo, Egypt from the OpenWeather API, performs basic transformations, and loads the data into a PostgreSQL database. The ETL process is automated using Apache Airflow.
- Data Extraction: Fetches 3-day weather forecast data for Cairo from the OpenWeather API.
- Data Transformation: Converts temperatures from Kelvin to Celsius and extracts the date and hour from each timestamp.
- Data Loading: Loads the transformed data into a PostgreSQL database.
- Automation: Uses Apache Airflow to schedule and manage the ETL process.
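The extraction and transformation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the data comes from OpenWeather's 5-day / 3-hour forecast endpoint (which reports Kelvin by default), and the names `build_forecast_url`, `kelvin_to_celsius`, and `transform_entry` are hypothetical. Timestamps are interpreted as UTC here; the real pipeline may use Cairo local time instead.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumption: the project uses OpenWeather's 5-day / 3-hour forecast endpoint.
FORECAST_URL = "https://api.openweathermap.org/data/2.5/forecast"


def build_forecast_url(api_key, city="Cairo,EG", days=3):
    """Build a request URL covering `days` days of 3-hourly forecasts."""
    params = {
        "q": city,
        "appid": api_key,
        "cnt": days * 8,  # 8 three-hour forecast steps per day
    }
    return f"{FORECAST_URL}?{urlencode(params)}"


def kelvin_to_celsius(kelvin):
    """Convert a Kelvin temperature to Celsius, rounded to 2 decimals."""
    return round(kelvin - 273.15, 2)


def transform_entry(entry):
    """Flatten one forecast entry into a row with date, hour, and Celsius temp."""
    ts = datetime.fromtimestamp(entry["dt"], tz=timezone.utc)
    return {
        "date": ts.date().isoformat(),
        "hour": ts.hour,
        "temp_c": kelvin_to_celsius(entry["main"]["temp"]),
    }


# Hand-written sample shaped like one item of the response's "list" array.
sample = {"dt": 1700049600, "main": {"temp": 293.15}}
row = transform_entry(sample)
print(row)
```

Keeping the transformation as a pure function of one forecast entry makes it easy to check without calling the API or touching the database.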
- Python 3.7+ - Ensure Python is installed.
- PostgreSQL - Install PostgreSQL and set up a database.
- Apache Airflow - Install Airflow and initialize it with the default configuration.
- Clone the Repository
    git clone https://github.com/1abdelhalim/Weather-ETL.git
    cd Weather-ETL
- Install Dependencies
    pip install -r requirements.txt
- Initialize Airflow
    airflow db init
    airflow users create \
        --username admin \
        --firstname Admin \
        --lastname User \
        --role Admin \
        --email admin@example.com \
        --password admin
- Start Airflow (run each command in its own terminal)
    airflow webserver --port 8080
    airflow scheduler
- Run ETL Manually
    python main.py
- Automated ETL via Airflow
The ETL process is scheduled to run daily. You can check the status and manage the DAG through the Airflow web UI at http://localhost:8080.