/Data-Eng--Github-API-ETL-pipeline

Extract data from GitHub via its API and load it into SQLite and BigQuery for easy access across the team

Primary Language: Jupyter Notebook

Data-Engineering-project

In this project, I extracted data (GitHub name and repo amount) from GitHub via its API and loaded it into SQLite and BigQuery for easy access across the team.

This project is a simple data engineering demo: it extracts data from a source, builds an ETL pipeline, and loads the data into a data warehouse (BigQuery).

Tools: Python, GitHub API, Pandas, GBQ, SQLAlchemy
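
The extract, transform, and load steps described above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. The function names, the `github_users` table name, and the `github_name` / `repo_amount` column names are hypothetical. It assumes the data comes from the public `GET /users/{username}` endpoint of the GitHub REST API, whose response includes the `login` and `public_repos` fields. To keep the sketch small it uses the standard-library `urllib` and a plain `sqlite3` connection, which pandas accepts directly; a SQLAlchemy engine could be passed to `to_sql` instead. The BigQuery step assumes the `pandas-gbq` package and configured Google Cloud credentials.

```python
import json
import sqlite3
import urllib.request

import pandas as pd

# Assumed endpoint: public user profile in the GitHub REST API.
GITHUB_USER_URL = "https://api.github.com/users/{username}"


def extract_user(username):
    """Extract: fetch one user's public profile as a dict."""
    request = urllib.request.Request(
        GITHUB_USER_URL.format(username=username),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def transform(profiles):
    """Transform: keep only the GitHub name and the public repo count."""
    rows = [
        {"github_name": p["login"], "repo_amount": p["public_repos"]}
        for p in profiles
    ]
    return pd.DataFrame(rows, columns=["github_name", "repo_amount"])


def load_to_sqlite(df, connection, table="github_users"):
    """Load: write the frame to a SQLite table, replacing any old copy."""
    df.to_sql(table, connection, if_exists="replace", index=False)


def load_to_bigquery(df, project_id, destination_table):
    """Load: write the frame to BigQuery (needs pandas-gbq and credentials)."""
    import pandas_gbq  # imported lazily so the SQLite path works without it

    pandas_gbq.to_gbq(
        df, destination_table, project_id=project_id, if_exists="replace"
    )


def run_pipeline(usernames, connection):
    """Run extract -> transform -> SQLite load and return the frame."""
    profiles = [extract_user(name) for name in usernames]
    df = transform(profiles)
    load_to_sqlite(df, connection)
    return df
```

Calling `run_pipeline(["octocat"], sqlite3.connect("github.db"))` would fetch one profile and store it locally; the returned frame can then be passed to `load_to_bigquery` with a project ID and a `dataset.table` destination of your own. Unauthenticated GitHub API requests are rate limited, so a real run over many users would likely need an access token.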

Further work:

  • Release the full version of the application, then run and monitor the pipeline
  • Integrate Google Cloud Dataprep

Image: BigQuery (bq)