/crypto-data-pipeline

Data pipeline for cryptocurrency market data using the CoinGecko API

Primary language: Python

Cryptocurrency Data Pipeline

A near-real-time data pipeline for the top 100 cryptocurrencies, built on the CoinGecko API. The pipeline is implemented as a Google Cloud Function that a Google Cloud Scheduler trigger executes every 5 minutes.

graph LR
a[Fetch Crypto Data] --> b[Transform Raw Response] 
b --> c[Insert Clean data to BigQuery]
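
The three stages in the diagram can be wired together as one HTTP-triggered function. The following is a minimal sketch under stated assumptions, not the repository's actual code: the names `run_pipeline`, `make_entry_point`, and `crypto_pipeline` are hypothetical, and the stages are passed in as plain callables so the flow can be exercised without network access or BigQuery credentials.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def run_pipeline(
    fetch: Callable[[], List[Row]],
    transform: Callable[[List[Row]], List[Row]],
    insert: Callable[[List[Row]], Any],
) -> int:
    """Run fetch -> transform -> insert once and return the number of rows inserted."""
    raw = fetch()
    rows = transform(raw)
    insert(rows)
    return len(rows)


def make_entry_point(fetch, transform, insert):
    """Build an HTTP handler of the form f(request) -> (body, status)."""

    def crypto_pipeline(request):  # hypothetical entry-point name
        # The request is ignored: the scheduler's call only acts as a trigger.
        count = run_pipeline(fetch, transform, insert)
        return (f"inserted {count} rows", 200)

    return crypto_pipeline


if __name__ == "__main__":
    # Stub stages standing in for the real CoinGecko call and BigQuery insert.
    sink: List[Row] = []
    handler = make_entry_point(
        fetch=lambda: [{"symbol": "btc", "name": "Bitcoin", "extra": "ignored"}],
        transform=lambda raw: [{"symbol": r["symbol"], "name": r["name"]} for r in raw],
        insert=sink.extend,
    )
    print(handler(request=None))
```

Keeping the stages as separate callables means each one can be tested on its own, and the handler stays a thin wrapper around them.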
| symbol | name | current_price | market_cap | market_cap_rank | fully_diluted_valuation | total_volume | ... |
|---|---|---|---|---|---|---|---|
| btc | Bitcoin | 19494.84 | 374289077997 | 1 | 409756957292 | 23662114103 | ... |
| eth | Ethereum | 1328.6 | 160333583155 | 2 | | 8698392798 | ... |
| usdt | Tether | 1 | 68488007086 | 3 | | 30216495050 | ... |
| bnb | BNB | 274.46 | 44807811887 | 4 | 45312700938 | 44544105 | ... |
| ... | ... | ... | ... | ... | ... | ... | ... |

Download sample crypto data

After fetching the cryptocurrency data from the CoinGecko API, the script transforms the raw response into clean rows that match the schema of the target BigQuery table (dataset.table). The whole pipeline is a single Python 3.8 script hosted in a Cloud Function and triggered every 5 minutes by an HTTP request from Cloud Scheduler.
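
As an illustration of the fetch and transform steps, here is a minimal sketch using only the standard library. It rests on several assumptions that the source does not confirm: that the data comes from CoinGecko's `/coins/markets` endpoint priced in USD, that the target columns are the seven visible in the sample table (the real table has more), and that the column types are as listed in `SCHEMA`. The names `fetch_markets`, `transform`, and `SCHEMA` are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

# Assumed endpoint and query; the source does not say which call or currency it uses.
MARKETS_URL = "https://api.coingecko.com/api/v3/coins/markets"
QUERY = {"vs_currency": "usd", "order": "market_cap_desc", "per_page": 100, "page": 1}

# Column -> Python type, based on the columns visible in the sample table.
# The types are assumptions; int() drops any fractional part.
SCHEMA = {
    "symbol": str,
    "name": str,
    "current_price": float,
    "market_cap": int,
    "market_cap_rank": int,
    "fully_diluted_valuation": int,
    "total_volume": int,
}


def fetch_markets(timeout: float = 10.0) -> List[Dict[str, Any]]:
    """Download the top-100 market listing as a list of dicts (network call)."""
    url = f"{MARKETS_URL}?{urllib.parse.urlencode(QUERY)}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def transform(raw: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only schema columns and coerce types; missing or null values become None."""
    rows = []
    for coin in raw:
        row = {}
        for column, cast in SCHEMA.items():
            value = coin.get(column)
            row[column] = None if value is None else cast(value)
        rows.append(row)
    return rows
```

Rows in this shape (a list of JSON-compatible dicts) could then be handed to a BigQuery client, for example via `Client.insert_rows_json` from the `google-cloud-bigquery` package. That is one possible approach; the source does not say which insert method the script uses.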

*/5 * * * * <link-to-crypto-cloud-function>
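
To make the schedule concrete, the short sketch below expands the minute field `*/5` into the minutes it matches and counts the resulting runs per day. The helper `expand_step_field` is hypothetical and handles only the `*` and `*/N` forms used here, not full cron syntax.

```python
from typing import List


def expand_step_field(field: str, low: int, high: int) -> List[int]:
    """Expand a cron field of the form '*' or '*/N' into the values it matches in [low, high]."""
    if field == "*":
        return list(range(low, high + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        if step <= 0:
            raise ValueError("step must be positive")
        return list(range(low, high + 1, step))
    raise ValueError(f"unsupported field: {field!r}")


minutes = expand_step_field("*/5", 0, 59)
runs_per_day = len(minutes) * len(expand_step_field("*", 0, 23))
print(minutes)       # [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
print(runs_per_day)  # 288
```

The function therefore runs 12 times an hour. If every run inserts all 100 coins, that comes to 28,800 rows per day.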