Change Data Capture (CDC) benefits from Delta Lake

Primary language: Jupyter Notebook

CDC_Deltalake

The purpose of this project is to simulate the ingestion of CSV files into Delta Lake in a Change Data Capture (CDC) scenario. Spark was used to read, process, and save the CSV file in Delta Lake format, making it available for querying. After this flow, the CDC framework correctly recognized the file's modification history, so we can walk through the series of versions and go back in time to an earlier state of the data.
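The flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: it assumes the `pyspark` and `delta-spark` packages are installed, and the file name `people.csv`, the columns `id` and `name`, the sample rows, and the helper names `write_csv` and `run_demo` are all hypothetical. It writes two snapshots of a CSV file into a Delta table, lists the table history, and reads the first version back.

```python
import csv
import os


def write_csv(path, rows):
    """Write rows (dicts with hypothetical columns id, name) to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "name"])
        writer.writeheader()
        writer.writerows(rows)


def run_demo(base_dir):
    """Ingest two snapshots of a CSV into a Delta table, then read version 0."""
    # Imported lazily so write_csv can be used without Spark installed.
    from delta import configure_spark_with_delta_pip
    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession

    builder = (
        SparkSession.builder.appName("cdc_deltalake_sketch")
        .config("spark.sql.extensions",
                "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    )
    spark = configure_spark_with_delta_pip(builder).getOrCreate()

    os.makedirs(base_dir, exist_ok=True)
    csv_path = os.path.join(base_dir, "people.csv")       # hypothetical file
    delta_path = os.path.join(base_dir, "people_delta")   # hypothetical table

    # Version 0: the first snapshot of the file.
    write_csv(csv_path, [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Bruno"}])
    (spark.read.option("header", True).csv(csv_path)
        .write.format("delta").mode("overwrite").save(delta_path))

    # Version 1: the file changes and is ingested again.
    write_csv(csv_path, [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Bruna"},
                         {"id": 3, "name": "Caio"}])
    (spark.read.option("header", True).csv(csv_path)
        .write.format("delta").mode("overwrite").save(delta_path))

    # Each write is recorded as a new version in the table history.
    (DeltaTable.forPath(spark, delta_path).history()
        .select("version", "timestamp", "operation").show(truncate=False))

    # Time travel: read the table as it was at version 0.
    old = spark.read.format("delta").option("versionAsOf", 0).load(delta_path)
    return [row.asDict() for row in old.orderBy("id").collect()]
```

Calling `run_demo("/tmp/cdc_demo")` (the path is only an example) should return the rows of the first snapshot even though the table has since been overwritten. Overwriting the whole table on each ingestion is the simplest choice for a sketch; a merge keyed on `id` would be an alternative when only changed rows arrive.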