datalake-ingestion
There are 7 repositories under the datalake-ingestion topic.
KeeplerIO/de-identification-framework
Application of our De-identification Framework with open source technologies, enabling enterprises to take ownership of the de-identification process and deploy it in trusted environments.
ac-gomes/data_engineer_with_airflow
This project is an adaptation based on a real test for a Junior Data Engineer position.
fabricioasn/EnemInData_TCC_Unicarioca
This repository contains an education-focused data science project that uses a public database from a Brazilian educational research institute on the national high school exam and applies ETL and association rule mining to this dataset.
andyvroberts/smoke
Acquisition of energy industry balancing and settlement calculation data into a data lake
BurakCakan/gcs-data-ingestion
This repo shows how to read and write data from/to Google Cloud Storage with PySpark. The raw data is ingested, transformed, and stored in the data lake in snapshot format.
hannah0wang/end-to-end-data-reporting
End-to-end data reporting project using Azure services like Azure Data Factory for data orchestration, Azure Synapse Analytics for data warehousing, Databricks for data transformations, and Power BI for intuitive data visualization and reporting.