datalake-ingestion

There are 7 repositories under the datalake-ingestion topic.

  • KeeplerIO/de-identification-framework

    Application of our De-identification Framework with open source technologies, enabling enterprises to take ownership of the de-identification process and deploy it in trusted environments.

    Language:Python
  • ac-gomes/data_engineer_with_airflow

    This project is an adaptation based on a real test for a Junior Data Engineer position.

    Language:Python
  • anthager/Anton.Pizza

    Language:JavaScript
  • fabricioasn/EnemInData_TCC_Unicarioca

    This repository is designed for a data science project aimed at education, which uses a public database from a Brazilian educational research institute about the national high school exam and applies ETL and data mining association rules to this dataset.

  • andyvroberts/smoke

    Acquisition of energy industry balancing and settlement calculation data into a data lake.

    Language:C#
  • BurakCakan/gcs-data-ingestion

    This repo is designed to show how to read and write data from/to Google Cloud Storage with PySpark. The raw data is ingested, transformed, and stored in the data lake in snapshot format.

    Language:Python
  • hannah0wang/end-to-end-data-reporting

    End-to-end data reporting project using Azure services: Azure Data Factory for data orchestration, Azure Synapse Analytics for data warehousing, Databricks for data transformations, and Power BI for data visualization and reporting.

    Language:Jupyter Notebook