deltalake

There are 55 repositories under the deltalake topic.

  • databricks_delta_table_samples

    This is a code sample repository for demonstrating how to perform Databricks Delta Table operations.

    Language:HTML
  • automatic-happiness

    A demo repository for integrating a third-party data source (e.g. a data platform exposing its data via APIs) with Apache Superset via Delta Lake

    Language:Jupyter Notebook
  • EMR_Studio_Delta_Lake

    Delta Lake examples designed to run on AWS Elastic MapReduce (EMR) via EMR Studio or EMR Notebooks

    Language:Jupyter Notebook
  • ifood-data

    Ifood data wrangling with Apache Airflow and Apache Spark running on Kubernetes

    Language:Python
  • delta-lake-dms-cdc

    Example application for DMS CDC with Delta Lake and Apache Hudi

    Language:Python
  • treinamento-dataproc-deltalake

    Training environment for Dataproc and Delta Lake

    Language:Jupyter Notebook
  • wideworldadventure

    This repository includes all the files that make up the design and unification of the AdventureWorks and WideWorldAdventure project databases.

    Language:Shell
  • rust_nextstep

    A series of exercises to play with more advanced topics in Rust

    Language:Rust
  • glue-docker-image

    A custom Glue Docker image

    Language:Dockerfile
  • Deltalake

    Data engineering project for acquiring data, building a Delta Lake with Python, and running analyses with Apache Spark

    Language:Jupyter Notebook
  • flight-ml-preprocess-gcp

    Continuous flight event data processing using Spark Streaming and Delta Lake storage, deployed on a GCP Dataproc cluster.

    Language:Python
  • Formula1

    Formula1 ADF pipeline

    Language:Python
  • dataops

    Small data pipeline with Airflow scheduling

    Language:Jupyter Notebook
  • lambda-delta-optimize

    AWS Lambda function for optimizing Delta tables

    Language:HCL
  • taxacco

    Project No. 4 for the "Data Engineer" course.

    Language:Jupyter Notebook
  • Databricks-AWS

    Databricks provides a unified, open platform for all your data. It empowers data scientists, data engineers and data analysts with a simple collaborative environment to run interactive and scheduled data analysis workloads.

    Language:Python
  • datastack-playground

    A datastack playground; includes Spark, Kafka, Airbyte, etc.

    Language:Jupyter Notebook
  • OpenTableFormat.github.io

    Website for open table format 🕸

    Language:CSS
  • Data-Scientist-learning-path-using-databricks

    A summary of learning data science using Databricks