e01n0's Stars
unitycatalog/unitycatalog
Open, Multi-modal Catalog for Data & AI
modin-project/modin
Modin: Scale your Pandas workflows by changing a single line of code
databrickslabs/ucx
Automated migrations to Unity Catalog
MrPowers/mack
Delta Lake helper methods in PySpark
databricks/mlops-stacks
This repo provides a customizable stack for starting new ML projects on Databricks that follow production best practices out of the box.
Nike-Inc/brickflow
Pythonic Programming Framework to orchestrate jobs in Databricks Workflow
NewDay-Data/scilint
🧐 scilint: infuse quality into notebook-based workflows with a new type of build tool
opentofu/manifesto
The OpenTF Manifesto expresses concern over HashiCorp's switch of the Terraform license from open-source to the Business Source License (BSL) and calls for the tool's return to a truly open-source license.
anawinter53/Learnify
Group project at La Fosse Academy creating a quiz/flashcard website
sparkutils/quality
A Quality Spark DQ Library
souvik-databricks/dlt-with-debug
A lightweight helper utility that lets developers build pipelines interactively by sharing one source file between DLT runs and non-DLT interactive notebook runs.
databrickslabs/overwatch
Capture deep metrics on one or all assets within a Databricks workspace
thestr4ng3r/chiaki
Moved to https://git.sr.ht/~thestr4ng3r/chiaki - Free and Open Source PS4 Remote Play Client
harupy/dbvim
Enable Vim on Databricks
cube2222/octosql
OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.
tobymao/sqlglot
Python SQL Parser and Transpiler
Gousto/alaspark
ÀLaSpark: Gousto's recipe for building PySpark pipelines at scale
orchest/orchest
Build data pipelines, the easy way 🛠️
kwai/blaze
Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.
CoxAutomotiveDataSolutions/waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
MrPowers/chispa
PySpark test helper methods with beautiful error messages
wnkz/aws-sso
Command Line tool for AWS SSO Credentials
louisguitton/dbt-metadata-utils
Parse dbt artifacts and search dbt models with Algolia
PRQL/prql
PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
sachaos/todoist
Todoist CLI Client. I ❤️ Todoist and CLI.
delta-io/delta-rs
A native Rust library for Delta Lake, with bindings into Python
markthebault/pii-check-spark
Simple PII data check using PySpark
weld-project/weld
High-performance runtime for data analytics applications
roapi/roapi
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
rajasekarv/vega
A new, arguably faster implementation of Apache Spark from scratch in Rust