Databricks
Helping data teams solve the world’s toughest problems using data and AI
United States of America
Pinned Repositories
click
The "Command Line Interactive Controller for Kubernetes"
dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
jsonnet-style-guide
Databricks Jsonnet Coding Style Guide
koalas
Koalas: pandas API on Apache Spark
learning-spark
Example code from Learning Spark book
LearningSparkV2
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
megablocks
scala-style-guide
Databricks Scala Coding Style Guide
spark-deep-learning
Deep Learning Pipelines for Apache Spark
Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository
Databricks's Repositories
databricks/pg-text-query
Helpers for generating Postgres queries from text.
databricks/unity-catalog-setup
Notebooks, Terraform, and tools for setting up Unity Catalog
databricks/security-bucket-brigade
databricks/pig-on-spark
Proof-of-concept implementation of Pig-on-Spark integrated at the logical node level
databricks/workflows-examples
databricks/public-data
A directory of Public Data Sources on bit.io
databricks/cloud-latency-map
databricks/python-bitdotio
Python SDK for bit.io
databricks/terraform-databricks-mlops-aws-project
This module creates and configures service principals with appropriate permissions and entitlements to run CI/CD for a project, and creates a workspace directory as a container for project-specific resources for the Databricks AWS staging and prod workspaces.
databricks/terraform-databricks-mlops-aws-infrastructure
This module sets up a multi-workspace model registry between a Databricks AWS development (dev) workspace, staging workspace, and production (prod) workspace, allowing READ access from dev/staging workspaces to staging & prod model registries.
databricks/terraform-databricks-mlops-azure-project-with-sp-creation
This module creates and configures service principals with appropriate permissions and entitlements to run CI/CD for a project, and creates a workspace directory as a container for project-specific resources for the Azure Databricks staging and prod workspaces. It also creates the relevant Azure Active Directory (AAD) applications for the service principals.
databricks/iceberg-aws-ext
databricks/mfg_dlt_workshop
DLT Manufacturing Workshop
databricks/lumin
Repository for code related to Lumin migration
databricks/bitdotio-csharp-example
An example of using C# with bit.io
databricks/bitdotio-golang-example
An example of using Go with bit.io
databricks/terraform-databricks-mlops-azure-infrastructure-with-sp-creation
This module sets up a multi-workspace model registry between an Azure Databricks development (dev) workspace, staging workspace, and production (prod) workspace, allowing READ access from dev/staging workspaces to staging & prod model registries. It also creates the relevant Azure Active Directory (AAD) applications for the service principals.
databricks/terraform-databricks-mlops-azure-infrastructure-with-sp-linking
This module sets up a multi-workspace model registry between an Azure Databricks development (dev) workspace, staging workspace, and production (prod) workspace, allowing READ access from dev/staging workspaces to staging & prod model registries. It also links pre-existing Azure Active Directory (AAD) applications to the service principals.
databricks/terraform-databricks-mlops-azure-project-with-sp-linking
This module creates and configures service principals with appropriate permissions and entitlements to run CI/CD for a project, and creates a workspace directory as a container for project-specific resources for the Azure Databricks staging and prod workspaces. It also links pre-existing Azure Active Directory (AAD) applications to the service principals.
databricks/bitdotio-java-example
An example of using Java & JDBC with bit.io
databricks/brew
🍺 The missing package manager for macOS (or Linux)
databricks/btrace
BTrace - a safe, dynamic tracing tool for the Java platform
databricks/cloudflare-b2-proxy
Proxy Backblaze S3 compatible API requests, optionally sending notifications to a webhook
databricks/consent-manager-bitio
databricks/m3db-operator
Kubernetes operator for M3DB
databricks/pg_query_go
Go library to parse and normalize SQL queries using the PostgreSQL query parser
databricks/pglast
PostgreSQL Languages AST and statements prettifier: master branch covers PG10, v2 branch covers PG12
databricks/postgres-range
Range data type parser and serializer for PostgreSQL
databricks/tailscale
The easiest, most secure way to use WireGuard and 2FA.
databricks/upickle
uPickle: a simple, fast, dependency-free JSON & Binary (MessagePack) serialization library for Scala