Databricks
Helping data teams solve the world’s toughest problems using data and AI
United States of America
Pinned Repositories
click
The "Command Line Interactive Controller for Kubernetes"
dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
jsonnet-style-guide
Databricks Jsonnet Coding Style Guide
koalas
Koalas: pandas API on Apache Spark
learning-spark
Example code from Learning Spark book
LearningSparkV2
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
megablocks
scala-style-guide
Databricks Scala Coding Style Guide
spark-deep-learning
Deep Learning Pipelines for Apache Spark
Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository
Databricks's Repositories
databricks/LearningSparkV2
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
databricks/tensorframes
[DEPRECATED] TensorFlow wrapper for DataFrames on Apache Spark
databricks/devrel
This repository contains the notebooks and presentations we use for our Databricks Tech Talks
databricks/reference-apps
Spark reference applications
databricks/mlops-stacks
This repo provides a customizable stack for starting new ML projects on Databricks that follow production best practices out of the box.
databricks/spark-xml
XML data source for Spark SQL and DataFrames
databricks/sjsonnet
databricks/iceberg-kafka-connect
databricks/tmm
databricks/containers
Sample base images for Databricks Container Services
databricks/iceberg-rest-image
Simple project to expose a catalog over REST using a Java catalog backend
databricks/notebook-best-practices
An example showing how to apply software engineering best practices to Databricks notebooks.
databricks/tpcds-kit
TPC-DS benchmark kit with some modifications/fixes
databricks/congruity
The goal of this library is to provide a compatibility layer that makes it easier to adopt Spark Connect. Once imported in your application, the library monkey-patches the existing API to provide the legacy functionality.
databricks/sqltools-databricks-driver
SQLTools driver for Databricks SQL
databricks/databricks-sqlalchemy
See PECO-1396 for more details about this repository.
databricks/databricks-dbutils-scala
The Scala SDK for Databricks.
databricks/databricks-repos-proxy
databricks/rules_docker
Rules for building and handling Docker images with Bazel
databricks/pex
Fork of pantsbuild/pex with a few Databricks-specific changes
databricks/terraform-provider-tabular
databricks/arcion-docs
databricks/bazel
Correct, reproducible, and fast builds for everyone.
databricks/fastar
databricks/vector
A high-performance observability data pipeline.
databricks/cluster-api-provider-gcp-1
The GCP provider implementation for Cluster API
databricks/data-sharing-materialization-orchestration
databricks/data-classification-review-app
databricks/jarjar-abrams
An experimental Scala extension of Jar Jar Links
databricks/thanos-receive-controller
Kubernetes controller to automatically configure Thanos receive hashrings