Databricks
Helping data teams solve the world’s toughest problems using data and AI
United States of America
Pinned Repositories
click
The "Command Line Interactive Controller for Kubernetes"
dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
jsonnet-style-guide
Databricks Jsonnet Coding Style Guide
koalas
Koalas: pandas API on Apache Spark
learning-spark
Example code from Learning Spark book
LearningSparkV2
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
megablocks
scala-style-guide
Databricks Scala Coding Style Guide
spark-deep-learning
Deep Learning Pipelines for Apache Spark
Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository
Databricks's Repositories
databricks/koalas
Koalas: pandas API on Apache Spark
databricks/scala-style-guide
Databricks Scala Coding Style Guide
databricks/dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
databricks/lilac
Curate better data for LLMs
databricks/tensorframes
[DEPRECATED] TensorFlow wrapper for DataFrames on Apache Spark
databricks/reference-apps
Spark reference applications
databricks/spark-xml
XML data source for Spark SQL and DataFrames
databricks/databricks-ml-examples
databricks/notebook-best-practices
An example showing how to apply software engineering best practices to Databricks notebooks.
databricks/pgsqlite
Load SQLite databases into Postgres databases
databricks/tpcds-kit
TPC-DS benchmark kit with some modifications/fixes
databricks/spark-tfocs
A Spark port of TFOCS: Templates for First-Order Conic Solvers (cvxr.com/tfocs)
databricks/databricks-asset-bundles-dais2023
databricks/run-notebook
databricks/sqltools-databricks-driver
SQLTools driver for Databricks SQL
databricks/files_in_repos
databricks/rules_docker
Rules for building and handling Docker images with Bazel
databricks/terraform-provider-tabular
databricks/tabular-sdk-go
Golang SDK for interacting with the Tabular API
databricks/arcion-docs
databricks/vector
A high-performance observability data pipeline.
databricks/cluster-api-provider-aws-1
Kubernetes Cluster API Provider AWS provides consistent deployment and day 2 operations of "self-managed" and EKS Kubernetes clusters on AWS.
databricks/data-sharing-materialization-orchestration
databricks/kdd24-forecasting-anomaly-detection
databricks/buildtools
A Bazel BUILD file formatter and editor
databricks/tests-1
Kata Containers tests, CI, and metrics
databricks/cluster-api-1
Home for Cluster API, a subproject of sig-cluster-lifecycle
databricks/data
A repository to stuff data and loader scripts for Tabular demos
databricks/trino-container
A repository for building images of upcoming Trino features
databricks/YCSB
Yahoo! Cloud Serving Benchmark