Databricks
Helping data teams solve the world’s toughest problems using data and AI
United States of America
Pinned Repositories
click
The "Command Line Interactive Controller for Kubernetes"
dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
jsonnet-style-guide
Databricks Jsonnet Coding Style Guide
koalas
Koalas: pandas API on Apache Spark
learning-spark
Example code from Learning Spark book
LearningSparkV2
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
megablocks
scala-style-guide
Databricks Scala Coding Style Guide
spark-deep-learning
Deep Learning Pipelines for Apache Spark
Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository
Databricks's Repositories
databricks/learning-spark
Example code from Learning Spark book
databricks/click
The "Command Line Interactive Controller for Kubernetes"
databricks/spark-redshift
Redshift data source for Apache Spark
databricks/databricks-cli
(Legacy) Command Line Interface for Databricks
databricks/terraform-databricks-lakehouse-blueprints
A set of Terraform automation templates and quickstart demos to jumpstart the design of a Lakehouse on Databricks. The project incorporates best practices from the industries we work with and delivers composable modules for building a workspace that complies with the highest platform security and governance standards.
databricks/databricks-sql-cli
CLI for querying Databricks SQL
databricks/python-interview
Databricks Python interview setup instructions
databricks/upload-dbfs-temp
databricks/databricks-ttyd
databricks/hive-metastore
Apache Hive Metastore as a Standalone server in Docker
databricks/notebook_gallery
databricks/dbt-tabular
Repository for the dbt ❤️ Tabular blogpost
databricks/docker-dev
Arcion Demo Kit for testing database to database replication
databricks/node-bitdotio
Node SDK for bit.io
databricks/async-file-io
databricks/dockerize
Utility to simplify running applications in docker containers
databricks/arcion-loadgen
databricks/flink-connector-elasticsearch
Apache Flink connector for Elasticsearch
databricks/iceberg-tutorials
databricks/demokit.gtihub.io
Arcion Demo Kit docs
databricks/api-linter
A linter for APIs defined in protocol buffers.
databricks/benchbase
Multi-DBMS SQL Benchmarking Framework via JDBC
databricks/hadoop-thirdparty
Apache Hadoop Thirdparty
databricks/jsonschema
An implementation of the JSON Schema specification for Python
databricks/jsqsh
Console-based database query tool, featuring command-line editing, piping of output to other programs, and much, much more
databricks/llama-hub-llilac
A library of data loaders for LLMs made by the community -- to be used with GPT Index and/or LangChain
databricks/mssql-docker
Official Microsoft repository for SQL Server in Docker resources
databricks/style-guide
A Vale-compatible implementation of the Microsoft Writing Style Guide extended to Tabular.
databricks/superset
Apache Superset is a Data Visualization and Data Exploration Platform
databricks/vagrant
Vagrant Builds