
AI-Native Data Warehouse. Blazing analytics, fast search, geo insights, vector AI. Built for multimodal analytics. Open-source Snowflake alternative. https://databend.com


Databend

ANY DATA. ANY SCALE. ONE DATABASE.

Multimodal data warehouse for the AI era with Snowflake-compatible SQL

☁️ Try Cloud | 🚀 Quick Start | 📖 Documentation




Why Databend?

Multimodal Data Warehouse: Analyze structured, semi-structured, vector, and geospatial data with unified Snowflake-compatible SQL.

AI-Native Platform: Built-in vector search, AI functions, embedding generation, and full-text search - no separate systems needed.

10x Faster & 90% Cost Reduction: Rust-powered vectorized execution with S3-native storage eliminates vendor lock-in and proprietary overhead.

Deploy Anywhere, Connect Everything: 100% open source - run locally with pip install databend, self-host, or use managed cloud clusters. All instances share the same data seamlessly.

Production Proven: Trusted by world-class enterprises managing 800+ petabytes and 100+ million queries daily.

Enterprise Ready: Fine-grained access control, data masking, and audit logging with complete data sovereignty.

Quick Start

Option 1: Databend Cloud Warehouse (Recommended)

Start with Databend Cloud - Serverless warehouse clusters, production-ready in 60 seconds

Option 2: Local Development with Python

pip install databend

import databend

ctx = databend.SessionContext()

# Local table for quick testing
ctx.sql("CREATE TABLE products (id INT, name STRING, price FLOAT)").collect()
ctx.sql("INSERT INTO products VALUES (1, 'Laptop', 1299.99), (2, 'Phone', 899.50)").collect()
ctx.sql("SELECT * FROM products").show()

# S3 remote table (same as cloud warehouse)
ctx.create_s3_connection("s3", "your_key", "your_secret")
ctx.sql("CREATE TABLE sales (id INT, revenue FLOAT) 's3://bucket/sales/' CONNECTION=(connection_name='s3')").collect()
ctx.sql("SELECT COUNT(*) FROM sales").show()

Option 3: Docker (Self-Host Experience)

docker run -p 8000:8000 datafuselabs/databend

Experience the full warehouse capabilities locally - same features as cloud clusters.
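Once the container is up, SQL can be sent to it over HTTP on the published port. The following is a minimal sketch, not the official client. It assumes the query handler accepts a POST to `/v1/query` with a JSON body of the form `{"sql": "..."}`, and that the image starts with a `root` user and an empty password. The helper names `build_query_request` and `run_query` are hypothetical; check host, port, and credentials against your own deployment.

```python
import base64
import json
import urllib.request


def build_query_request(sql, host="127.0.0.1", port=8000, user="root", password=""):
    """Build (but do not send) an HTTP request carrying one SQL statement.

    Assumption: the server accepts POST /v1/query with a JSON body
    {"sql": "..."} and HTTP Basic authentication.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"http://{host}:{port}/v1/query",
        data=json.dumps({"sql": sql}).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def run_query(sql, **kwargs):
    """Send the statement to a running server and return the decoded JSON reply."""
    request = build_query_request(sql, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())
```

With the container from the `docker run` command above listening on port 8000, `run_query("SELECT 1")` should return a JSON document describing the result. Building the request separately from sending it makes the URL, headers, and body easy to inspect before anything touches the network.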

Benchmarks

Performance: TPC-H vs Snowflake | ClickBench Results
Cost: 90% Cost Reduction

Architecture

Figure: Databend architecture diagram

Multimodal Cloud Warehouse: Production clusters analyze structured, semi-structured, vector, and geospatial data with Snowflake-compatible SQL. Local development environments can attach to the same warehouse data for seamless development.

Use Cases

  • Data Analytics: Snowflake alternative with significant cost reduction
  • AI/ML Pipelines: Vector search and AI functions built-in
  • Real-time Analytics: High-performance queries on petabyte-scale data
  • Data Lake Analytics: Query Parquet, CSV, TSV, NDJSON, Avro, ORC directly from S3
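To make the data lake use case above more concrete, here is a hedged sketch of how a query over staged files might be assembled in Python. It assumes a stage (named `sales_stage` here, a hypothetical name) has already been created to point at the S3 location, and that files in a stage can be queried with `SELECT ... FROM @stage (FILE_FORMAT => ..., PATTERN => ...)`. The helper `build_stage_query` is illustrative only and is not part of any Databend package.

```python
# File formats named in the use-case list above.
SUPPORTED_FORMATS = {"PARQUET", "CSV", "TSV", "NDJSON", "AVRO", "ORC"}


def build_stage_query(stage, file_format, pattern=None, columns="*"):
    """Return a SELECT statement that reads files directly from a stage.

    Assumption: the stage already exists and points at the bucket to query.
    """
    fmt = file_format.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported file format: {file_format}")
    options = [f"FILE_FORMAT => '{fmt}'"]
    if pattern is not None:
        # Reject quotes so the pattern cannot break out of the SQL literal.
        if "'" in pattern:
            raise ValueError("pattern must not contain single quotes")
        options.append(f"PATTERN => '{pattern}'")
    return f"SELECT {columns} FROM @{stage} ({', '.join(options)})"


query = build_stage_query("sales_stage", "parquet", pattern=".*[.]parquet")
print(query)
```

The resulting string can be passed to `ctx.sql(...)` from the Python quick start or sent through any other client. Restricting the format to a fixed set keeps typos from reaching the server as malformed SQL.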

Community

Contributors get immortalized in the system.contributors table! 🏆

📄 License

Apache License 2.0 + Elastic License 2.0 | Licensing FAQs


Built by engineers who redefine what's possible with data
🌐 Website | 🐦 Twitter | 🗺️ Roadmap