starrocks: A Java repository from pinterest

StarRocks is the world's fastest open query engine for sub-second, ad-hoc analytics both on and off the data lakehouse. With average query performance 3x faster than other popular alternatives, StarRocks is a query engine that eliminates the need for denormalization and adapts to your use cases, without having to move your data or rewrite SQL. A Linux Foundation project.

Learn more 👉🏻 What Is StarRocks: Features and Use Cases

Features

🚀 Native vectorized SQL engine: StarRocks uses vectorization to make full use of the CPU's parallel computing power, returning multi-dimensional analysis queries in under a second, 5 to 10 times faster than previous systems.
📊 Standard SQL: StarRocks supports ANSI SQL syntax (with full support for the TPC-H and TPC-DS queries) and is compatible with the MySQL protocol, so a wide range of clients and BI tools can connect to it.
💡 Smart query optimization: StarRocks optimizes complex queries with a cost-based optimizer (CBO). Better execution plans greatly improve data analysis efficiency.
⚡ Real-time update: The update model of StarRocks performs upsert and delete operations by primary key and keeps queries efficient while updates run concurrently.
🪟 Intelligent materialized view: StarRocks materialized views are updated automatically during data import and selected automatically when a query is executed.
✨ Querying data in data lakes directly: StarRocks allows direct access to data from Apache Hive™, Apache Iceberg™, Delta Lake™ and Apache Hudi™ without importing.
🎛️ Resource management: This feature allows StarRocks to limit resource consumption for queries and implement isolation and efficient use of resources among tenants in the same cluster.
💠 Easy to maintain: A simple architecture makes StarRocks easy to deploy, maintain, and scale out. StarRocks adjusts its query plans dynamically, rebalances resources when the cluster is scaled in or out, and automatically recovers data replicas after a node failure.
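
As a rough illustration of the MySQL protocol compatibility and primary-key upserts described above, the following Python sketch builds the SQL for a Primary Key table and sends it through an ordinary MySQL client. Everything specific here is an assumption made for the example, not something this README defines: the `orders` table and its columns, the `demo` database (which must already exist), the FE address `127.0.0.1`, the default query port `9030`, the passwordless `root` user, and the choice of the third-party `pymysql` package as the client.

```python
"""Sketch: talking to StarRocks over the MySQL protocol (assumptions labeled)."""

# Assumed values for a local test cluster; adjust for your deployment.
FE_HOST = "127.0.0.1"   # hypothetical FE address
FE_QUERY_PORT = 9030    # FE MySQL-protocol port in a default install (assumption)


def primary_key_table_ddl(table: str) -> str:
    """Build DDL for a hypothetical Primary Key table named `table`."""
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n"
        "    order_id BIGINT NOT NULL,\n"
        "    status   VARCHAR(32),\n"
        "    amount   DECIMAL(10, 2)\n"
        ")\n"
        "PRIMARY KEY (order_id)\n"
        "DISTRIBUTED BY HASH (order_id)"
    )


def upsert_sql(table: str) -> str:
    """Parameterized INSERT; on a Primary Key table, a row whose key
    already exists is overwritten, which gives upsert behavior."""
    return f"INSERT INTO {table} (order_id, status, amount) VALUES (%s, %s, %s)"


def run_demo() -> list:
    """Run the statements against a live cluster.

    Needs a reachable FE, an existing `demo` database, and `pymysql`
    installed. Any MySQL-protocol client could be swapped in.
    """
    import pymysql  # imported lazily so the helpers above work without it

    conn = pymysql.connect(host=FE_HOST, port=FE_QUERY_PORT, user="root",
                           password="", database="demo", autocommit=True)
    try:
        with conn.cursor() as cur:
            cur.execute(primary_key_table_ddl("orders"))
            cur.execute(upsert_sql("orders"), (1, "created", 19.99))
            # Same primary key again: intended to replace the row above.
            cur.execute(upsert_sql("orders"), (1, "paid", 19.99))
            cur.execute("SELECT order_id, status FROM orders")
            return list(cur.fetchall())
    finally:
        conn.close()
```

Calling `run_demo()` against a running cluster should leave a single row for `order_id = 1`, since the second insert reuses the same key. The SQL builders are kept separate from the connection code so the statements can be inspected or reused with a different client.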

Architecture Overview

StarRocks’s streamlined architecture is mainly composed of two modules: Frontend (FE) and Backend (BE). The entire system eliminates single points of failure through seamless and horizontal scaling of FE and BE, as well as replication of metadata and data.

Starting from version 3.0, StarRocks supports a new shared-data architecture, which can provide better scalability and lower costs.

Resources

📚 Read the docs

Section	Description
Quick Starts	How-tos and Tutorials.
Deploy	Learn how to run and configure StarRocks.
Docs	Full documentation.
Blogs	StarRocks deep dive and user stories.

❓ Get support

Slack community: join technical discussions, ask questions, and meet other users!
YouTube channel: subscribe to the latest video tutorials and webcasts.
GitHub issues: report an issue with StarRocks.

Contributing to StarRocks

We welcome all kinds of contributions from the community, individuals and partners. We owe our success to your active involvement.

See Contributing.md to get started.
Set up StarRocks development environment:

Understand our GitHub workflow for opening a pull request; use this PR Template when submitting a pull request.
Pick a good first issue and start contributing.

📝 License: StarRocks is licensed under Apache License 2.0.

👥 Community Membership: Learn more about the different contributor roles in the StarRocks community.

💬 Developer Group: Please join our Google Groups to discuss StarRocks features, project directions, issues, and pull requests, or to share suggestions.

Used By

This project is used by the following companies. Learn more about their use cases: