FuseQuery is a real-time, distributed cloud query engine built for scale (inspired by ClickHouse).

FuseQuery

FuseQuery is a real-time Cloud Query Engine implemented in Rust.

Inspired by ClickHouse and powered by Arrow.

Features

  • High Performance
    • Everything runs in parallel
  • High Scalability
    • Everything is distributed
  • High Reliability
    • True separation of storage and compute

Architecture

(Figure: DataFuse architecture diagram)

Performance

  • Measures in-memory SIMD vector processing performance only
  • Dataset: 100,000,000,000 (100 billion) rows
  • Hardware: AMD Ryzen 7 PRO 4750U, 8 CPU cores, 16 threads
  • Rust: rustc 1.49.0 (e1884a8e3 2020-12-29)
  • Built with link-time optimization and CPU-specific instructions
  • ClickHouse server version 21.2.1 revision 54447

| Query | FuseQuery (v0.1) | ClickHouse (v21.2.1) |
|---|---|---|
| SELECT avg(number) FROM system.numbers_mt | (3.11 s.) | ×3.14 slower, (9.77 s.), 10.24 billion rows/s., 81.92 GB/s. |
| SELECT sum(number) FROM system.numbers_mt | (2.96 s.) | ×2.02 slower, (5.97 s.), 16.75 billion rows/s., 133.97 GB/s. |
| SELECT min(number) FROM system.numbers_mt | (3.57 s.) | ×3.90 slower, (13.93 s.), 7.18 billion rows/s., 57.44 GB/s. |
| SELECT max(number) FROM system.numbers_mt | (3.59 s.) | ×4.09 slower, (14.70 s.), 6.80 billion rows/s., 54.44 GB/s. |
| SELECT count(number) FROM system.numbers_mt | (1.76 s.) | ×2.22 slower, (3.91 s.), 25.58 billion rows/s., 204.65 GB/s. |
| SELECT sum(number+number+number) FROM numbers_mt | (23.14 s.) | ×5.47 slower, (126.67 s.), 789.47 million rows/s., 6.32 GB/s. |
| SELECT sum(number) / count(number) FROM system.numbers_mt | (3.09 s.) | ×1.96 slower, (6.07 s.), 16.48 billion rows/s., 131.88 GB/s. |
| SELECT sum(number) / count(number), max(number), min(number) FROM system.numbers_mt | (6.73 s.) | ×4.01 slower, (27.59 s.), 3.62 billion rows/s., 28.99 GB/s. |
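
The rows/s and GB/s figures reported for ClickHouse above follow from the dataset size and the elapsed time. Below is a minimal sketch of that arithmetic. It assumes each `number` value is 8 bytes wide, which is consistent with the reported GB/s but not stated in this README. The function names are illustrative only. Because the published timings are rounded to two decimals, recomputed values match the table only to within rounding.

```rust
/// Rows in the benchmark dataset (100 billion, as listed above).
const ROWS: f64 = 100_000_000_000.0;
/// Assumed width of one `number` value in bytes (not stated in the README).
const BYTES_PER_ROW: f64 = 8.0;

/// Rows processed per second for a run that took `seconds`.
fn rows_per_sec(seconds: f64) -> f64 {
    ROWS / seconds
}

/// Data scanned per second in GB/s, with 1 GB taken as 10^9 bytes.
fn gb_per_sec(seconds: f64) -> f64 {
    rows_per_sec(seconds) * BYTES_PER_ROW / 1e9
}

/// How many times slower a run of `other` seconds is than `baseline` seconds.
fn slowdown(baseline: f64, other: f64) -> f64 {
    other / baseline
}

fn main() {
    // SELECT avg(number): FuseQuery took 3.11 s, ClickHouse took 9.77 s.
    println!("{:.2} billion rows/s", rows_per_sec(9.77) / 1e9);
    println!("{:.2} GB/s", gb_per_sec(9.77));
    println!("x{:.2} slower", slowdown(3.11, 9.77));
}
```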

Note:

  • ClickHouse system.numbers_mt uses 16-way parallel processing
  • FuseQuery system.numbers_mt uses 16-way parallel processing
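
To illustrate what N-way parallel processing of a number range means, here is a minimal sketch that splits a range into contiguous partitions, sums each on its own thread, and merges the partial results. It is not FuseQuery's implementation. The function `parallel_sum` and its parameters are hypothetical, and it uses only the Rust standard library.

```rust
use std::thread;

/// Sum the integers in `0..total` using `workers` threads,
/// each handling one contiguous partition of the range.
fn parallel_sum(total: u64, workers: u64) -> u64 {
    assert!(workers > 0, "at least one worker is required");
    // Ceiling division so the partitions cover the whole range.
    let chunk = (total + workers - 1) / workers;
    let handles: Vec<thread::JoinHandle<u64>> = (0..workers)
        .map(|i| {
            // Clamp both bounds so trailing partitions may be empty.
            let start = (i * chunk).min(total);
            let end = (start + chunk).min(total);
            thread::spawn(move || (start..end).sum::<u64>())
        })
        .collect();
    // Merge the partial sums produced by each worker.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum()
}

fn main() {
    // 16 partitions, mirroring the 16-way setting in the notes above.
    println!("sum = {}", parallel_sum(1_000_000, 16));
}
```

Partial aggregates such as sum, min, max, and count merge cheaply, which is why this pattern scales with the number of threads.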

Status

General

  • SQL Parser
  • Query Planner
  • Query Optimizer
  • Predicate Push Down
  • Projection Push Down (TODO)
  • Limit Push Down (TODO)
  • Type coercion
  • Parallel Query Execution
  • Distributed Query Execution
  • Sorting (WIP)
  • GroupBy (TODO)
  • Joins (TODO)
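
As an illustration of the predicate push down item above, the sketch below rewrites a toy logical plan so that a filter sitting directly on a scan is folded into the scan itself. The `Plan` type and the `push_down` function are hypothetical and do not reflect FuseQuery's planner API. Predicates are plain strings to keep the example short.

```rust
/// A toy logical plan (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Read a table, optionally applying a predicate while scanning.
    Scan { table: String, filter: Option<String> },
    /// Keep only rows that satisfy `predicate`.
    Filter { predicate: String, input: Box<Plan> },
    /// Keep at most `n` rows.
    Limit { n: usize, input: Box<Plan> },
}

/// Fold a Filter into the Scan directly beneath it, when possible.
fn push_down(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { predicate, input } => match push_down(*input) {
            // The scan has no predicate yet, so it can absorb this one.
            Plan::Scan { table, filter: None } => Plan::Scan {
                table,
                filter: Some(predicate),
            },
            // Anything else (for example a Limit) keeps the filter above it,
            // because filtering before a limit would change the result.
            other => Plan::Filter {
                predicate,
                input: Box::new(other),
            },
        },
        Plan::Limit { n, input } => Plan::Limit {
            n,
            input: Box::new(push_down(*input)),
        },
        scan => scan,
    }
}

fn main() {
    let plan = Plan::Limit {
        n: 10,
        input: Box::new(Plan::Filter {
            predicate: "number > 10".to_string(),
            input: Box::new(Plan::Scan {
                table: "system.numbers_mt".to_string(),
                filter: None,
            }),
        }),
    };
    println!("{:?}", push_down(plan));
}
```

Pushing the predicate into the scan lets the source skip rows early, so fewer rows flow through the rest of the plan.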

SQL Support

  • Projection
  • Filter (WHERE)
  • Limit
  • Aggregate Functions
  • Scalar Functions
  • UDF Functions
  • Sorting (WIP)
  • Subqueries (TODO)
  • Joins (TODO)
  • Window (TODO)

Getting Started

  • Learn FuseQuery
  • Try FuseQuery

Roadmap

  • 0.1 Support aggregation select (2021.02)
  • 0.2 Support distributed query (2021.03)
  • 0.3 Support order by
  • 0.5 Support group by
  • 0.6 Support sub queries
  • 0.7 Support join
  • 0.8 Support TPC-H benchmark

Contributing

You can learn more about contributing to the FuseQuery project by reading our Contribution Guide and by viewing our Code of Conduct.

License

FuseQuery is licensed under Apache 2.0.