FuseQuery is a Cloud Distributed SQL Query Engine at scale.
Distributed ClickHouse from scratch in Rust.
Give thanks to ClickHouse and Arrow.
-
High Performance
- Everything is Parallelism
-
High Scalability
- Everything is Distributed
-
High Reliability
- True Separation of Storage and Compute
Crate | Description | Status |
---|---|---|
distributed | Distributed scheduler and executor for planner | WIP |
optimizers | Optimizer for Distributed&Local plan | WIP |
datablocks | Vectorized data processing unit | WIP |
datastreams | Async streaming iterators | WIP |
datasources | Interface to the datasource(system.numbers for performance/Fuse-Store) | WIP |
executors | Executor(EXPLAIN/SELECT) for the Pipeline | WIP |
functions | Scalar and Aggregation Functions | WIP |
processors | Dataflow Streaming Processor | WIP |
planners | Distributed&Local planners for building processor pipelines | WIP |
servers | Server handler(MySQL/HTTP) | MySQL |
transforms | Data Stream Transform(Source/Filter/Projection/AggregatorPartial/AggregatorFinal/Limit) | WIP |
- Projection
- Filter
- Limit
- Aggregate
- Functions
- Filter Push-Down
- Projection Push-Down
- Distributed Query
- Sorting
- Joins
- SubQueries
- Memory SIMD-Vector processing performance only
- Dataset: 10,000,000,000 (10 Billion)
- Hardware: 8vCPUx16G Cloud Instance
- Rust: rustc 1.50.0-nightly (f76ecd066 2020-12-15)
Query | FuseQuery Cost | ClickHouse Cost |
---|---|---|
SELECT avg(number) FROM system.numbers_mt | [2.02s] | [1.70s], 5.90 billion rows/s., 47.16 GB/s |
SELECT sum(number) FROM system.numbers_mt | [1.77s] | [1.34s], 7.48 billion rows/s., 59.80 GB/s |
SELECT max(number) FROM system.numbers_mt | [2.83s] | [2.33s], 4.34 billion rows/s., 34.74 GB/s |
SELECT max(number+1) FROM system.numbers_mt | [6.13s] | [3.29s], 3.04 billion rows/s., 24.31 GB/s |
SELECT count(number) FROM system.numbers_mt | [1.55s] | [0.67s], 15.00 billion rows/s., 119.99 GB/s |
SELECT sum(number) / count(number) FROM system.numbers_mt | [2.04s] | [1.28s], 7.84 billion rows/s., 62.73 GB/s |
SELECT sum(number) / count(number), max(number), min(number) FROM system.numbers_mt | [6.40s] | [4.30s], 2.33 billion rows/s., 18.61 GB/s |
Note:
- ClickHouse system.numbers_mt is 8-way parallelism processing
- FuseQuery system.numbers_mt is 8-way parallelism processing
Run from source
$ make run
12:46:15 [ INFO] Options { log_level: "debug", num_cpus: 8, mysql_handler_port: 3307 }
12:46:15 [ INFO] Fuse-Query Cloud Compute Starts...
12:46:15 [ INFO] Usage: mysql -h127.0.0.1 -P3307
or Run with docker(Recommended):
$ docker pull datafusedev/fuse-query
...
$ docker run --init --rm -p 3307:3307 datafusedev/fuse-query
05:12:36 [ INFO] Options { log_level: "debug", num_cpus: 6, mysql_handler_port: 3307 }
05:12:36 [ INFO] Fuse-Query Cloud Compute Starts...
05:12:36 [ INFO] Usage: mysql -h127.0.0.1 -P3307
$ mysql -h127.0.0.1 -P3307
mysql> explain select (number+1) as c1, number/2 as c2 from system.numbers_mt(10000000) where (c1+c2+1) < 100 limit 3;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| └─ Limit: 3
└─ Projection: (number + 1) as c1:UInt64, (number / 2) as c2:UInt64
└─ Filter: (((c1 + c2) + 1) < 100)
└─ ReadDataSource: scan parts [8](Read from system.numbers_mt table) |
|
└─ LimitTransform × 1 processor
└─ Merge (LimitTransform × 8 processors) to (MergeProcessor × 1)
└─ LimitTransform × 8 processors
└─ ProjectionTransform × 8 processors
└─ FilterTransform × 8 processors
└─ SourceTransform × 8 processors |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
mysql> select (number+1) as c1, number/2 as c2 from system.numbers_mt(10000000) where (c1+c2+1) < 100 limit 3;
+------+------+
| c1 | c2 |
+------+------+
| 1 | 0 |
| 2 | 0 |
| 3 | 1 |
+------+------+
3 rows in set (0.06 sec)
$ make test