Roadmap 2023
BohuTANG opened this issue · 9 comments
After a full year of research and development in 2022, the functionality and stability of Databend were significantly enhanced, and several users began using it in production. Databend has helped them greatly reduce costs and operational complexity.
This is the Databend roadmap for 2023 (discussion).
Main tasks
v1.3
v1.2 (Prepare for release on May 15th)
v1.1 (Prepare for release on April 5th)
v1.0 (Prepare for release on March 5th)
Features
Task | Status | Comments |
---|---|---|
Update #9261 | DONE | needs optimization (release in v1.0) |
Privileges | DONE | |
Alter table | DONE | high-priority (release in v1.0) |
Window function #6342 | DONE | |
Lambda function and higher-order functions | DONE | |
Materialized view | DONE | Aggregating index |
Support SET_VAR hints #8833 | DONE | |
Parquet reader | DONE | |
DataFrame | DONE | |
Data Sharing (community version) | DONE | |
Concurrent query enhance | IN PROGRESS | |
Distributed COPY #8594 | DONE | |
Support Decimal data type #2931 | DONE | high-priority (release in v1.0) |
Add column-level dynamic data masking support | PLAN | |
Improvements
Task | Status | Comments |
---|---|---|
New expression #9411 | DONE | |
Error message | PLAN | |
Planner
Task | Status | Comments |
---|---|---|
Scalar expression normalization | DONE | |
Column constraint framework | DONE | |
Functional dependency framework #7438 | DONE | |
Join reorder | DONE | |
CBO | DONE | high-priority (release in v1.0) |
Support TPC-DS | DONE | |
Support optimization tracing | PLAN | Easy to debug/study. |
Cache
Task | Status | Comments |
---|---|---|
Unified cache layer | DONE | |
Meta data cache | DONE | |
Index data cache | DONE | |
Block data cache | DONE | high-priority (release in v1.0) |
Data Storage
Task | Status | Comments |
---|---|---|
Fuse engine re-clustering | DONE | high-priority (release in v1.1) |
Fuse engine orphan data cleanup | DONE | high-priority (release in v1.0) |
Distributed Query Execution
Task | Status | Comments |
---|---|---|
Visualized profiling | IN PROGRESS | |
Aggregation spilling | DONE | high-priority (release in v1.1) |
Resource Quota
Task | Status | Comments |
---|---|---|
Session-level quota control (CPU/Memory) | DONE | |
Schema-Less Search
Task | Status | Comments |
---|---|---|
JSON indexing | DONE | high-priority |
Fulltext index #3915 | IN PROGRESS | high-priority |
Array functions #7931 | DONE | high-priority |
Faiss index #9699 | PLAN | |
LakeHouse
Task | Status | Comments |
---|---|---|
Apache Hive | DONE | |
Apache Iceberg | DONE | |
Delta Lake | PLAN | |
Querying external storage (Parquet) | DONE | |
Integrations
Task | Status | Comments |
---|---|---|
Dbt integration | DONE | |
Airbyte integration | DONE | |
Datadog Vector integration with the Rust driver | DONE | |
DataX integration with the Java driver | DONE | |
CDC with Flink | DONE | |
CDC with Kafka | DONE | |
Meta
Task | Status | Comments |
---|---|---|
Jepsen test | DONE | |
Store membership in raft | DONE | |
Nonblocking snapshot building | DONE | |
Snapshot file format impl | DONE | |
Upgrade on-disk store format | DONE | |
Testing
Task | Status | Comments |
---|---|---|
SQLlogic Test | DONE | Supports more test cases |
SQLancer Test | DONE | Supports more types and more cases |
Fuzzer Test | IN PROGRESS | |
Releases
Any plan to improve concurrency capabilities, so that developers can rely on Databend to build web-based data exploration platforms (like Google Analytics)?
Any plan to tune metasrv's memory usage? I hit an OOM last week. IMHO it could store most of the data on disk.
> Any plan to improve concurrency capabilities, so that developers can rely on Databend to build web-based data exploration platforms (like Google Analytics)?

Added: Concurrent query enhance

> Any plan to tune metasrv's memory usage? I hit an OOM last week. IMHO it could store most of the data on disk.

@drmingdrmer will fill in the Meta section; I think he will cover it.
Any plan to support the `decimal` data type? This is essential if we want to use Databend in finance-related fields. Will we see it in the first half of the year?
> Any plan to support the `decimal` data type? This is essential if we want to use Databend in finance-related fields. Will we see it in the first half of the year?

Added to the main task, thanks.
Will fault tolerance in query processing be planned for 2023?
Likewise, I have some spot instances; it would help if the cluster could handle a shut-down instance gracefully without affecting running queries.
> Will fault tolerance in query processing be planned for 2023?

We will do it, but it is hard, so the priority is low.

> Likewise, I have some spot instances; it would help if the cluster could handle a shut-down instance gracefully without affecting running queries.

Please file an issue for that.