Tracking issue for improving TinySQL as a learning-friendly mini distributed relational database
Opened this issue · 2 comments
rebelice commented
This issue is part of the effort towards Talent Plan v3.0.
Based on user feedback and my own investigation, I found that TinySQL has serious issues. They keep it from being the learning-friendly mini distributed relational database it is meant to be:
- Not mini. TinySQL has more than 100,000 lines of code. It is essentially a copy of TiDB with part of the code deleted, so it contains a lot of irrelevant code and design.
- Unfriendly documentation. The documents only briefly explain the relevant knowledge topics and do not explain the project structure.
- Poor course design. Each lab covers a very broad topic, but the part that needs to be implemented is small.
- Poor comments. They do not help readers understand the code.
To solve the above problems, I will redesign and implement TinySQL. The main planned improvements are as follows:
- Redesign the course.
  - Divide TinySQL into five stages. Each stage has a clear target and an iconic function, and each stage builds on the previous one.
  - In the first stage, TinySQL should be as simple as possible. As more stages are completed, we will add the necessary functions to make it a true distributed relational database.
  - The specific stage division is at the end of this issue.
- Adopt an incremental framework mode.
  - Initially, the course framework has no content. Each stage or substage introduces only the framework code it requires. This keeps the framework code concise and clearly shows what each stage or substage adds.
- Optimize documentation and comments.
  - The documentation layout:

        ## Stage
        ### Introduction
        #### Objectives
        #### Materials
        ### Topic 1
        #### Knowledge topic
        #### Related code
        ### Exercises
        ### References
The stage design is as follows:
- Stage 1: read-only relational database
  - Target: the ability to read data using the KV engine API
  - Iconic function: the ability to handle simple SELECT statements
  - Knowledge topics:
    - parser
    - data mapping from the relational model to KV
    - generating operators
- Stage 2: insert and update
  - Target: the ability to write data using the KV engine API
  - Iconic function: the ability to handle simple INSERT/UPDATE statements
  - Knowledge topics:
    - volcano model
- Stage 3: DDL
  - Target: the ability to process DDL online
  - Iconic function: the ability to process CREATE/DROP TABLE/INDEX online
  - Knowledge topics:
    - online DDL algorithm
- Stage 4: Optimizer
  - Target: implement an optimizer that can choose an appropriate index and join order
  - Iconic function:
    - the ability to collect statistics
    - the ability to choose an appropriate index and join order
  - Knowledge topics:
    - SQL optimization
    - statistics
    - System R optimizer
- Stage 5: Calculation optimization
  - Target: optimize the calculation framework to improve performance
  - Iconic function:
    - vectorization
    - Massively Parallel Processing (MPP)
  - Knowledge topics:
    - vectorization
    - MPP
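
To make the early stages more concrete, here is a rough sketch of two of the knowledge topics above: mapping relational rows onto a KV store, and reading them back through volcano-style operators. It is not TinySQL code. The key layout, the `MemKV` store, and the operator names are all hypothetical, and Python is used only to keep the sketch short.

```python
import json
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Row = tuple  # a row is a plain tuple of column values (assumption)


def table_prefix(table_id: int) -> bytes:
    # Hypothetical layout: b"t" + table id + b"_r"; big-endian keeps keys sortable.
    return b"t" + table_id.to_bytes(8, "big") + b"_r"


def encode_key(table_id: int, row_id: int) -> bytes:
    return table_prefix(table_id) + row_id.to_bytes(8, "big")


class MemKV:
    """Toy in-memory stand-in for a KV engine with an ordered prefix scan."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def scan(self, prefix: bytes) -> Iterator[Tuple[bytes, bytes]]:
        for key in sorted(self._data):
            if key.startswith(prefix):
                yield key, self._data[key]


class TableScan:
    """Leaf operator: decodes one row per next() call from the KV scan."""

    def __init__(self, kv: MemKV, table_id: int) -> None:
        self.kv, self.table_id, self._it = kv, table_id, None

    def open(self) -> None:
        self._it = self.kv.scan(table_prefix(self.table_id))

    def next(self) -> Optional[Row]:
        item = next(self._it, None)
        return None if item is None else tuple(json.loads(item[1]))

    def close(self) -> None:
        self._it = None


class Selection:
    """Pulls rows from its child and returns only those matching the predicate."""

    def __init__(self, child, predicate: Callable[[Row], bool]) -> None:
        self.child, self.predicate = child, predicate

    def open(self) -> None:
        self.child.open()

    def next(self) -> Optional[Row]:
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row

    def close(self) -> None:
        self.child.close()


class Projection:
    """Keeps only the requested column positions of each child row."""

    def __init__(self, child, columns: List[int]) -> None:
        self.child, self.columns = child, columns

    def open(self) -> None:
        self.child.open()

    def next(self) -> Optional[Row]:
        row = self.child.next()
        return None if row is None else tuple(row[i] for i in self.columns)

    def close(self) -> None:
        self.child.close()


def run(plan) -> List[Row]:
    # Volcano-style driver: open the root, pull rows until exhausted, then close.
    plan.open()
    rows = []
    while (row := plan.next()) is not None:
        rows.append(row)
    plan.close()
    return rows


kv = MemKV()
for rid, name, age in [(3, "carol", 25), (1, "alice", 30), (2, "bob", 17)]:
    kv.put(encode_key(1, rid), json.dumps([rid, name, age]).encode())
kv.put(encode_key(2, 1), json.dumps([1, "other table", 99]).encode())

# Roughly: SELECT name FROM t WHERE age >= 18 (t has the hypothetical table id 1)
plan = Projection(Selection(TableScan(kv, 1), lambda r: r[2] >= 18), [1])
result = run(plan)
print(result)  # [('alice',), ('carol',)]
```

Each operator exposes only `open`/`next`/`close` and pulls one row at a time from its child, which is the core idea of the volcano model. A write path for Stage 2 could reuse the same key encoding with `put`.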
yanguwan commented
Thanks, Rebelice. The topic list looks fine for the next wave of Talent Plan.
feitian124 commented
👍