This is a Query Engine which support SQL
interface. And it is only a Toy for learn query engine only. You can check TODO to check the progress now.
Simple enough to learn (Although it is simple...but with so much work to finish.. TAT 😭) and Now it only has a basic architecture and most operators and planners have not implemented (will be done in the future).
This is inspired(and most ideas come) by how-query-engines-work and it is just for learning purpose. And many ideas inspired by arrow-datafusion.
Use arrow to express in-memory columnar format and use sqlparser as SQL parser.
for now, we can use NaiveDB
like below, we can use csv as table storage.
use naive_db::print_result;
use naive_db::NaiveDB;
use naive_db::Result;
fn main() -> Result<()> {
let mut db = NaiveDB::default();
db.create_csv_table("t1", "data/test_data.csv")?;
let ret = db.run_sql("select id, name, age + 100 from t1 where id < 6 limit 3")?;
print_result(&ret)?;
// Join
db.create_csv_table("employee", "data/employee.csv")?;
db.create_csv_table("rank", "data/rank.csv")?;
let ret = db.run_sql(
"select id, name, rank_name from employee innner join rank on employee.rank = rank.id",
)?;
print_result(&ret);
Ok(())
}
output will be:
+----+---------+-----------+
| id | name | age + 100 |
+----+---------+-----------+
| 1 | veeupup | 123 |
| 2 | alex | 120 |
| 4 | lynne | 118 |
+----+---------+-----------+
+----+-------+-------------+
| id | name | rank_name |
+----+-------+-------------+
| 1 | vee | diamond |
| 2 | lynne | master |
| 3 | Alex | master |
| 4 | jack | diamond |
| 5 | mike | grandmaster |
+----+-------+-------------+
The NaiveDB is just simple and has clear progress just like:
impl NaiveDB {
pub fn run_sql(&self, sql: &str) -> Result<Vec<RecordBatch>> {
// 1. sql -> statement
let statement = SQLParser::parse(sql)?;
// 2. statement -> logical plan
let sql_planner = SQLPlanner::new(&self.catalog);
let logical_plan = sql_planner.statement_to_plan(statement)?;
// 3. optimize
let optimizer = Optimizer::default();
let logical_plan = optimizer.optimize(logical_plan);
// 4. logical plan -> physical plan
let physical_plan = QueryPlanner::create_physical_plan(&logical_plan)?;
// 5. execute
physical_plan.execute()
}
}
- type system
- datasource
- mem source
- csv as datasource
- empty datasource
- logical plan & expressions
- build logical plans
- projection
- filter
- aggregate
- limit
- join
- physical plan & expressions
- physical scan
- physical projection
- physical filter
- physical limit
- join
- (dumb😊) nested loop join
- hash join
- sort-merge join
- physical expression
- column expr
- binary operation expr(add/sub/mul/div/and/or...)
- literal expr
- so many work to do... TAT
- query planner
- scan
- limit
- join
- aggregate
- ...
- query optimization
- more rules needed
- sql support
- parser
- SQL planner: statement -> logical plan
- scan
- projection
- selection
- limit
- join
- aggregate
- scalar function