/naive-query-engine

A Toy Query Engine & SQL interface

Primary LanguageRustApache License 2.0Apache-2.0

Naive Query Engine (Toy for Learning) 😄

This is a Query Engine which support SQL interface. And it is only a Toy for learn query engine only. You can check TODO to check the progress now.

Simple enough to learn (Although it is simple...but with so much work to finish.. TAT 😭) and Now it only has a basic architecture and most operators and planners have not implemented (will be done in the future).

This is inspired(and most ideas come) by how-query-engines-work and it is just for learning purpose. And many ideas inspired by arrow-datafusion.

Use arrow to express in-memory columnar format and use sqlparser as SQL parser.

how to use

for now, we can use NaiveDB like below, we can use csv as table storage.

use naive_db::print_result;
use naive_db::NaiveDB;
use naive_db::Result;

fn main() -> Result<()> {
    let mut db = NaiveDB::default();

    db.create_csv_table("t1", "data/test_data.csv")?;

    let ret = db.run_sql("select id, name, age + 100 from t1 where id < 6 limit 3")?;

    print_result(&ret)?;

    // Join 
    db.create_csv_table("employee", "data/employee.csv")?;
    db.create_csv_table("rank", "data/rank.csv")?;

    let ret = db.run_sql(
        "select id, name, rank_name from employee innner join rank on employee.rank = rank.id",
    )?;

    print_result(&ret);
    Ok(())
}

output will be:

+----+---------+-----------+
| id | name    | age + 100 |
+----+---------+-----------+
| 1  | veeupup | 123       |
| 2  | alex    | 120       |
| 4  | lynne   | 118       |
+----+---------+-----------+
+----+-------+-------------+
| id | name  | rank_name   |
+----+-------+-------------+
| 1  | vee   | diamond     |
| 2  | lynne | master      |
| 3  | Alex  | master      |
| 4  | jack  | diamond     |
| 5  | mike  | grandmaster |
+----+-------+-------------+

architecture

The NaiveDB is just simple and has clear progress just like:

impl NaiveDB {
    pub fn run_sql(&self, sql: &str) -> Result<Vec<RecordBatch>> {
        // 1. sql -> statement
        let statement = SQLParser::parse(sql)?;
        // 2. statement -> logical plan
        let sql_planner = SQLPlanner::new(&self.catalog);
        let logical_plan = sql_planner.statement_to_plan(statement)?;
        // 3. optimize
        let optimizer = Optimizer::default();
        let logical_plan = optimizer.optimize(logical_plan);
        // 4. logical plan -> physical plan
        let physical_plan = QueryPlanner::create_physical_plan(&logical_plan)?;
        // 5. execute
        physical_plan.execute()
    }
}

TODO

  • type system
  • datasource
    • mem source
    • csv as datasource
    • empty datasource
  • logical plan & expressions
  • build logical plans
    • projection
    • filter
    • aggregate
    • limit
    • join
  • physical plan & expressions
    • physical scan
    • physical projection
    • physical filter
    • physical limit
    • join
      • (dumb😊) nested loop join
      • hash join
      • sort-merge join
    • physical expression
      • column expr
      • binary operation expr(add/sub/mul/div/and/or...)
      • literal expr
      • so many work to do... TAT
  • query planner
    • scan
    • limit
    • join
    • aggregate
    • ...
  • query optimization
    • more rules needed
  • sql support
    • parser
    • SQL planner: statement -> logical plan
      • scan
      • projection
      • selection
      • limit
      • join
      • aggregate
      • scalar function