executor: optimize `NestedLoopJoin`'s memory usage

risinglight/src/executor/nested_loop_join.rs

Lines 26 to 32 in 7c5e0d5

    
           let (left_chunks, right_chunks) = async { 
        
               tokio::try_join!( 
        
                   self.left_child.try_collect::<Vec<DataChunk>>(), 
        
                   self.right_child.try_collect::<Vec<DataChunk>>(), 
        
               ) 
        
           } 
        
           .await?;

Currently, NLJ will pull all data into memory before start processing data. This would make memory usage high.

Optimally, NLJ should work like...

for chunk in left {
  for chunk in right {
    // process chunk
  }
}

However, this requires us to support restarting a child executor (restart the right child), which is not possible in our system. Therefore, for the first step, we'd like to optimize NLJ like...

let bufferred_left = pull all data from left side
for chunk in right {
  for chunk in bufferred_left {
    // process chunk
  }
}

hello, may i try this?

I have a question about this. Now, NLJ is waiting for enough row(one DataChunk) to eval and filter(this is deferred),
but when one right_chunk is obtained from right_child, it must be processed immediately(right outer join).
Maybe should hold a HashMap to record the right_row's state(Whether there is a match left_row)?

I have a question about this. Now, NLJ is waiting for enough row(one DataChunk) to eval and filter(this is deferred),
but when one right_chunk is obtained from right_child, it must be processed immediately(right outer join).
Maybe should hold a HashMap to record the right_row's state(Whether there is a match left_row)?

I think for the first step, we can choose to support join of only one direction (either left join or right join)? The optimizer can rewrite left join to a right join, and back forth.

we can choose to support join of only one direction (either left join or right join)

this is easy work, but maybe the optimizer is hard work( how to support full outer join and multi-table join)?
Is it necessary to complete the optimizer part before continuing this issue or synchronization?

how to support full outer join and multi-table join

I think we can have some regression in tests (maybe we need to disable some of the test cases in https://github.com/risinglightdb/risinglight/blob/main/tests/sql/join.slt)? Just took a look at the current plan, it seems that for outer join and right outer join, optimizer will always choose NLJ. Therefore, we can disable some test cases in join.slt.

Is it necessary to complete the optimizer part before continuing this issue or synchronization?

I think we can optimize NLJ's memory usage first, and support only inner and left outer join. Some fun facts: SQLServer's NLJ also doesn't support full outer join and right outer join...

For the next step, we can add full outer join support to HashJoin, and enable the optimizer to rewrite right outer join to left outer join in NLJ, so as to enable all test cases again. So after all these have been done, we will require at least one equal condition in full outer joins, and the NLJ will only need to support inner join and left outer join.

Does this sound good? Also thanks for your contribution!

This sound good!
Now, the NLJ results are a little different than before. Is it ok?

Before

v1	v2
l1	r1
l1	r2
l2	r1
l2	r2

Now

v1	v2
l1	r1
l2	r1
l1	r2
l2	r2

Can keep the test of inner and left outer join?

Sounds good!

	let (left_chunks, right_chunks) = async {
	tokio::try_join!(
	self.left_child.try_collect::<Vec<DataChunk>>(),
	self.right_child.try_collect::<Vec<DataChunk>>(),
	)
	}
	.await?;