hscells/groove

Refactor pipeline for better memory efficiency.

Closed this issue · 0 comments

Currently, a groove pipeline keeps consuming memory as it runs, without writing out or returning any results along the way. To combat this, I'd like to wrap the current pipeline in one that communicates results through a channel as they come out of the pipeline. That way, the most demanding and memory-hungry sections of the pipeline (query execution, TREC results, evaluation) can be split up and processed separately.

To do this, I'm going to modify how PipelineResult works and set flags indicating what it contains, so a receiver can act appropriately. I'll wrap the channel so it stays hidden when it isn't needed (not dealing with big data, or not caring much about performance), but also expose the channel-based method for callers who want to speed up processing.
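
A rough sketch of the idea, not the final API. Only the name PipelineResult comes from the description above; the `ResultType` flags, the struct fields, and the `ExecuteChan` and `Execute` functions are hypothetical names made up for illustration, and the per-query work is replaced by dummy values.

```go
package main

import "fmt"

// ResultType flags what a PipelineResult contains (hypothetical names).
type ResultType int

const (
	TrecResult  ResultType = iota // result carries TREC-style run lines
	Measurement                   // result carries evaluation measurements
	Done                          // pipeline has finished; no payload
)

// PipelineResult is an assumed shape: one flag plus the matching payload.
type PipelineResult struct {
	Type         ResultType
	Topic        string
	TrecLines    []string
	Measurements map[string]float64
}

// ExecuteChan processes one query at a time and sends each partial result
// as soon as it is ready, so nothing accumulates inside the pipeline.
// The values sent here are placeholders for real execution and evaluation.
func ExecuteChan(queries []string, out chan<- PipelineResult) {
	defer close(out)
	for _, q := range queries {
		out <- PipelineResult{Type: TrecResult, Topic: q,
			TrecLines: []string{q + " 0 doc1 1 1.0 run"}}
		out <- PipelineResult{Type: Measurement, Topic: q,
			Measurements: map[string]float64{"NumRet": 1}}
	}
	out <- PipelineResult{Type: Done}
}

// Execute hides the channel for callers that just want every result at once.
func Execute(queries []string) []PipelineResult {
	out := make(chan PipelineResult)
	go ExecuteChan(queries, out)
	var all []PipelineResult
	for r := range out {
		if r.Type == Done {
			continue // the Done marker carries no data
		}
		all = append(all, r)
	}
	return all
}

func main() {
	// Streaming use: act on each result according to its flag.
	out := make(chan PipelineResult)
	go ExecuteChan([]string{"q1", "q2"}, out)
	for r := range out {
		switch r.Type {
		case TrecResult:
			fmt.Println("trec:", r.TrecLines[0])
		case Measurement:
			fmt.Println("eval:", r.Topic, r.Measurements["NumRet"])
		case Done:
			fmt.Println("done")
		}
	}

	// Simple use: the wrapper collects everything in memory.
	fmt.Println(len(Execute([]string{"q1", "q2"})), "results collected")
}
```

With the channel exposed, a receiver can write TREC lines to disk and discard them immediately, while the wrapper keeps the simpler collect-everything behaviour for small runs.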