MaterializeInc/datagen

Feature: Customize statistical distribution for relationship `records_per`

chuck-alt-delete opened this issue · 0 comments

Right now, we can set `records_per` to a fixed number, say 5, which means every parent record gets exactly 5 child records. But what if we want the number of child records to vary from parent to parent? For example, some parents would have 100 child records and others only 2, with the counts following some configurable statistical distribution.

This could potentially help the optimizer team as they investigate performance issues related to different distributions of join keys. cc @aalexandrov
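
To make the request a bit more concrete, here is a rough sketch of how a per-parent child count could be sampled. It is written in Python purely for illustration and is not datagen code: the function name `sample_records_per`, the dict shape of `spec`, and the keys `distribution`, `min`, `max`, and `s` are all assumptions, not existing options. A plain integer keeps today's fixed behavior.

```python
import random


def sample_records_per(spec, rng):
    """Return how many child records to generate for one parent.

    `spec` is either an int (the current fixed behavior) or a dict such as
    {"distribution": "zipf", "min": 2, "max": 100, "s": 1.5}.
    The dict shape is hypothetical and only meant to illustrate the idea.
    """
    if isinstance(spec, int):
        return spec

    lo, hi = spec["min"], spec["max"]
    kind = spec["distribution"]

    if kind == "uniform":
        # Every count in [lo, hi] is equally likely.
        return rng.randint(lo, hi)

    if kind == "zipf":
        # Bounded Zipf: the count at rank r (lo has rank 1) is chosen with
        # weight 1 / r^s, so small counts are common and large ones are rare.
        s = spec.get("s", 1.0)
        values = list(range(lo, hi + 1))
        weights = [1.0 / (rank ** s) for rank in range(1, len(values) + 1)]
        return rng.choices(values, weights=weights, k=1)[0]

    raise ValueError(f"unknown distribution: {kind!r}")


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so runs are repeatable
    spec = {"distribution": "zipf", "min": 2, "max": 100, "s": 1.5}
    print([sample_records_per(spec, rng) for _ in range(10)])
```

With a skewed setting like the Zipf one, most parents would get only a few child records while a handful would get many, which is the kind of join-key skew mentioned above.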