araddon/qlbridge

Performance Regression

araddon opened this issue · 1 comment

Some sources (think JSON files) have sparse columns within a row: out of all the possible keys in a keyspace, any one row may have only a subset. Messages now require a schema, which forces creation of non-sparse rows to represent these sparse rows as they go through the SQL exec tasks. This is really slow for very sparse data sets.
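To make the cost concrete, here is a minimal sketch of what expanding a sparse row into a schema-wide row looks like. The `densify` helper and the column names are hypothetical, for illustration only; this is not qlbridge's actual message or exec code.

```go
package main

import "fmt"

// densify expands a sparse row (holding only the keys it actually has)
// into a dense slice with one slot per schema column. Columns missing
// from the sparse row are left as nil.
// Hypothetical helper for illustration; not part of qlbridge.
func densify(cols []string, sparse map[string]interface{}) []interface{} {
	dense := make([]interface{}, len(cols))
	for i, col := range cols {
		if v, ok := sparse[col]; ok {
			dense[i] = v
		}
	}
	return dense
}

func main() {
	// The schema is the union of every key seen across all rows.
	cols := []string{"id", "name", "email", "age"}
	// This row only carries two of the four possible keys.
	row := map[string]interface{}{"id": 1, "age": 30}

	dense := densify(cols, row)
	fmt.Println(len(row), len(dense)) // prints "2 4"
}
```

The work per row scales with the width of the schema rather than with the number of keys the row actually holds, so a keyspace with thousands of possible keys and a handful of values per row pays for thousands of slots on every row.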

  • create benchmark test for current exec engine with tests of these methods:

Benchmark tests added in #152. The regression turned out to be an unrelated red herring: the extra time came from downloading files over the internet.