FINRAOS/herd

FactoryCQ - Retrieve DDL for high volume without performance issue

nateiam opened this issue · 0 comments

As a Herd Consumer, I want to be able to retrieve DDL that contains hundreds or thousands of partitions without significant delay.

Currently, the underlying query can take over 10 minutes; see the technical details below.

Acceptance Criteria

  • Generate DDL returns in under 30 seconds for data with the following characteristics:
    ** A range or list of 1000 primary partitions
    ** 5 sub-partition values for each primary partition
    ** 1000 files in each partition
  • The criterion above is met both with and without the suppressScanForUnregisteredSubPartitions flag set to true in the Generate DDL request
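To make the test volume concrete, here is a minimal sketch of the arithmetic behind the criteria above. The criteria do not say whether "each partition" means each primary partition or each sub-partition, so both readings are computed. The function names and the `generate_ddl` callable are hypothetical, introduced only for illustration; they are not part of the Herd API.

```python
import time
from typing import Callable

# Figures taken directly from the acceptance criteria.
PRIMARY_PARTITIONS = 1000
SUB_PARTITION_VALUES = 5
FILES_PER_PARTITION = 1000
TARGET_SECONDS = 30.0


def storage_file_count(files_per_sub_partition: bool) -> int:
    """Total Storage Files in the test data set under one reading of the criteria."""
    if files_per_sub_partition:
        # Reading A: 1000 files in every (primary, sub-partition) pair.
        return PRIMARY_PARTITIONS * SUB_PARTITION_VALUES * FILES_PER_PARTITION
    # Reading B: 1000 files in every primary partition.
    return PRIMARY_PARTITIONS * FILES_PER_PARTITION


def within_target(generate_ddl: Callable[[], str],
                  limit: float = TARGET_SECONDS) -> bool:
    """Time one call of a caller-supplied (hypothetical) DDL generator."""
    start = time.perf_counter()
    generate_ddl()
    return time.perf_counter() - start < limit


if __name__ == "__main__":
    print(storage_file_count(files_per_sub_partition=False))  # 1000000
    print(storage_file_count(files_per_sub_partition=True))   # 5000000
```

Under either reading the test data set holds at least one million Storage Files, so the test environment would need to be seeded accordingly before timing the request with the flag both set and unset.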

Technical Notes

  • The technical analysis below shows this is related to the use of an IN clause for Storage Files and the join with Storage Unit
  • Reproducing the issue might require a total volume of Storage Files comparable to PROD