Sage-Bionetworks/schematic

schematic: investigate GE latencies and error parsing

milen-sage opened this issue · 5 comments

Based on TT1 feedback from @linglp, GE-based (Great Expectations) rules introduce relatively large latencies, e.g. > 10s.

AC:

  • determine the rules that cause end-to-end latencies > 5s (from REST API request to results) with an example benchmark manifest; @linglp, we can use the large manifest you're using, if you could attach it here. A timing sketch follows this list.
  • evaluate latencies, differentiating column map vs. row map column aggregate expectations, and parallelize if needed
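
For the first AC item, here is a minimal timing sketch for measuring request-to-results latency per rule. The base URL, route (`/v1/model/validate`), query parameter, and upload field name are assumptions for illustration and should be swapped for the actual schematic API contract.

```python
import time

import requests

# Assumed local deployment and route; adjust to the real API.
API_URL = "http://localhost:3001/v1/model/validate"

def time_validation(manifest_path, data_type, n_runs=5):
    """Time end-to-end validation (REST request -> parsed results)."""
    latencies = []
    for _ in range(n_runs):
        with open(manifest_path, "rb") as f:
            start = time.perf_counter()
            resp = requests.post(
                API_URL,
                params={"data_type": data_type},  # assumed parameter name
                files={"file_name": f},           # assumed field name
            )
            latencies.append(time.perf_counter() - start)
        resp.raise_for_status()
    return min(latencies), sum(latencies) / len(latencies), max(latencies)

# Run once per rule-specific benchmark manifest to find the > 5s offenders.
lo, avg, hi = time_validation("benchmark_manifest.csv", "Biospecimen")
print(f"min={lo:.2f}s avg={avg:.2f}s max={hi:.2f}s")
```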

@milen-sage for manifest/validate, can we estimate usage? For example, roughly how many concurrent requests does it need to be able to handle, and how many consecutive requests? I think currently only DCA is using the manifest/validate endpoint. Testing and verifying that the solution works will be easier if the goal is clearer.
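
To put numbers on the concurrent vs. consecutive question, a rough load probe could compare serial and threaded request batches, as in the sketch below (same assumed endpoint and field names as the timing sketch above).

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

API_URL = "http://localhost:3001/v1/model/validate"  # assumed route

def one_request(_):
    """Send one validation request and return (status, latency)."""
    start = time.perf_counter()
    with open("benchmark_manifest.csv", "rb") as f:
        resp = requests.post(API_URL, files={"file_name": f})  # assumed field
    return resp.status_code, time.perf_counter() - start

# Consecutive: back-to-back requests on a single thread.
serial = [one_request(i) for i in range(10)]

# Concurrent: the same requests issued from 10 worker threads at once.
with ThreadPoolExecutor(max_workers=10) as pool:
    parallel = list(pool.map(one_request, range(10)))

for label, results in [("serial", serial), ("concurrent", parallel)]:
    times = [t for _, t in results]
    print(f"{label}: avg={sum(times)/len(times):.2f}s max={max(times):.2f}s")
```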

@GiaJordan
Here's the HTAN manifest that I use: https://www.synapse.org/#!Synapse:syn35488819
You should be able to download it directly from Synapse.
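
For anyone reproducing the benchmark, the manifest can be fetched with the `synapseclient` package, something like:

```python
import synapseclient

# Log in with a personal access token (or cached credentials).
syn = synapseclient.Synapse()
syn.login(authToken="<your-personal-access-token>")

# Download the HTAN benchmark manifest referenced above.
entity = syn.get("syn35488819", downloadLocation=".")
print(f"Downloaded to {entity.path}")
```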

  • evaluate latencies, differentiating column map vs. row map column aggregate expectations, and parallelize if needed

After looking back over the code and reviewing my notes, it appears the manual parsing of invalid entries for column aggregate expectations is only a last-resort failsafe and isn't triggered by any of our rules under any of our current testing conditions. I believe this is prevented by how we import our manifests and by our current set of rules. We should leave it in for continued safety, but it's unlikely to be impacting performance at the moment.
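
For context, the control flow being described is roughly the following; the function and key names are hypothetical and only illustrate the fast path vs. the manual failsafe, not the actual schematic implementation.

```python
def invalid_entries(validation_result, column_values):
    """Return indices of invalid rows for one expectation result.

    Prefer row-level detail reported by the validation engine itself;
    fall back to a manual scan only when the result carries none
    (e.g., for aggregate expectations). All names are illustrative.
    """
    detail = validation_result.get("result", {})
    if "unexpected_index_list" in detail:
        # Fast path: the engine already identified the failing rows.
        return detail["unexpected_index_list"]

    # Last-resort failsafe: re-check each value manually. As noted
    # above, none of the current rules/manifests reach this branch.
    # is_valid() stands in for the rule-specific check.
    return [i for i, value in enumerate(column_values) if not is_valid(value)]
```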

Per discussion on Slack, next steps will be to:

  • 1. Fix issues with the logger and logging levels
  • 2. Create test manifests for each rule with a standard number of rows, columns, and errors
    • Will utilize Invalid_test_manifest.csv and create them programmatically during tests (see the generation sketch after this list)
    • #1184
  • 3. Benchmark performance at the operation level (loading, GE, results generation) and at the method level
  • 4. Benchmark API performance (this issue)
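
For step 2, programmatic generation could look like the sketch below: resize a base manifest to a standard row count and seed a fixed fraction of invalid entries in the column a rule targets. The column name, injected value, and sizes are illustrative assumptions.

```python
import pandas as pd

def make_test_manifest(base_csv, n_rows, error_rate, rule_column, bad_value):
    """Build a benchmark manifest of n_rows with seeded invalid entries."""
    base = pd.read_csv(base_csv)
    # Repeat/truncate rows to reach the standard benchmark size.
    manifest = (
        pd.concat([base] * (n_rows // len(base) + 1), ignore_index=True)
        .head(n_rows)
        .copy()
    )
    # Seed errors at evenly spaced rows so every run is reproducible.
    step = max(1, int(1 / error_rate))
    manifest.loc[::step, rule_column] = bad_value
    return manifest

# Illustrative column/value; pick the column the rule under test targets.
df = make_test_manifest("Invalid_test_manifest.csv", n_rows=1000,
                        error_rate=0.05, rule_column="Patient ID",
                        bad_value="not-a-valid-id")
df.to_csv("benchmark_regex_manifest.csv", index=False)
```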

@MiekoHash the previous comment has been updated to reflect progress. Some of the linked PRs are only pending review and ideally will be reviewed at code review tomorrow. I've made progress on the actual API tests for benchmarking; once the prerequisite PRs are merged, there shouldn't be a substantial amount of work left for this ticket specifically. Do we need to create issues for the subtasks, or is the list fine?

Thanks for asking, @GiaJordan. Since each list item has an associated issue ticket, I'd say the list is fine. It was cool to see the test running in action.