storacha/w3filecoin-infra

Buffering lambda execution time gets too high when the number of pieces increases


This recently caused the pipeline to become congested with timeouts. See the aggregates from 6th and 7th January.

Example aggregates to analyse:

  • bafkzcibcaapjqqcrtp4zirffd53geuc6aobzacndjbqf6p7zbcrx77omccdrqoa
  • bafkzcibcaapov2hh7lsjc5qyzng5uzejhl5omqzhzyjctjd6e7jrct6gtkj7yca

We should run the lambda code locally and see what makes the execution time spike.

It's hashes. FML it's hashes.

Aggregates with 40,000+ pieces require 2,000,000+ hashes and there's simply not enough time to do that much work.
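For a rough sense of scale, the relationship between piece count and hash work can be sketched with back-of-the-envelope arithmetic. The per-piece constant below is a hypothetical value chosen purely so the numbers line up with the figures above; the real cost depends on piece sizes and the data-segment tree layout:

```javascript
// Illustrative only: HASHES_PER_PIECE is an assumed average, not a
// measured figure from the aggregate builder.
const HASHES_PER_PIECE = 50

function estimateHashes (pieceCount) {
  // Total hashing work grows linearly with the number of pieces,
  // so large aggregates blow past the lambda's time budget.
  return pieceCount * HASHES_PER_PIECE
}

console.log(estimateHashes(40_000)) // 2000000 under these assumptions
```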

Things I have done:

  • Increase the time allocated to the lambda to the maximum of 15 minutes.
  • Increase the memory allocated to the lambda (apparently this also increases CPU performance). This had no noticeable effect, and I subsequently read that for single-threaded (JS) lambdas, adding memory beyond 4GB yields only marginal gains.
  • Allow a custom hasher to be used in aggregate building: storacha/data-segment#42 and storacha/data-segment#44
  • Use the Node.js native hasher: #107
  • Use the new Node.js crypto.hash(...), which is faster than the createHash + update + digest chain: #108
  • Configure a maximum number of aggregate pieces (effectively an upper bound on the number of hashes that need to be done): storacha/w3up#1566
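For reference, here is a minimal sketch contrasting the chained createHash pattern with the one-shot crypto.hash() helper. crypto.hash() landed in Node.js 21.7 / 20.12, so the sketch guards against older runtimes; the data buffer is just an illustrative stand-in for piece bytes:

```javascript
import crypto from 'node:crypto'

const data = Buffer.from('example piece bytes')

// Older pattern: allocates a Hash object and makes three calls per digest.
const chained = crypto.createHash('sha256').update(data).digest('hex')

// One-shot crypto.hash() (Node.js >= 20.12): less per-digest overhead,
// which adds up over millions of digests. Fall back when unavailable.
const oneShot = typeof crypto.hash === 'function'
  ? crypto.hash('sha256', data, 'hex')
  : crypto.createHash('sha256').update(data).digest('hex')

console.log(chained === oneShot)
```

Both paths produce the same digest; the win is purely in per-call overhead.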