storacha/w3filecoin-infra

Buffering lambda execution time gets too high when the number of pieces increases


This recently caused the pipeline to become congested with timeouts. See the aggregates from 6th and 7th January.

Example aggregates to analyse:

  • bafkzcibcaapjqqcrtp4zirffd53geuc6aobzacndjbqf6p7zbcrx77omccdrqoa
  • bafkzcibcaapov2hh7lsjc5qyzng5uzejhl5omqzhzyjctjd6e7jrct6gtkj7yca

We should run the lambda code locally and see what makes the execution time spike.

It's hashes. FML it's hashes.

Aggregates with 40,000+ pieces require 2,000,000+ hashes and there's simply not enough time to do that much work.
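For a rough sense of scale, the relationship between piece count and hash work can be sketched with back-of-the-envelope arithmetic. The per-piece constant below is a hypothetical value chosen purely so the numbers line up with the figures above; the real cost depends on piece sizes and the data-segment tree layout:

```javascript
// Illustrative only: HASHES_PER_PIECE is an assumed average, not a
// measured figure from the aggregate builder.
const HASHES_PER_PIECE = 50

function estimateHashes (pieceCount) {
  // Total hashing work grows linearly with the number of pieces,
  // so large aggregates blow past the lambda's time budget.
  return pieceCount * HASHES_PER_PIECE
}

console.log(estimateHashes(40_000)) // 2000000 under these assumptions
```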

Things I have done:

  • Increase the time allocated to the lambda to the maximum of 15 minutes.
  • Increase the memory allocated to the lambda (apparently this also increases CPU performance). This had no noticeable effect, and I subsequently read that for single-threaded (JS) lambdas, adding memory beyond 4GB yields only marginal gains.
  • Allow a custom hasher to be used in aggregate building: storacha/data-segment#42 and storacha/data-segment#44
  • Use the Node.js native hasher: #107
  • Use the new Node.js crypto.hash(...), which is faster than the createHash + update + digest chain: #108
  • Configure a maximum number of aggregate pieces (effectively an upper bound on the number of hashes that need to be done): storacha/w3up#1566
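For reference, here is a minimal sketch contrasting the chained createHash pattern with the one-shot crypto.hash() helper. crypto.hash() landed in Node.js 21.7 / 20.12, so the sketch guards against older runtimes; the data buffer is just an illustrative stand-in for piece bytes:

```javascript
import crypto from 'node:crypto'

const data = Buffer.from('example piece bytes')

// Older pattern: allocates a Hash object and makes three calls per digest.
const chained = crypto.createHash('sha256').update(data).digest('hex')

// One-shot crypto.hash() (Node.js >= 20.12): less per-digest overhead,
// which adds up over millions of digests. Fall back when unavailable.
const oneShot = typeof crypto.hash === 'function'
  ? crypto.hash('sha256', data, 'hex')
  : crypto.createHash('sha256').update(data).digest('hex')

console.log(chained === oneShot)
```

Both paths produce the same digest; the win is purely in per-call overhead.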