filecoin-project/rust-fil-proofs

32GiB synthetic PoRep lifecycle test fails

vmx opened this issue · 2 comments

vmx commented

Description

Currently the 32GiB synthetic PoRep lifecycle test is failing.

The last lines of the log are:

2024-04-10T15:56:54.722 INFO storage_proofs_porep::stacked::vanilla::proof > generating synthetic vanilla proofs in a single partition
test test_seal_lifecycle_32gib_porep_id_v1_2_top_8_8_0_api_v1_2 ... FAILED

failures:

---- test_seal_lifecycle_32gib_porep_id_v1_2_top_8_8_0_api_v1_2 stdout ----
thread 'test_seal_lifecycle_32gib_porep_id_v1_2_top_8_8_0_api_v1_2' panicked at 'assertion failed: `(left == right)`
  left: `10`,
 right: `1`', /home/vmx/rust-fil-proofs/storage-proofs-porep/src/stacked/vanilla/proof.rs:144:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

The same failure can also be reproduced with benchy (which might be quicker if you already have a preserved cache):

Create a cache:

RUST_LOG=info FIL_PROOFS_USE_GPU_COLUMN_BUILDER=1 FIL_PROOFS_USE_GPU_TREE_BUILDER=1 FIL_PROOFS_USE_MULTICORE_SDR=1 cargo run --release --features cuda --bin benchy -- porep --size 32GiB --api-features=synthetic-porep --cache /storage/d2/vmx/cache-32g --preserve-cache

Generate the synthetic PoRep:

RUST_LOG=info cargo run --release --features cuda --bin benchy -- porep --size 32GiB --api-features=synthetic-porep --cache /storage/d2/vmx/cache-32g --skip-precommit-phase1 --skip-precommit-phase2 --skip-commit-phase1 --skip-commit-phase2

Acceptance criteria

The test passes again.

Risks + pitfalls

Where to begin

Look into why the partition count is different when used through the FFI. A possible fix could be to change the function that returns the PoRep configuration for the synthetic PoRep case.
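
To make the suggested fix concrete, here is a minimal sketch of what "returning a different configuration for the synthetic PoRep case" could look like. The struct, its fields, and the helper name are all hypothetical and are not the actual rust-fil-proofs types or API. The sketch only assumes what the log above shows: the caller passes a partition count of 10, while synthetic vanilla proofs are generated in a single partition.

```rust
/// Hypothetical, simplified stand-in for a PoRep configuration.
/// This is NOT the real rust-fil-proofs `PoRepConfig`; the field names are assumptions.
#[derive(Debug, Clone, Copy)]
pub struct SketchPoRepConfig {
    /// Partition count of the regular (non-synthetic) PoRep, e.g. 10 in the failing test.
    pub partitions: usize,
    /// Whether the synthetic PoRep API feature is enabled.
    pub synthetic_porep: bool,
}

/// Hypothetical helper: the partition count to use when generating synthetic vanilla proofs.
/// With synthetic PoRep enabled it returns 1 (a single partition); otherwise it
/// falls back to the configured partition count.
pub fn synth_generation_partitions(config: &SketchPoRepConfig) -> usize {
    if config.synthetic_porep {
        1
    } else {
        config.partitions
    }
}
```

With `partitions: 10` and `synthetic_porep: true` the helper returns 1, which is the value the failing assertion expected on its right-hand side. Whether the real fix belongs in the configuration or in the assertion itself is left open here.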

vmx commented

It looks like the assertion at

is just wrong. It was introduced in #1720.

I'll dig deeper into why I added that. If I just remove it and run it with benchy, the verification fails. I don't know if that's related or if it's just a problem with benchy.

Correct, it's wrong and doesn't make sense to assert there.