Challenge 8: Contracts for `SmallSort`

Question

Challenge 8: Contracts for `SmallSort`

Opened this issue 6 months ago · 3 comments

Link to PR: #57
Link to challenge: https://github.com/model-checking/verify-rust-std/blob/main/doc/src/challenges/0008-smallsort.md

Answer 1 · 2025-01-09T15:27:23.000Z

Hi, I have a few questions about this challenge.

Missing safety requirement

I believe merge_down is missing a requirement in the SAFETY comment. Specifically, dst.sub(1) is called, but the SAFETY comment does not guarantee that the resulting pointer remains within the same allocated object. This is not a significant issue since it's a private function, and when it is called, dst.sub(1) is indeed within the same allocated object.

EDIT: I've opened the following issue: rust-lang#135984.

Can the absence of "invalid values" be verified with a separate tool?

"Invalid values" refer to the following scenario: If the input array is [1, 2, 3] and the output array is [1, 1, 1], then although the output array is sorted, an "invalid value" is created. In this example, it is not a safety concern, but it could become one for non-Copy types.

I believe this requires a separate tool because verifying with kani that the multiset (like a set but counting repetitions) of inputs matches the multiset of outputs is extremely slow. For example, sort4_stable already takes approximately 6 minutes to verify.

How arbitrary does `is_less` have to be when verifying correctness?

To verify safety, since the sorting methods are safe and are called with a user-provided is_less, verification has to be done with a function that returns a random boolean (kani::any()).

However, what is_less should be used when verifying correctness? For example, is it sufficient to verify correctness only for the default ordering of integers? Alternatively, should verification be conducted on a custom type with 32 distinct values, allowing for testing of any weak total ordering on these 32 values? This second approach is likely to perform poorly.

Additionally, what types T should be tested?

What is the limit for memory and runtime?

I need to use stub_verified on sort13_optimal because the code that kani analyzes becomes too large when verifying the sorting methods. However, verifying sort13_optimal is very resource-intensive due to the large number of statements it contains, as explained in this comment. It requires 50 GB of memory and takes 40 minutes (38 GB of this is swap memory, which might explain the lengthy runtime on my laptop).

Answer 2 · 2025-01-24T02:27:43.000Z

Missing safety requirement

Thank you. The safety comments should be updated in the upstream if the proof shows that the current comments is actually not enough.

Can the absence of "invalid values" be verified with a separate tool?

Yes, other tools are welcome.

How arbitrary does is_less have to be when verifying correctness?

The ultimate goal is to prove the correctness for arbitrary types and arbitrary ordering. However, we understand that model arbitrary ordering itself is very challenging. So proofs for primitive types with default ordering will be considered as a solid solutions.

Answer 3 · 2025-01-24T18:07:20.000Z

What is the limit for memory and runtime?

We expect to run all proof on CI. So ideally the running time of a single proof should be less than ~2 minutes on a GitHub standard runner. However, if the proof is running into the limitation of Kani/CBMC, we suggest you to open a PR with the proof so that we could merge it once we have made performance improvements that get the CI time down to a reasonable amount.