valkey-io/valkey

[Test Failure] Test failure in 32bit defragmentation

Closed this issue · 9 comments

@zvi-code Can you take a look?

yes, looking

@madolson Building on Ubuntu x86 with the 32-bit flag, as instructed here, does not reproduce the issue; the tests pass. Any advice on how to reproduce it?

Looking at the failures

*** [err]: Active defrag big list: standalone in tests/unit/memefficiency.tcl
Expected 1.43 >= 1.7 (context: type eval line 40 cmd {assert {$frag >= $expected_frag}} proc ::test)
Cleanup: may take some time... OK
*** [err]: Active defrag big list: standalone in tests/unit/memefficiency.tcl
Expected 1.62 >= 1.7 (context: type eval line 40 cmd {assert {$frag >= $expected_frag}} proc ::test)
Cleanup: may take some time... OK
*** [err]: Active defrag big list: standalone in tests/unit/memefficiency.tcl
Expected 1.30 >= 1.7 (context: type eval line 40 cmd {assert {$frag >= $expected_frag}} proc ::test)
Cleanup: may take some time... OK

These failures mean we didn't reach the desired fragmentation. That is not strictly related to the defrag mechanism; it is the preconditioning phase of the defrag test that is not met. My PR showed a slight improvement in defrag time, so maybe it also affected timing somehow.

I think it's a bit random, so run 10 or 100 times to reproduce it.
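For anyone trying this, here is a minimal sketch of a repeat-runner that counts how many runs fail. The helper name `count_failures` is made up for illustration, and it assumes the test command exits non-zero when a test fails.

```python
import subprocess
import sys


def count_failures(cmd, runs):
    """Run `cmd` `runs` times and return how many runs exited non-zero.

    Hypothetical helper: assumes the command signals a test failure
    through its exit status.
    """
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace it with the
    # real test invocation for the flaky test.
    always_ok = [sys.executable, "-c", "import sys; sys.exit(0)"]
    print(count_failures(always_ok, 3))
```

Something like `count_failures(["./runtest", "--single", "unit/memefficiency"], 100)` would be the intended use, assuming that is how the test is launched in your checkout.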

Should the test case work harder and create more fragmentation, to be sure it meets the precondition?

That is my assumption. Maybe wait a little longer: the fragmentation achieved is not identical on every run (we do not validate that the value is in some range, only that it is at least some value), which indicates the test is not deterministic.
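To make the "wait a little longer" idea concrete, here is a rough sketch of polling a reading until it reaches a threshold instead of checking it once. `wait_for_at_least` and its parameters are hypothetical, not part of the test suite, and the sample readings in the demo are illustrative only.

```python
import time


def wait_for_at_least(read_value, threshold, timeout_s=5.0, interval_s=0.05):
    """Poll read_value() until it returns >= threshold or the timeout expires.

    Returns the last value read, so the caller can still assert on it.
    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout_s
    value = read_value()
    while value < threshold and time.monotonic() < deadline:
        time.sleep(interval_s)
        value = read_value()
    return value


if __name__ == "__main__":
    # Illustrative readings: fragmentation creeping up over time.
    samples = iter([1.30, 1.43, 1.62, 1.75])
    frag = wait_for_at_least(lambda: next(samples), 1.7, timeout_s=2.0, interval_s=0.01)
    assert frag >= 1.7
```

This only hides timing noise. If the memory is never released, waiting longer will not help.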

I was able to reproduce it (only using daily.yml). After several attempts, I figured out that the default for lazyfree-lazy-user-del is yes, so when we delete an object, the actual free depends on when the bio lazy-free thread gets to it.
With lazyfree disabled for the specific failing test, the 32-bit run passes.
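To illustrate why this matters for the precondition, here is a toy model (not Valkey code; the class and its methods are invented for this sketch). With lazy deletion, memory measured right after a delete still includes the deleted object, because a background thread has not freed it yet.

```python
import queue
import threading


class ToyAllocator:
    """Toy model only: tracks live bytes and frees inline or in the background."""

    def __init__(self, lazy):
        self.lazy = lazy
        self.live_bytes = 0
        self._lock = threading.Lock()
        self._pending = queue.Queue()
        # The gate lets the demo decide when the background worker may run,
        # which makes the race deterministic for illustration.
        self._gate = threading.Event()
        threading.Thread(target=self._drain, daemon=True).start()

    def alloc(self, size):
        with self._lock:
            self.live_bytes += size

    def delete(self, size):
        if self.lazy:
            self._pending.put(size)  # deferred: freed later by the worker
        else:
            self._free(size)         # synchronous: freed before returning

    def _free(self, size):
        with self._lock:
            self.live_bytes -= size

    def _drain(self):
        self._gate.wait()
        while True:
            size = self._pending.get()
            self._free(size)
            self._pending.task_done()

    def wait_idle(self):
        """Release the worker and block until all deferred frees are done."""
        self._gate.set()
        self._pending.join()


if __name__ == "__main__":
    lazy = ToyAllocator(lazy=True)
    lazy.alloc(100)
    lazy.delete(100)
    print(lazy.live_bytes)  # still counts the deleted object
    lazy.wait_idle()
    print(lazy.live_bytes)  # freed once the worker has run
```

In the synchronous case the same alloc/delete pair leaves nothing live as soon as `delete` returns, which is the behaviour the preconditioning step seems to rely on.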

I will run more tests, but assuming this is validated, I would suggest running defrag without lazyfree, at least for the preconditioning stage. WDYT? @madolson, @zuiderkwast, @ranshid

Good findings! @enjoy-binbin changed the default to lazy del in #913

Let's disable lazy del for this test case, yes. That was done in other test cases when we changed the default but we must have missed this one.