bakwc/PySyncObj

test_checkBigStorage randomly fails with an AssertionError


test_checkBigStorage sometimes fails with an AssertionError on line 595 or 596. The failures are intermittent, which suggests a timing issue: one of the stopFuncs might need an additional condition (e.g. the one after the log compaction, perhaps to ensure that the data was indeed dumped to disk).
I've seen failures on both lines mentioned above, i.e. there are cases where o1 has the correct value but o2 doesn't, and vice versa. In all cases I've seen, getValue('test') returns None, i.e. the value is missing entirely, not corrupted.

My platform is a Debian machine with Python 3.6.

Bash command to run this test repeatedly:

declare -i i=0; while [[ $i -lt 10 ]]; do pytest -k 'test_checkBigStorage' test_syncobj.py; i+=1; done

I just noticed that some files dump1.bin.1.tmp etc. are left over in the PySyncObj directory after these test failures. The tmp extension again suggests that the issue is related to the log compaction/serialiser.

This leftover temporary file is the one produced by Serializer.setTransmissionData. The filename corresponds to the object that fails, i.e. if the assert for o1 fails, dump1.bin.1.tmp remains in the directory.

In the meantime, I've also seen cases where no .tmp files were left over but only dump1.bin and/or dump2.bin. I've also had a case where there was both a dump1.bin and a dump1.bin.1.tmp (and a dump2.bin, in this particular case).

The failures happen especially when the machine is under high load, so it really looks timing-based. My theory is that the serialisation and/or deserialisation doesn't finish before the doTicks timeout is reached. Perhaps it would be wise to add a method or two to SyncObj to wait for the (de)serialisation to complete.
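Until something like that exists, a test could approximate "the dump has settled" from the outside by looking at the files on disk. The following is only a rough sketch: `dump_settled` is a hypothetical helper, not part of PySyncObj, and it assumes that a finished dump shows up as the final file with no `*.tmp` siblings left next to it, which is a heuristic based on the leftovers described above rather than a guarantee.

```python
import glob
import os


def dump_settled(dump_path):
    """Heuristic check: the final dump file exists and no temp files remain.

    Temp files are assumed to be named '<dump_path>*.tmp', which matches
    names like 'dump1.bin.tmp' and 'dump1.bin.1.tmp' seen after the failures.
    """
    # glob.escape guards against special characters in the directory name.
    leftovers = glob.glob(glob.escape(dump_path) + "*.tmp")
    return os.path.isfile(dump_path) and not leftovers
```

Such a check could be combined with the existing condition in a stopFunc, e.g. `lambda: existing_condition() and dump_settled('dump1.bin')`, where `existing_condition` stands for whatever the test already waits on.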

Also, I think it might be a good idea to let a test fail entirely if doTicks is stopped because of the timeout rather than stopFunc, at least in most cases.
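To illustrate the idea, here is a minimal sketch of a stricter tick loop. It is not the existing doTicks from the test suite: `do_ticks_strict`, `TickTimeout` and the `tick` callable are hypothetical names, and `tick` is assumed to advance all objects under test by one step.

```python
import time


class TickTimeout(AssertionError):
    """Raised when stop_func never became true before the timeout."""


def do_ticks_strict(tick, timeout, stop_func, interval=0.05):
    """Call tick() repeatedly until stop_func() is true.

    Unlike a loop that silently returns on timeout, this raises
    TickTimeout so the test fails at the point where it got stuck.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        tick()  # hypothetical: advances every object once
        if stop_func():
            return
        time.sleep(interval)
    raise TickTimeout("stop condition not met within %.2f s" % timeout)
```

With this, a timeout shows up as a failure in the waiting step itself instead of as a later assert on getValue('test'), which should make it easier to tell a slow machine apart from a real bug.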

Correction: I meant the dumpN.bin.1.tmp file, not dumpN.bin.tmp (which would be produced by Serializer.serialize). I've corrected the comments above.