rtfb/riscv-hobby-os

Integration test is flaky

rtfb commented

It seems to be very rare, but I just stumbled into a case where the integration test broke because the parked hart printed its status faster than the init hart did:

--- testdata/want-output-u64.txt	2022-01-09 14:54:24.705689947 +0000
+++ out/test-output-u64.txt	2022-01-09 14:55:07.326003981 +0000
@@ -1,8 +1,8 @@
+cpu parked: 1
 kinit: cpu 0
 Reading FDT...
 FDT ok
 bootargs: dry-run
 kprintf test several params: foo, 0xF10A, 0
-cpu parked: 1
 KKKK
 qemu-launcher: killing qemu due to timeout
make: *** [Makefile:94: out/test-output-u64.txt] Error 1
Error: Process completed with exit code 2.

This failure makes sense: the harts execute independently, so nothing guarantees that the init hart prints before the parked one. We need some synchronisation primitive to get a guaranteed output order.
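One possible shape for such a primitive, as a minimal sketch rather than what the repo actually does: a one-shot flag that the init hart sets after its last boot message, and that parked harts spin on before printing theirs. The names `init_done`, `boot_sync_release`, `boot_sync_ready` and `boot_sync_wait` are hypothetical, and the sketch assumes C11 atomics are usable in the kernel build.

```c
#include <stdatomic.h>
#include <stdbool.h>

// Hypothetical one-shot flag: false until the init hart has finished
// printing its boot messages.
static atomic_bool init_done = false;

// Assumed to be called by the init hart after its last boot-time kprintf.
// The release store makes its earlier writes visible to harts that
// observe the flag with an acquire load.
void boot_sync_release(void) {
    atomic_store_explicit(&init_done, true, memory_order_release);
}

// Non-blocking check: has the init hart released the other harts yet?
bool boot_sync_ready(void) {
    return atomic_load_explicit(&init_done, memory_order_acquire);
}

// Assumed to be called by a parked hart before it prints its status.
// Busy-waits until the init hart calls boot_sync_release().
void boot_sync_wait(void) {
    while (!boot_sync_ready()) {
        // spin
    }
}
```

With this in place, a parked hart would call `boot_sync_wait()` before printing "cpu parked", and the init hart would call `boot_sync_release()` once its own output is done. This only orders the parked harts after the init hart; it does not order several parked harts relative to each other.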

rtfb commented

The issue with parked harts was fixed in fe3db6a by simply not printing that string. However, the tests are now flaky in another way:

--- testdata/want-smoke-test-output-u64.txt	2022-10-20 17:29:02.503101310 +0000
+++ out/smoke-test-output-u64.txt	2022-10-20 17:31:46.356986692 +0000
@@ -19,5 +19,6 @@
 1    S      sh
 4    S      hang
 6    R      ps
-
+q
 qemu-launcher: killing qemu due to timeout
+emu-system-riscv64: terminating on signal 15 from pid 4416 (python3)
make: *** [Makefile:227: out/smoke-test-output-u64.txt] Error 1

This seems to happen predominantly with the 64-bit version, for some reason.