projectNe10/Ne10

Regression test failures

Opened this issue · 4 comments

I added some changes to run the regression tests with QEMU usermode on CircleCI and noticed that some of the tests were failing. I thought maybe it was an issue with QEMU, but I also get similar failures when testing on a Raspberry Pi.

Overall build: https://circleci.com/workflow-run/7518e578-47b5-419e-a484-78715f4bfd2b
Example failing build: https://circleci.com/gh/thughes/Ne10/336 (logs available in Artifacts tab)

Example test output logs: https://336-128289476-gh.circle-artifacts.com/0/home/circleci/project/test_logs/NE10_physics_unit_test_static_regression.txt

Snippet:

test_compute_aabb_vec2f_conformance Line 93    Expected 1.459758e+04 (0x46641653) but was 1.459758e+04 (0x46641652) at vector->3 
----vertex_count 33
test_compute_aabb_vec2f_conformance Line 93    Expected 1.459758e+04 (0x46641653) but was 1.459758e+04 (0x46641652) at vector->3 
----vertex_count 34
test_compute_aabb_vec2f_conformance Line 93    Expected 1.459758e+04 (0x46641653) but was 1.459758e+04 (0x46641652) at vector->3 
----vertex_count 35
test_compute_aabb_vec2f_conformance Line 93    Expected 1.459758e+04 (0x46641653) but was 1.459758e+04 (0x46641652) at vector->3 
----vertex_count 36
test_compute_aabb_vec2f_conformance Line 93    Expected 1.459758e+04 (0x46641653) but was 1.459758e+04 (0x46641652) at vector->3 
----vertex_count 37
test_compute_aabb_vec2f_conformance Line 93    Expected 1.459758e+04 (0x46641653) but was 1.459758e+04 (0x46641652) at vector->3 
----vertex_count 38
test_compute_aabb_vec2f_conformance Line 93    Expected 1.459758e+04 (0x46641653) but was 1.459758e+04 (0x46641652) at vector->3 
----vertex_count 39
----vertex_count 40

I think the difference comes from the small difference between the C function and the NEON function.
IMHO, a bit-wise comparison seems far too strict.

Yes, I definitely see what the test is doing (not bitwise):

// this function checks whether the difference between two ne10_float32_t values is within the acceptable error range
ne10_int32_t EQUALS_FLOAT (ne10_float32_t fa, ne10_float32_t fb , ne10_uint32_t err)
{ ne10_float32_t tolerance = (ne10_float32_t)err / 200000;
if (abs(fa-fb) <= tolerance)
{
return 1;
}
else
{
return 0;
}
}

I am more interested in knowing whether the tests should be passing in the latest version of the code. Did they ever pass or were the tests more of a "hopefully we will get them to pass someday" kind of thing?

EQUALS_FLOAT is comparing the difference of two floating point values against a fixed tolerance, 1/20_000. For a value as large as 1.459758e+04, it means no bit-wise difference is allowed.
In the NEON functions, there are optimizations like reordering of operands, or combining of mul and add. They introduce error (compared to the C version) because of different rounding.
Take IIR for an example, I tried revert all change to the NEON IIR function and run the test: https://circleci.com/workflow-run/27c0097f-3eff-48f4-aca8-4d8844fb328d

Similar failure:
test_iir_lattice_case0 Line 327 Expected 6.432297e+01 (0x4280A55C) but was 6.432298e+01 (0x4280A55D) at vector->0

I don't think they ever pass, and I suggest we apply change like commit 6ead727 to physics to avoid the failed tests.

P.S. thank you very much for enabling CI. @thughes

@StanLSun In the build I referenced, the math tests had failures even though they already use the NE10_SRC_ALLOC_LIMIT macro, which limits the range of the randomly generated number:
https://336-128289476-gh.circle-artifacts.com/0/home/circleci/project/test_logs/NE10_math_unit_test_static_smoke.txt