Casal2/CASAL2

Random failures in ModelRunner

Zaita opened this issue · 0 comments

Describe the bug
The ModelRunner validation process has been failing at random. This process runs a subset of models and checks each objective function value against a known reference value to confirm that Casal2 reproduces it.

Release version(s) and/or repository branch(es) affected?
V1.0, V1.1, All branches.

This has been fixed in Master.

Operating system type and version (and build tools types and versions, if applicable)
Any/All. Whether the failure manifests is random.

Steps to reproduce the bug
Build a release betadiff binary, then run the ModelRunner. Any validation failures it reports may be caused by this bug.

Expected behavior
ModelRunner would pass all validation tests regardless of platform.

Actual behavior
The Azure DevOps Windows VM failed 3 tests.

Screenshots
N/A

Additional context
This bug was introduced by a combination of two changes made to the Asserts.ObjectiveFunction object. The first change modified the tolerance check from a static (absolute) value to a relative value.
12 Nov 2019 - 85c75a2
This introduces a bug where the validation may pass when it should fail. It implicitly loosens the check: once the comparison is relative to the objective score, the allowed absolute difference scales with the size of that score, so for any score with a magnitude greater than 1 it exceeds the stated tolerance. A relative check shouldn't be used here, as the value Casal2 is checking against is supplied by the user/developer and is known. So any difference > tolerance should be considered a calculation error.
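To illustrate the difference, here is a minimal Python sketch of the two kinds of check. It is not the Casal2 code: the function names, the exact form of the relative comparison, and the example values are all assumptions made for illustration.

```python
# Hypothetical sketch, not the actual Asserts.ObjectiveFunction implementation.

def passes_absolute(expected, actual, tolerance=1e-5):
    """Pass only if the raw difference is within the tolerance."""
    return abs(actual - expected) <= tolerance


def passes_relative(expected, actual, tolerance=1e-5):
    """Pass if the difference is within tolerance * |expected| (assumed form)."""
    return abs(actual - expected) <= tolerance * abs(expected)


# Illustrative values only; not taken from a real model run.
expected = 1500.0
actual = 1500.01  # off by roughly 0.01, far more than 1e-5

print("absolute check passes:", passes_absolute(expected, actual))
print("relative check passes:", passes_relative(expected, actual))
```

With these assumed values the absolute check rejects the result, while the relative check accepts it because its allowed difference grows to 1e-5 * 1500 = 0.015.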

The second change, which created the random bug, tried to counter the first change slightly by tightening the tolerance used for the comparison.
10 Jan 2020 - fbf5121
This changes the tolerance value from 1e-5 to 1e-6. The issue with this is that the model runner files have user-defined values with 5 decimal places of precision, so they are comparable to at most 1e-5. Changing this to 1e-6 means that the final digit used for comparison will be randomly generated based on current memory conditions of the host system.
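The precision limit can be sketched in Python as follows. This is only an illustration: the numbers are invented, plain absolute comparisons are used for simplicity, and the sketch shows the effect of recording a reference value to 5 decimal places rather than the platform-dependent variation described above.

```python
# Hypothetical sketch; the values are invented for illustration.

computed = 1234.567894          # what a model run might produce (assumed)
recorded = round(computed, 5)   # reference stored with 5 decimal places

difference = abs(computed - recorded)  # about 4e-6, lost when recording

within_1e5 = difference <= 1e-5  # tolerance before the second change
within_1e6 = difference <= 1e-6  # tolerance after the second change

print("within 1e-5:", within_1e5)
print("within 1e-6:", within_1e6)
```

Under these assumptions a correct result still sits within 1e-5 of its recorded reference, but not within 1e-6, because the sixth decimal place was never stored.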

This second change increased the accuracy requirements of the check, partly negating the first change, but in doing so it also added a random component.

Pull requests welcome!
This is an Open Source project - please consider contributing a bug fix
yourself (please read the Contributors Manual and the Contributors Guidelines before starting any work).