FMI Import cross-check rules - what are suitable "reference values"?
ghorwin opened this issue · 2 comments
Hi all,
since we just had a discussion about "validating a model implementation with measurement data", the following analogy to FMI co-simulation testing came up.
When the purpose is to test the correct implementation (i.e. standard compliance) of a model, differences between simulated results and 'reference' results must be clearly attributable to the implementation aspect.
Consider measurement data vs. simulation data comparison:
real data -> measurement errors/uncertainties -> reference values
model parameter + algorithm -> implementation -> simulation results
Abstract:
- refValues = operatorM(real data, uncertainties, measurement errors)
- simValues = operatorS(model input/parameters, algorithms, implementation errors)
Obviously, deviations between refValues and simValues may be caused by different influencing factors in the operators M and S.
The same applies to co-simulation (specifically, importing an FMU and running a co-simulation).
coSimResults = operatorCS(algorithm, numerical parameters, implementation errors)
assuming model inputs/parameters are given in a uniquely interpretable way (this is not yet fully specified; for example, is linear interpolation of tabulated input values actually required?).
With respect to differences in results, one of the most influential numerical parameters is the constant communication step size (limiting this discussion to CS 1).
It is also assumed that test FMU implementations are 'well-behaved' with respect to numerical stability -> the generated results should converge as the communication step size decreases. Alas, that appears not always to be the case.
However, suppose a test FMU is well-behaved, the input data is fully given, and the algorithm (e.g. fixed time stepping forward with a given constant step size) is specified -> each FMU importing tool should then generate pretty much the same results, up to rounding errors (as discussed, for example, in modelica/fmi-standard#575).
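To make the "same algorithm, same step size, same results" argument concrete, here is a minimal sketch of a fixed-step master loop. It does not use any real FMI library: `ToySlave`, `do_step`, `get_output` and `run_master` are hypothetical names, and the slave is just a stand-in that integrates dx/dt = -x internally.

```python
class ToySlave:
    """Hypothetical stand-in for a CS 1 slave (not a real FMU).

    Internally integrates dx/dt = -x with one explicit Euler step
    per communication step.
    """

    def __init__(self, x0=1.0):
        self.x = x0

    def do_step(self, t, h):
        # Advance the internal state from t to t + h.
        self.x += h * (-self.x)

    def get_output(self):
        return self.x


def run_master(slave, t_end, h):
    """Fixed-step master: constant communication step size h, no post-processing."""
    n_steps = round(t_end / h)
    times = [0.0]
    values = [slave.get_output()]
    for i in range(n_steps):
        slave.do_step(i * h, h)
        times.append((i + 1) * h)
        values.append(slave.get_output())
    return times, values


# Two independent "importing tools" running the same algorithm with the
# same step size should produce the same output samples.
t_a, y_a = run_master(ToySlave(), t_end=1.0, h=0.1)
t_b, y_b = run_master(ToySlave(), t_end=1.0, h=0.1)
```

With everything specified (slave, inputs, algorithm, step size), any remaining difference between `y_a` and `y_b` in a real setup would have to come from the master implementation, which is the point of the test.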
So, if testing the correct implementation of a master scheme is also a purpose of FMI cross-checking (which I assume it is), I suggest providing reference results that have actually been generated with the FMU itself.
Currently, it appears that quite a few CS 1 test cases have reference results generated from an implicit (fully Modelica) simulation, which cannot be reproduced even by a fully conforming FMU co-simulation master.
One example of such a case is ControlBuild/Drill (see attached published "passed" results).
When running the case with ever smaller steps (MasterSim uses 0.0001 s steps with an output interval of 0.01 s in this example), the results "converge" to a step-like function. The provided reference solution cannot be obtained -> in the analogy above, a different operator has been used to generate the results.
With respect to the test criteria, I would suggest the following rules for publishing cross-check reference results that aim at testing correct FMI importing functionality:
- The reference solution has to be generated with the provided FMU itself.
- The algorithm selected for generating the reference results shall be a standard FMI co-simulation algorithm (i.e. "explicit Euler"-type stepping with a constant step size) without custom value post-processing (moving averages, smoothing techniques etc.).
- The step size shall be selected such that a further refinement of the communication step size does not significantly alter the results (-> time grid refinement study).
- The output frequency shall be selected such that significant effects remain observable when the final results are plotted with linear line segments between points.
- The time interval shall be selected such that re-sampling at 1000 equidistant points (-> current cross-check validation algorithm) preserves relevant information (steep gradients). Otherwise, the simulation time range has to be shortened accordingly.
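The refinement and re-sampling rules above could be checked roughly as follows. This is only a sketch under stated assumptions: `simulate` is a toy stand-in for one co-simulation run of the FMU (dx/dt = -x with explicit Euler), the tolerance `tol` and the halving strategy are arbitrary choices, and `resample` is a plain linear interpolation onto 1000 equidistant points, not the actual cross-check validation code.

```python
def simulate(h, t_end=1.0):
    """Toy stand-in for one co-simulation run with communication step size h."""
    n = round(t_end / h)
    x = 1.0
    times, values = [0.0], [x]
    for i in range(n):
        x += h * (-x)  # explicit Euler step for dx/dt = -x
        times.append((i + 1) * h)
        values.append(x)
    return times, values


def resample(times, values, n_points=1000):
    """Linearly interpolate onto n_points equidistant points over the time range."""
    t0, t1 = times[0], times[-1]
    out = []
    j = 0
    for k in range(n_points):
        t = t0 + (t1 - t0) * k / (n_points - 1)
        # Move to the segment [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        ta, tb = times[j], times[j + 1]
        w = (t - ta) / (tb - ta)
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out


def refinement_study(h0=0.1, tol=1e-3, max_halvings=12):
    """Halve h until the re-sampled results change by less than tol."""
    h = h0
    prev = resample(*simulate(h))
    for _ in range(max_halvings):
        h /= 2
        cur = resample(*simulate(h))
        diff = max(abs(a - b) for a, b in zip(prev, cur))
        if diff < tol:
            return h, diff
        prev = cur
    raise RuntimeError("results did not converge under step size refinement")


h_ref, last_diff = refinement_study()
```

If `refinement_study` raises for a given FMU, the test case would not be "well-behaved" in the sense discussed above and would be a poor candidate for a reference result.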
Sorry for the long ticket texts... this should become an article at some point ;-)
-Andreas
The validation against the reference results was introduced to reject complete nonsense results (e.g. all-zero), missing variables etc., of which we had a lot in the old SVN repo.
Passing the validation (with its very generous delta) cannot be used as proof of compliance. If we decide to introduce "Reference FMUs" (a working group with the same name was set up at the Design Meeting in Renningen), we might introduce stricter rules for these FMUs in the next Cross-Check Rules.
Would it be meaningful then to distinguish between "basic functionality" (i.e. FMI importing tool is able to read, analyse and run the FMU without generating nonsense) and "correct calculation behavior"?
If so, an FMI importing tool may get a two-level compliance certificate ("one or two stars"), but that would make it even more difficult for users to differentiate between tools. Ideally, each "compliant" tool would be able to generate correct results, so that engineers can rely on the results.
However, for the "basic functionality" check, I agree that the current rules are sufficient.