softwarelanguageslab/maf

Persisting the analysis state

Opened this issue · 3 comments

Currently, it is not possible to save the state of an analysis to a file, and to restore it afterwards. This hampers (performance) testing and debugging, especially for incremental analyses: to test an incremental update, a full initial analysis must be run beforehand (to deliver the analysis state that is to be updated), and a full reanalysis must be run afterwards (to verify the correctness of the analysis). Therefore, adding persistent storage of analyses will lead to a major improvement in the debug and test cycle, and can be useful for benchmarking as well.

Analyses can possibly be stored using the .json format.

Analyses can now be (de-)serialised. (See commit 89d5a7f)
However, the deserialised analysis will not use the same value domain. Hence, values cannot be compared to other analyses, but would have to be converted.

I have tried unsuccessfully at using other serialization frameworks such as jackson, borer, and circe.

The main issue stems from our open design. Given the lack of documentation and cryptic errors from each serialization library, I wanted to go in baby steps. Abstract values make the most sense to have serialized first, but:

1. To serialize Value in ModularLattice (single abstract values), you need to serialize L too (collections of abstract values) because e.g., Cons(car: L, cdr: L) is a Value, and to serialize L you need to serialize Value, so you have some circularity here already
2. To serialize Value, you also need to serialize Address, for pointers
3. To serialize Address, which is really an open type extended in multiple places, you need to serialize Component (due to ReturnAddr for example)
4. To serialize Component, you need to serialize ComponentContext, and Environment[Address] as well as to integrate serialization in all places where you can bind Component
5. To serialize ComponentContext, you need to integrate serialization in all places where you can bind it
6. To serialize Environment[Address], you need to serialize... Address (cf. 3), which introduces circular serialization issues again

So doing it in baby steps is not possible, and this requires a careful analysis to have a clean implementation

Having this feature would greatly improve debugging, as it permits running an analysis once and inspecting the results afterwards in different manners, without having to rerun the analysis when one wants to inspect another aspect of the result. This could also help with debugging other code, such as benchmarking code (e.g., for the precision evaluation), when unexpected results are obtained.