sosy-lab/sv-benchmarks

Broken generation script

hernanponcedeleon opened this issue · 5 comments

The script c/nla-digbench-scaling/generate.py is used to automatically generate version of the benchmarks in c/nla-digbench/.

I think the script assumes in lines 78 and 82 that the property file contains only one property (if not, it might change the expected results in all the categories and not only in unreach-call with I think is the intended behaviour).

The script does not generate a scaled version if the there is more than one property. This might have been the case when the script was created, but it is not anymore since we fixed overflows in the nla-digbench benchmarks.

I think the script assumes in lines 78 and 82 that the property file contains only one property (if not, it might change the expected results in all the categories and not only in unreach-call with I think is the intended behaviour).

Yes, I wrote the script that way to deal with multiple properties once that occurs an not silently ignore such cases. Thanks for the hint, I will have a look at how this can be fixed! Overflows might not be present in the bounded versions, so this requires some inspection of the actual tasks.

but it is not anymore since we fixed overflows in the nla-digbench benchmarks.

Can you point me to the PR in which the overflows were fixed? I don't seem to find it. EDIT: I think I found it: #1155

That PR actually didn't add the no-overflow:true labels for the tasks that are fine, right? I could add them in the fix for this issue if you agree that this is desired.

Actually, I think what we want to do (at least initially) is to generate the scaled versions for those that DO NOT HAVE overflows.

I agree that it would also be good to have scaled version for the no-overflow property, but I'm more worried about having unsound benchmarks for reachability. I think the fix for the reachability ones should be easier than the no-overflow (as you said, those require some per-case inspection).

That PR actually didn't add the no-overflow:true labels for the tasks that are fine, right? I could add them in the fix for this issue if you agree that this is desired.

Yes it did, check any of the e.g. *-ll.yml files.

Actually, I think what we want to do (at least initially) is to generate the scaled versions for those that DO NOT HAVE overflows.

Yes, totally agree, that is what I just did in #1200, could you have a look?

Yes it did, check any of the e.g. *-ll.yml files.

not for all of them, 8 seem to have been forgotten:

sv-benchmarks/c/nla-digbench $ grep overflow *.yml |wc -l
45

sv-benchmarks/c/nla-digbench $ ls *.yml |wc -l
53

Yes, totally agree, that is what I just did in #1200, could you have a look?

Sure.

not for all of them, 8 seem to have been forgotten:

sv-benchmarks/c/nla-digbench $ grep overflow *.yml |wc -l
45

sv-benchmarks/c/nla-digbench $ ls *.yml |wc -l
53

Those 8 are probably the original benchmarks that did not have overflow and thus not modified in #1155.
I'll add the property in a separate PR.

This has been solved already.