sosy-lab/sv-witnesses

Alternative Format for Correctness Witnesses

MartinSpiessl opened this issue · 17 comments

Mostly, we are just interested in the (loop) invariants (i.e. formulas and the locations in the program where they hold) inside a correctness witness. The current GraphML-based format that describes protocol automata is actually more expressive, since we can e.g. encode things like "I want the invariant to hold on every third loop iteration".

However my experience from having inspected many witnesses shows that most tools actually do not make use of this, neither do many validators. But the complicated GraphML format probably hinders the otherwise easy task of outputting invariants.

In practice this has especially led to the fact that a tool will just output a minimal witness with only one node and no invariants.

I therefore propose an alternative format for the correctness witnesses which is easier to output by a tool and which could even be translated into a witness in the current format by a separate tool (would be fairly easy to implement into CPAchecker).

My proposal is to have the following format that is easy to create and parse. Let's start with an example as this will make things clearer:

WTNS/0.1
witness-type: correctness_witness
sourcecodelang: C
producer: CPAchecker 1.6.1-svn
specification: CHECK( init(main()), LTL(G ! call(__VERIFIER_error())) )
programfile: multivar_true-unreach-call1.i
programhash: a2577bafced442fc9d7239226a9a4234883426ad
architecture: 32bit

--- multivar_1-1.c      2020-09-29 16:58:36.541697057 +0200
+++ multivar_1-1.c      2020-09-29 17:05:35.816261163 +0200
@@ -14,10 +14,12 @@
   unsigned int x = __VERIFIER_nondet_uint();
   unsigned int y = x;

+  /*@ loop invariant x==y;*/
   while (x < 1024) {
     x++;
     y++;
   }

+  //@ assert x<=1024;
   __VERIFIER_assert(x == y);
 }

We separate a witness into a header and a body, very similar to the HTTP protocol. The first line contains the protocol name and version (in case we want to change the format in the future). Then we have on each line our metadata in the format <key>: <value>. For the specification this would limit us to have single line specifications, but I think this actually acceptable. Currently our specs all fit one line and they will probably not get larger in the near future.

We signal the end of the header by an empty line. After this we put the body, which consists of a patch (created via diff -u) that
will show where in the program we would need to add annotations (I propose ACSL annotations, as these are well documented and specified, cf. http://frama-c.com/acsl.html) that have the semantics we want. For now it would be enough to have the loop invariant and maybe the assert clauses. Most validators will really only want the loop invariants currently. This would naturally allow us to also specify more complex features in the witnesses in the future.

A nice side effect is that the witnesses would also sometimes be able to match the annotations even if the program changed slightly. If not, users can easily understand why such a patch doesn't apply, because this is well-known from standard software development, e.g. with git. One can also just apply the patch and get a program that can then have a file that can be passed to tools that understand these annotations. Also users could easier create own witnesses manually by just adding the right annotations and creating the patch (and putting it into a file with the header in front). The size of correctness witnesses might also reduce greatly by this.

This format builds upon already well known and mature concepts like the HTTP protocol, ACSL, and the diff/patch tools.

Just a quick comment: I like the principal idea very much, as it also addresses the issue that the current correctness witness format does not support quantifiers and function contracts with \old and \result. It would be good to add function contracts (ACSL 1.15 §2.3.2) and the builtin constructs \old and \result (ACSL 1.15 §2.3.1) to this format.

I also like it. Do you have a proposal how to use this new format for violation witnesses, as well?

For the record: One of the reasons for the current format was that it really expresses only the witness, and there is no way for it to (accidentally) contain changes to the actual program. Because of that for example it was rejected to let the verifier output a changed program that includes an invariant. With this format, we would still need to make sure somehow that only valid changes to the program are represented in the diff.

Also, I am not sure whether it is really easy to generate? For example, we would probably have to dump out the program and then manually call diff. But this is inconvenient and would not work on systems where diff is not installed.

Why not to use just a list of line:annotation? E.g.:

WTNS/0.1
witness-type: correctness_witness
sourcecodelang: C
producer: CPAchecker 1.6.1-svn
specification: CHECK( init(main()), LTL(G ! call(__VERIFIER_error())) )
programfile: multivar_1-1.c
programhash: a2577bafced442fc9d7239226a9a4234883426ad
architecture: 32bit

file: multivar_1-1.c
18: @ loop invariant x==y
23: @ assert x<=1024

The validator would read the list and associate the annotations with the locations of the program before stepping onto the line. It should be rather easy to generate such a witness and easy to generate the human-friendly diff for the user.

I also like it. Do you have a proposal how to use this new format for violation witnesses, as well?

I do not see an immediate way to do this with ACSL annotations. Both concepts (annotations and witnesses) are actually equivalent in their expressiveness if you allow ghost variables in the annotations because that way you could track state. But this will then be way more verbose than I would like it to be, the goal here is to make it simple.

But actually all you really need for violation is the sequence of branching decisions at each branching location. So something like annotating each branching with something as simple as //@ 1 0 1 1 0 1 would be enough (this is not ACSL syntax though!). You could even mark some branching decisions with "don't care", lets name this x, then //@ 1 x 0 1 would mean go right, don't care, go left, go right. At first it might look like a problem that we cannot mark arbitrary long cycles here, but in the end a violation witness should really show a concrete counterexample, and this can always be represented with a finite sequence of branching decisions. We could of course also add arbitrary long cycles by a nondeterministic marker in the sequence, but now it gets too detailed.

If you have any insights on this from your experience with testing I would like to hear your suggestions, but I guess it would make more sense to discuss this independently (new issue?) from this issue where we talk about correctness witnesses.

Why not to use just a list of line:annotation? E.g.:

Yes, that also came to my mind. We could even have a switch in the header that easily allows different formats like the one you describe in the body. The problem here is that it assumes that a line is enough to exactly pinpoint the location where the annotation should be applied. But in theory I could strip all newlines from a C file and it would still be valid, and then this approach would be imprecise. We could extend the key to be offsets (bytes counting from the beginning of the file), this would be largely equivalent to a patch! So we are not really talking about different things, just about details how to compactly write down these annotations.

The idea behind the patch was to have something that can also immediately be understood and useful for a human. But both suggestions good and would complement the current witness automata well I guess.

EDIT: Your suggestion would actually make it more easy to address the points mentioned by @PhilippWendler, namely that we want to ensure that the program semantics is not changed. Converting between a patch and the offset based variant you proposed would actually be really easy and fast, so we could also just pick one and provide a converter for the other, or allow both and have two converters between them. Then everyone could choose what they like best/ can implement the best.

For the record: One of the reasons for the current format was that it really expresses only the witness, and there is no way for it to (accidentally) contain changes to the actual program. Because of that for example it was rejected to let the verifier output a changed program that includes an invariant. With this format, we would still need to make sure somehow that only valid changes to the program are represented in the diff.

You are right, this has to be addressed, and I thought about this beforehand. It should be easy to check whether a patch changes the C syntax tree or not (some parsers will add comments to the syntax tree, but these problems can be overcome I guess). We are already introducing the witness linter so such a check could be provided there.

@mchalupa's propasal would also naturally be immune to this kind of problem.

Also, I am not sure whether it is really easy to generate? For example, we would probably have to dump out the program and then manually call diff. But this is inconvenient and would not work on systems where diff is not installed.

I think we are getting somewhere. There is no doubt that the proposal by @mchalupa would be easy to generate, you just need to determine the offset where the annotation should be put. Then having a converter into a diff can be left as homework for another tool that should not be too hard to create.

Even if there is not diff installed, this is such a common tasks that I am sure many programming languages will have some library that provides diff functionality. For Java I found https://github.com/java-diff-utils/java-diff-utils after a quick search for example.

@mchalupa proposal has the drawback that you need to keep track of the line numbers. Is line 18 a line in the original program? Should the annotation be inserted before or after that? What about multi-line annotations? I feel like the original proposal is easier to generate and to interpret (by humans as well as by tools).

But I like the file: ... part, because at some point, we want to have witnesses for multiple files.

@PhilippWendler You are right that we wanted to ensure that the original program is not changed. But it is easier to do that with this proposal than with GraphML because you can just apply the patch, diff again with the original file and ensure that there are only comments. It is actually much better because we can use already available tools and no one has to implement everything for themselves.

I would also argue that it is easier (or at least as easy) to generate. You already have the input file and every tool that is capable of generating Floyd-Hoare style invariants and contracts needs to track where they go and has probably an interest in showing them to the user. In contrast, I do not think that many tools are by themselves interested in showing the user their CFA.

@PhilippWendler You are right that we wanted to ensure that the original program is not changed. But it is easier to do that with this proposal than with GraphML because you can just apply the patch, diff again with the original file and ensure that there are only comments. It is actually much better because we can use already available tools and no one has to implement everything for themselves.

With the existing format one does not implement anything nor use an existing tool. The witness format is simply incapable of representing changes to the actual program. So while you might say that it can be still easy (though I find having to parse C in a diff and check that there are only comment changes can be quite tricky, imagine a tool inserting /* ... */ comments in an existing line), it cannot be easier than now.

The witness format is simply incapable of representing changes to the actual program

I think there is a misunderstanding. Of course the witnesses do not change the program, but the protocol automaton described by a correctness witness in the current format could actually remove parts of the state space. Such a witness would then violate the format specifications for a correctness witness, but not all tools might have strict checks for this. This is one reason for why we now have the witness linter (cf. PR #16 ).

though I find having to parse C in a diff

You would not need to parse the C tokens in the diff, nobody proposed that. In the end the solution with the diff is also just an alternative to specifying the annotations via offsets (most like Marek proposed), that one would definitely be easier than now!

it cannot be easier than now.

See comment above.

For Daniels comments:

But I like the file: ... part, because at some point, we want to have witnesses for multiple files.

I think so too. Btw patches are also possible for multiple files, so this is kind of natural there too.

@mchalupa proposal has the drawback that you need to keep track of the line numbers. Is line 18 a line in the original program?

I think that should be clear by now, but can be fixed easily by adding an offset.

What about multi-line annotations?

Should be doable also in Marek's proposal by just adding the block comments again as natural delimiter:

file: multivar_1-1.c
18-123: /*@ loop invariant x==y*/
19-456:
/*
assert x<=1024;
assert x>=0;
*/

123 and 456 are imaginary file offsets to prevent anyone from nitpicking about that detail again 😉.

The current plan would be to let @SvenUmbricht implement support for this (either patch based or location:annotation based as proposed by Marek) to CPAchecker as a demo and also converter functionality between the current format and this proposal. Maybe we could then try to have Frama-C as validator, would make a nice addition. He already implemented ACSL support so I do not think this will require too much work.

@PhilippWendler
Formally, the format is incapable in doing so. But from an implementation view its quite easy to introduce errors, in particular with partial witnesses.

@MartinSpiessl
Removing parts of the state space can only happen if you have an incorrect correctness witness.

Regarding diff vs. linenumber + offset: sorry, it is still not clear to me.

But I am all in favor of having someone implement a demo if you know how it can work! Using Frama-C as validator is also a nice idea.

I am also in favour of the idea having a (additional) simple format.

But I am also a little bit confues regarding the discussion "diff vs. linenumber + offset":
Is there any drawback of using a seperate file with the linenumber + offset, except that that using a single file is easier to read by users?

Is there any drawback of using a seperate file with the linenumber + offset, except that that using a single file is easier to read by users?

I don't get how your question related to the statement about what confuses you (the question does not talk about diffs at all any more). I will answer your question: The proposal is essentially a separate file just with a header for some metadata. If you want to separate metadata and the "payload" that would be possible, but of course it is nice to have it in one file. So there is no drawback from having a separate file except that well you then have two files instead of one (which you would also have to hand e.g. to the validator). In the end we can also have a zip-style format where all the witness information is in a zip with a MANIFEST containing the metadata and (potentially many) files describing verification conditions for input program(s).

The idea here for me was to have the simplest solution (Occam's razor). For achieving that goal, one file is better than many, unzipped is better than zipped. That being said conversion between a lot (if not all?) of the ideas brought forward here is super easy. I see response from many people that would welcome an easier format.

The effort then is two-fold: On the one side, validators should have some support for this. On the other side, verifiers should think about whether they could easily output this.

So if your verifier outputs correctness witnesses in one of the styles discussed here, translation to the other proposed variants is probably super easy. Same goes for the validators.

If this new format uses ACSL, it will have more expressiveness then the old format. So the conversion is not easy.

@MartinSpiessl : Thanks for the clarification.

If this new format uses ACSL, it will have more expressiveness then the old format. So the conversion is not easy.

Yes, I was talking about the conversion between the proposed ACSL-based formats amongst themselves, not between them and the current format, I should have made that clearer.

Side note: the current format could already be made to support ACSL expressions with a simple format extension (why didn't you do so years ago tbh? In CPAchecker there is even code that filters out ACSL expressions so I guess that was in the format at some point in the early days).

Yeah, Ultimate does the same ... I actually do not remember the whole reasoning, but one point was that it would be harder to accept for other tools that do not already support some parts of ACSL.