sosy-lab/sv-witnesses

Witness linter rejects ISO 8601 times with decimal components

RyanGlScott opened this issue · 3 comments

Given the following witness file:

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
  <key id="witness-type" attr.name="witness-type" attr.type="string" for="graph" />
  <key id="sourcecodelang" attr.name="sourcecodelang" attr.type="string" for="graph" />
  <key id="producer" attr.name="producer" attr.type="string" for="graph" />
  <key id="specification" attr.name="specification" attr.type="string" for="graph" />
  <key id="programfile" attr.name="programfile" attr.type="string" for="graph" />
  <key id="programhash" attr.name="programhash" attr.type="string" for="graph" />
  <key id="architecture" attr.name="architecture" attr.type="string" for="graph" />
  <key id="creationtime" attr.name="creationtime" attr.type="string" for="graph" />
  <key id="entry" attr.name="entry" attr.type="boolean" for="node">
    <default>false</default>
  </key>
  <key id="sink" attr.name="sink" attr.type="boolean" for="node">
    <default>false</default>
  </key>
  <key id="violation" attr.name="violation" attr.type="boolean" for="node">
    <default>false</default>
  </key>
  <key id="invariant" attr.name="invariant" attr.type="string" for="node" />
  <key id="invariant.scope" attr.name="invariant.scope" attr.type="string" for="node" />
  <key id="assumption" attr.name="assumption" attr.type="string" for="edge" />
  <key id="assumption.scope" attr.name="assumption.scope" attr.type="string" for="edge" />
  <key id="assumption.resultfunction" attr.name="assumption.resultfunction" attr.type="string" for="edge" />
  <key id="control" attr.name="control" attr.type="string" for="edge" />
  <key id="startline" attr.name="startline" attr.type="int" for="edge" />
  <key id="endline" attr.name="endline" attr.type="int" for="edge" />
  <key id="startoffset" attr.name="startoffset" attr.type="int" for="edge" />
  <key id="endoffset" attr.name="endoffset" attr.type="int" for="edge" />
  <key id="enterLoopHead" attr.name="enterLoopHead" attr.type="boolean" for="edge">
    <default>false</default>
  </key>
  <key id="enterFunction" attr.name="enterFunction" attr.type="string" for="edge" />
  <key id="returnFromFunction" attr.name="returnFromFunction" attr.type="string" for="edge" />
  <key id="threadId" attr.name="threadId" attr.type="string" for="edge" />
  <key id="createThread" attr.name="createThread" attr.type="string" for="edge" />
  <key id="startcolumn" attr.name="startcolumn" attr.type="int" for="edge" />
  <graph edgedefault="directed">
    <data key="witness-type">violation_witness</data>
    <data key="sourcecodelang">C</data>
    <data key="producer">crux-llvm-0.4</data>
    <data key="specification">CHECK( init(main()), LTL(G ! overflow) )</data>
    <data key="programfile">../sv-benchmarks/c/signedintegeroverflow-regression/AdditionIntMax.i</data>
    <data key="programhash">17515b65d583c44dcb7e235e0e9227ff5f3e328b2b111df737a68b768cd4ce6e</data>
    <data key="architecture">64bit</data>
    <data key="creationtime">2021-09-16T02:25:47.80950425+02:00</data>
    <node id="N0">
      <data key="entry">true</data>
    </node>
    <node id="N1" />
    <node id="N2">
      <data key="violation">true</data>
    </node>
    <edge source="N0" target="N1">
      <data key="assumption.scope">main</data>
    </edge>
    <edge source="N1" target="N2">
      <data key="assumption.scope">main</data>
      <data key="startline">332</data>
      <data key="startcolumn">22</data>
    </edge>
  </graph>
</graphml>

The witness linter will reject its creationtime:

./witnesslinter.py --ignoreSelfLoops --witness ../crux-llvm-svcomp-2021-09-15-Linux-x86_64/results/crux.2021-09-16_02-25-33.files/SV-COMP21_no-overflow/AdditionIntMax.yml/witness.graphml ../sv-benchmarks/c/signedintegeroverflow-regression/AdditionIntMax.i
  

--------------------------------------------------------------------------------


WARNING : line 46: Invalid format for creationtime
WARNING : Creationtime has not been specified

witnesslint finished with exit code 1

As far as I can tell, 2021-09-16T02:25:47.80950425+02:00 is a valid ISO 8601 time. I think the witness linter is getting confued by the decimal part of the seconds, judging from how it is implemented:

CREATIONTIME_PATTERN = r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}([+-]\d{2}:\d{2})?$"

This assumes that the seconds will never be followed by a decimal component. I believe this does not conform to Section 4.2.2.4 of ISO 8601, which says:

If necessary for a particular application a decimal fraction of hour, minute or second may be included. If a decimal fraction is included, lower order time elements (if any) shall be omitted and the decimal fraction shall be divided from the integer part by the decimal sign specified in ISO 31-0, i.e. the comma [,] or full stop [.]. Of these, the comma is the preferred sign. If the magnitude of the number is less than unity, the decimal sign shall be preceded by two zeros in accordance with 3.6.

Yes, we could consider allowing this. Currently the README.md which is the normative document for the witness format states:

The time must be given in hours, minutes, and seconds, separated by colons (':').

So it does not only specify that ISO 8601 shall be used, but in addition also the components the time stamp shall comprise of.

In case we extend this to allow decimals/cange the regex, we would also need to update the README.md to make this explicit

@MartinSpiessl Can we simply require ISO 8601, without anything in addition? Was human readability the only argument to require integer seconds?

I don't know what the argument for requiring integer seconds was, as this was already the case before I joined.

In practice I could imagine the decimal fractions of a second are not that important (nobody will manually re-execute a tool within a second, but someone might within a minute. In those cases the timestamp would potentially be needed to differentiate between the witnesses if they got mixed up).

A case could be made that it is easier to read in if the format is fixed, i.e., either always include the decimals or not. I am not sentimental about the current definition of the allowed time stamp values. But tools might break/refuse a witness if the time format is not to the (old) spec. One would need to test this with the current validators and tools that read in witnesses that are know to us(but that should be quick). In case nothing breaks it is more or less already safe to make a format extension that optionally allows the decimals. We could also just make the change now and discover what is broken for free during the competition 😉

Either way increasing the format version (to 1.1 or so) is probably a good idea then. A tool that can read version 1.0 is not designed to understand a witness generated with format version 1.0 is not guaranteed to understand a tool with format version 1.1, though it is always possible to write a witness in version 1.1 that is also compliant with version 1.0.