sosy-lab/sv-benchmarks

Use unpreprocessed files in task definitions

MartinSpiessl opened this issue · 2 comments

This is part 2 of two closely related issues, the first one (#1123) building the base for implementing this one.

We could try to allow for unpreprocessed program files to be used in task definition files and expect the tools to preprocess the files according to the headers that we would specify in the SV-COMP rules. The necessary prequisites for this would be established by #1123. The overhead in execution time arising from the preprocessing should be negligible.

The yml files would of course also have to be adapted to refer to the .c files instead of the .i files. They would also be somewhat underspecified since the headers are only fixed by our convention. We could however also add the headers to the task definition,the format even allows wildcards such that we can add a while folder with headers with just one include/*.h

The tools might then naturally be extended with a --includedir option where the user can specify the headers to be used for preprocessing. This is a cool feature in itself because currently the verifiers that can already preprocessed C files probably just take the system default headers, which might actually not be the ones that will be used for the actual compilation of the program (I am thinking about embedded systems as an example here). This preprocessor step could of course also just be done by a small tool that is provided by us and wrapped around the actual tool execution.

I would say the dependency between your two issues is different: if we use the unpreprocessed sources in tasks, then we do not need to care about preprocessing and can remove the preprocessed sources, so #1123 would be irrelevant.

In the SV-COMP jury meeting on 2018-04-19 one team asked whether they could participate using the unpreprocessed sources. The response to that was that in principle we could just give all the tools the unpreprocessed sources, tools can preprocess them themselves if necessary. There was general agreement on this topic and it was mentioned that this had been voted on in the past. I believe #765 and related where a part of the subsequent effort to get preprocessed files and sources to match.

Note that it is clear that preprocessed source would not go away completely, because there are many tasks that are available only in preprocessed form and where there is no chance to change this (e.g., LDV tasks).

In the meeting it was also claimed by @dbeyer that there is still infrastructure work necessary in order to be able to do this, but I guess this has been solved by now? (Probably the task-definition files were meant, though I am not sure.)

As a first step it was mentioned in the meeting to remove the preprocessed source for those tasks where the unpreprocessed sources exist and match. I would extend this with the restriction to do it for sources with a single C file that include only system headers first, and then later extend this to more complex sources.

Furthermore, we might want to add a flag to the task definition that specifies whether the file is already preprocessed or whether preprocessing is necessary.

For something like --includedir instead of using system headers, please keep this separate. I suspect that this is a much more complex topic. AFAIK on a standard Linux system there is no single directory with the "system headers", but these are spread over several directories.

For headers in task-definition files, system headers would of course not be listed but assumed to be installed. Locally present headers that are included by the main file also should not be added to input_files because otherwise tools would get these headers as additional input files on the command line, which would confuse them. But such headers should be added to required_files, which is a key that is not listed in the file-format description in this repo, but which is actually supported by BenchExec for these files (definition) and adding the headers there is necessary for the competition infrastructure.