/ub-canaries

collection of C/C++ programs that try to get compilers to exploit undefined behavior

Primary LanguageCMIT LicenseMIT

-------------------------------------------------------------------------------

UB Canaries: A collection of C/C++ programs that detect undefined
behavior exploitation by compilers.

-------------------------------------------------------------------------------

Each directory documents and tests an expectation: something a developer might
-- perhaps unreasonably! -- expect the compiler to do when faced with undefined
behavior. For example, the "uninitialized-variable" directory tests the
expectation that an uninitialized scalar variable will consistently return some
value that is legal for the variable's type.

If any test program in a directory does not display the expected behavior (for a
given compiler version + flags), then that compiler is considered to violate
that expectation.

The toplevel script run.pl tests the expectations for a specified collection of
compilers and compiler flags. For now, you need to edit that file to change the
set of compilers and flags.

The output of run.pl is, for each compiler / flag / canary combination, a bit
that tells us whether that compiler has been observed to exploit that particular
UB (in other words, whether it has been observed violating the expectation). So
here, for example, clang has been seen to exploit signed integer overflows and
uninitialized variables, but not signed left shifts:

clang -O3 signed-left-shift 0
clang -O3 signed-integer-overflow 1
clang -O3 uninitialized-variable 1

-------------------------------------------------------------------------------

Guidelines for writing tests:

- Each test program should be entirely contained (except for standard header
  files) in a single compilation unit.

- An expectation should only be tested by looking at a program's stdout, never
  by looking at its assembly code or observing its memory usage or execution
  time.

- A test program foo.c may have one or more outputs corresponding to the
  "expected" case where the compiler does not exploit that UB. If there is one
  such file it should be called foo.output. If there are multiple files they
  should be called foo.output1, foo.output2, etc. If the actual output does not
  match any of these files, the compiler is assumed to have exploited the UB.

- Every test program must test only a single UB. In other words, each test
  program is written in a dialect of C that is completely standard except that a
  single behavior (signed integer overflow or whateever) is actually defined
  instead of undefined.

- Reliance on implementation-defined behavior is unavoidable, but please avoid
  gratutious reliance such as assuming a particular size for int or long.

-------------------------------------------------------------------------------