arith64

__udivdi3(), __divdi3(), __umoddi3(), __moddi3(), etc. for embedded GCC

Primary language: C. License: The Unlicense.

32- and 64-bit arithmetic functions for use with 32-bit GCC.

Some versions of 32-bit GCC may emit calls to external helper functions to
perform certain 32- and 64-bit operations. Normally these functions are
resolved by libgcc.a which is statically linked to the program.

But libgcc may not be usable in some applications, e.g. embedded systems and
Linux kernel drivers. In those cases you'll get a linker error such as:

    undefined reference to `__divdi3'
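
As an illustration (not part of arith64), functions like the ones below are
typical triggers. On a 32-bit target such as i386, GCC usually compiles each
64-bit division or remainder into a call to the helper named in the comment,
though the exact behaviour depends on the target and the optimization options.
The function names here are hypothetical.

```c
#include <stdint.h>

/* Each body is a single 64-bit operation. A 32-bit GCC target without a
   native 64-bit divide may lower it to the libgcc call noted alongside. */
int64_t  quot_s64(int64_t a, int64_t b)   { return a / b; } /* __divdi3  */
int64_t  rem_s64(int64_t a, int64_t b)    { return a % b; } /* __moddi3  */
uint64_t quot_u64(uint64_t a, uint64_t b) { return a / b; } /* __udivdi3 */
uint64_t rem_u64(uint64_t a, uint64_t b)  { return a % b; } /* __umoddi3 */
```

Linking such code without libgcc (for example with -nostdlib) is what produces
the undefined reference.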

The solution is to link arith64.c to your code, or just copy the required
functions.
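
To give a sense of what a replacement helper involves, here is a minimal sketch
of unsigned 64-bit division by shift-and-subtract. It is not copied from
arith64.c and may differ from what that file does. The name sketch_udivmod64
and the choice to return 0 on division by zero are assumptions made for this
example.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: returns a / b and, if rem is not NULL, stores a % b.
   Uses only shifts by one, comparisons and subtraction, so it does not
   depend on a 64-bit divide instruction. */
uint64_t sketch_udivmod64(uint64_t a, uint64_t b, uint64_t *rem)
{
    uint64_t q = 0, r = 0;

    if (b == 0) {               /* assumption: report 0 rather than trap */
        if (rem != NULL) *rem = 0;
        return 0;
    }
    for (int i = 0; i < 64; i++) {
        r = (r << 1) | (a >> 63);   /* bring down the next dividend bit */
        a <<= 1;
        q <<= 1;
        if (r >= b) {               /* divisor fits: subtract, set bit  */
            r -= b;
            q |= 1;
        }
    }
    if (rem != NULL) *rem = r;
    return q;
}
```

The loop always runs 64 times, which is simple but slow. A production version
would normally skip leading zero bits first.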

Also see https://gcc.gnu.org/onlinedocs/gccint/Integer-library-routines.html.

Note that not all operations have been implemented, just the ones I've
required to date.

====

'make' performs Monte-Carlo validation testing. Two test executables are
created, one linked to arith64.c and one to libgcc. test.py launches both in
the background and then passes identical sets of random numbers to each,
comparing their outputs. On a mismatch it reports an error and exits.
Otherwise it runs forever, intermittently printing the total number of tests
performed.
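
The same idea can be sketched in a single C program. This is a simplification:
the real harness compares two separate executables through test.py, whereas
the sketch below compares a hypothetical candidate (sketch_popcount64) against
the GCC builtin __builtin_popcountll in one process. The xorshift64 generator
and the seed are also just choices made for the example.

```c
#include <stdint.h>

/* Hypothetical candidate under test: clears the lowest set bit each pass. */
int sketch_popcount64(uint64_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* xorshift64 pseudo-random generator; the state must never be zero. */
static uint64_t next_random(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Feed identical random inputs to the candidate and to the reference,
   and return how many results disagreed. */
uint64_t count_mismatches(uint64_t trials)
{
    uint64_t state = 0x9e3779b97f4a7c15ull;
    uint64_t bad = 0;

    while (trials--) {
        uint64_t v = next_random(&state);
        if (sketch_popcount64(v) != __builtin_popcountll(v))
            bad++;
    }
    return bad;
}
```

A non-zero return from count_mismatches() corresponds to the "reports an error
and exits" case.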

'make bench' starts the test executables in benchmark mode, in which they
perform each operation one million times and print the average elapsed time in
nanoseconds. test.py then collates the returned information and prints a
running average every 5 seconds, e.g.:

   286359M :  arith64      gcc
       abs :     2.74     2.68
      ashl :     2.49     2.69
      ashr :     2.87     2.74
      clzd :     2.95     1.63
      clzs :     2.60     1.33
      ctsz :     3.50     1.42
      ctzd :     3.57     1.65
       div :    16.19     8.11
       ffs :     2.69     1.64
       mod :    15.49     8.32
      popd :     6.31     3.59
      pops :     2.43     2.15
       shr :     2.39     2.69
      udiv :    13.04     5.64
      umod :    11.94     6.25

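This shows that the gcc native functions, presumably written in assembly
language and leveraging CPU-specific arithmetic operators, are roughly twice
as fast as the arith64 equivalents for division, modulo and the bit-counting
operations, while abs and the shifts perform about the same.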
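
For reference, a per-operation average like the ones in the table could be
measured along these lines. This is a rough sketch, not the repository's
benchmark code: bench_ns and op_udiv are hypothetical names, and timing with
the standard clock() function (CPU time) is an assumption, since the timer the
test executables actually use is not described here.

```c
#include <stdint.h>
#include <time.h>

#define ITERATIONS 1000000u   /* one million calls per measurement */

/* Example operation to time: plain 64-bit unsigned division. */
uint64_t op_udiv(uint64_t a, uint64_t b)
{
    return a / b;
}

/* Returns the average time per call of op, in nanoseconds. */
double bench_ns(uint64_t (*op)(uint64_t, uint64_t))
{
    volatile uint64_t sink = 0;   /* keeps the calls from being optimized out */
    uint64_t a = 0x123456789abcdef0ull;
    uint64_t b = 3;               /* b + i is never zero in the loop below */

    clock_t start = clock();
    for (uint32_t i = 0; i < ITERATIONS; i++)
        sink += op(a + i, b + i);
    clock_t end = clock();

    (void)sink;
    return (double)(end - start) * 1e9 / CLOCKS_PER_SEC / ITERATIONS;
}
```

The result includes loop and call overhead, so it is best read as a relative
figure when comparing two implementations of the same operation.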