cjlano/tinyprintf

tests fail on Arch Linux 32-bit

Closed this issue · 3 comments

As mentionned with @DavidDecotigny in pull request #7 some tests fail with my setup with -O3 optimization (works fine without or with lower optimization level)

$ uname -a
Linux hpmini 3.15.8-1-ARCH #1 SMP PREEMPT Fri Aug 1 08:53:42 CEST 2014 i686 GNU/Linux
$ cat /proc/cpuinfo 
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 54
model name  : Intel(R) Atom(TM) CPU N2600   @ 1.60GHz
stepping    : 1
microcode   : 0x10c
cpu MHz     : 600.000
cache size  : 512 KB
physical id : 0
siblings    : 4
core id     : 0
cpu cores   : 2
apicid      : 0
initial apicid  : 0
fdiv_bug    : no
f00f_bug    : no
coma_bug    : no
fpu     : yes
fpu_exception   : yes
cpuid level : 10
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon pebs bts nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 xtpr pdcm movbe lahf_lm arat dtherm
bogomips    : 3193.94
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 32 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 54
model name  : Intel(R) Atom(TM) CPU N2600   @ 1.60GHz
stepping    : 1
microcode   : 0x10c
cpu MHz     : 600.000
cache size  : 512 KB
physical id : 0
siblings    : 4
core id     : 0
cpu cores   : 2
apicid      : 1
initial apicid  : 1
fdiv_bug    : no
f00f_bug    : no
coma_bug    : no
fpu     : yes
fpu_exception   : yes
cpuid level : 10
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon pebs bts nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 xtpr pdcm movbe lahf_lm arat dtherm
bogomips    : 3193.94
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 32 bits virtual
power management:

processor   : 2
vendor_id   : GenuineIntel
cpu family  : 6
model       : 54
model name  : Intel(R) Atom(TM) CPU N2600   @ 1.60GHz
stepping    : 1
microcode   : 0x10c
cpu MHz     : 600.000
cache size  : 512 KB
physical id : 0
siblings    : 4
core id     : 1
cpu cores   : 2
apicid      : 2
initial apicid  : 2
fdiv_bug    : no
f00f_bug    : no
coma_bug    : no
fpu     : yes
fpu_exception   : yes
cpuid level : 10
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm movbe lahf_lm arat dtherm
bogomips    : 3193.94
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 3
vendor_id   : GenuineIntel
cpu family  : 6
model       : 54
model name  : Intel(R) Atom(TM) CPU N2600   @ 1.60GHz
stepping    : 1
microcode   : 0x10c
cpu MHz     : 600.000
cache size  : 512 KB
physical id : 0
siblings    : 4
core id     : 1
cpu cores   : 2
apicid      : 3
initial apicid  : 3
fdiv_bug    : no
f00f_bug    : no
coma_bug    : no
fpu     : yes
fpu_exception   : yes
cpuid level : 10
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm movbe lahf_lm arat dtherm
bogomips    : 3193.94
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
$ git log -1 --oneline 
7523da0 let gcc warn when printf params types don't match format
$ make RUNNER=valgrind clean all
rm -f *.o test/*.o *~ test/*~ test/printf test/sprintf
cc -Wall -W -Wextra -O3 -g -I.  -c \
  -DTINYPRINTF_DEFINE_TFP_PRINTF=0 \
  -DTINYPRINTF_DEFINE_TFP_SPRINTF=0 \
  -DTINYPRINTF_OVERRIDE_LIBC=0 \
  -o tinyprintf_minimal.o tinyprintf.c
cc -Wall -W -Wextra -O3 -g -I.  -c \
  -DTINYPRINTF_DEFINE_TFP_PRINTF=1 \
  -DTINYPRINTF_DEFINE_TFP_SPRINTF=0 \
  -DTINYPRINTF_OVERRIDE_LIBC=0 \
  -o tinyprintf_only_tfp_printf.o tinyprintf.c
cc -Wall -W -Wextra -O3 -g -I.  -c \
  -DTINYPRINTF_DEFINE_TFP_PRINTF=0 \
  -DTINYPRINTF_DEFINE_TFP_SPRINTF=1 \
  -DTINYPRINTF_OVERRIDE_LIBC=0 \
  -o tinyprintf_only_tfp_sprintf.o tinyprintf.c
cc -Wall -W -Wextra -O3 -g -I.  -c -o test/printf.o test/printf.c
test/printf.c: In function ‘stdout_putf’:
test/printf.c:15:31: warning: unused parameter ‘unused’ [-Wunused-parameter]
 static void stdout_putf(void *unused, char c)
                               ^
test/printf.c: In function ‘main’:
test/printf.c:39:3: warning: left shift count >= width of type
   TPRINTF("%llu\n", (1ULL << 64) - 1);
   ^
test/printf.c:39:3: warning: left shift count >= width of type
test/printf.c:41:3: warning: left shift count >= width of type
   TPRINTF("%llx\n", (1ULL << 64) - 1);
   ^
test/printf.c:41:3: warning: left shift count >= width of type
cc -Wall -W -Wextra -O3 -g -I.  -c -o tinyprintf.o tinyprintf.c
cc -Wall -W -Wextra -O3 -g -I.   -o test/printf test/printf.o tinyprintf.o
cc -Wall -W -Wextra -O3 -g -I.  -c -o test/sprintf.o test/sprintf.c
test/sprintf.c: In function ‘main’:
test/sprintf.c:54:3: warning: left shift count >= width of type
   TPRINTF("%llu", (1ULL << 64) - 1);
   ^
test/sprintf.c:54:3: warning: left shift count >= width of type
test/sprintf.c:54:3: warning: left shift count >= width of type
test/sprintf.c:54:3: warning: left shift count >= width of type
test/sprintf.c:54:3: warning: left shift count >= width of type
test/sprintf.c:54:3: warning: left shift count >= width of type
test/sprintf.c:56:3: warning: left shift count >= width of type
   TPRINTF("%llx", (1ULL << 64) - 1);
   ^
test/sprintf.c:56:3: warning: left shift count >= width of type
test/sprintf.c:56:3: warning: left shift count >= width of type
test/sprintf.c:56:3: warning: left shift count >= width of type
test/sprintf.c:56:3: warning: left shift count >= width of type
test/sprintf.c:56:3: warning: left shift count >= width of type
cc -Wall -W -Wextra -O3 -g -I.   -o test/sprintf test/sprintf.o tinyprintf.o
set -x ; for prg in test/printf test/sprintf ; do valgrind $prg || exit $? ; done
+ for prg in test/printf test/sprintf
+ valgrind test/printf
==31026== Memcheck, a memory error detector
==31026== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==31026== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==31026== Command: test/printf
==31026== 
==31026== Warning: client switching stacks?  SP change: 0xbeee4880 --> 0x4022f9c
==31026==          to suppress, use: --max-stackframe=1158932252 or greater
==31026== Warning: client switching stacks?  SP change: 0x4022f9c --> 0xbeee4880
==31026==          to suppress, use: --max-stackframe=1158932252 or greater
Fun with printf and %!
libc_printf("d1=%016llx d2=%016lx d3=%02x d4=%02X 42=%03d\n", (long long unsigned)0xd1, (long unsigned)0xd2, 0xd3, 0xd4, 42) -> d1=00000000000000d1 d2=00000000000000d2 d3=d3 d4=D4 42=042
 tfp_printf("d1=%016llx d2=%016lx d3=%02x d4=%02X 42=%03d\n", (long long unsigned)0xd1, (long unsigned)0xd2, 0xd3, 0xd4, 42) -> d1=00000000000000d1 d2=00000000000000d2 d3=d3 d4=D4 42=042
libc_printf("d1=%04x d2=%06x d3=%08x %%100\n", 0xd1, 0xd2, 0xd3) -> d1=00d1 d2=0000d2 d3=000000d3 %100
 tfp_printf("d1=%04x d2=%06x d3=%08x %%100\n", 0xd1, 0xd2, 0xd3) -> d1=00d1 d2=0000d2 d3=000000d3 %100
libc_printf("|%-14s| |%-16s| d2=%2x |%-30s|\n", "str14", "str16", 0xd2, "12345678901234567890123456789012345") -> |str14         | |str16           | d2=d2 |12345678901234567890123456789012345|
 tfp_printf("|%-14s| |%-16s| d2=%2x |%-30s|\n", "str14", "str16", 0xd2, "12345678901234567890123456789012345") -> |str14         | |str16           | d2=d2 |12345678901234567890123456789012345|
libc_printf("|%4s|\n", "string4") -> |string4|
 tfp_printf("|%4s|\n", "string4") -> |string4|
libc_printf("|%-4s|\n", "string4") -> |string4|
 tfp_printf("|%-4s|\n", "string4") -> |string4|
libc_printf("42=%3d d1=%4.4x |%4s| d2=%8.8x\n", 42, 0xd1, "string4", 0xd2) -> 42= 42 d1=00d1 |string4| d2=000000d2
 tfp_printf("42=%3d d1=%4.4x |%4s| d2=%8.8x\n", 42, 0xd1, "string4", 0xd2) -> 42= 42 d1=00d1 |string4| d2=000000d2
libc_printf("42=%3d d1=%4.4x |%-4s| d2=%8.8x\n", 42, 0xd1, "string4", 0xd2) -> 42= 42 d1=00d1 |string4| d2=000000d2
 tfp_printf("42=%3d d1=%4.4x |%-4s| d2=%8.8x\n", 42, 0xd1, "string4", 0xd2) -> 42= 42 d1=00d1 |string4| d2=000000d2
libc_printf("84=%d 21=%ds |%s| |%sOK| d1=%x d2=%#x\n", 84, 21, "hello", "fine", 0xd1, 0xd2) -> 84=84 21=21s |hello| |fineOK| d1=d1 d2=0xd2
 tfp_printf("84=%d 21=%ds |%s| |%sOK| d1=%x d2=%#x\n", 84, 21, "hello", "fine", 0xd1, 0xd2) -> 84=84 21=21s |hello| |fineOK| d1=d1 d2=0xd2
libc_printf("%lld\n", -(1ULL << 63)) -> -9223372036854775808
 tfp_printf("%lld\n", -(1ULL << 63)) -> -0
libc_printf("%llu\n", (1ULL << 64) - 1) -> 18446744073709551615
 tfp_printf("%llu\n", (1ULL << 64) - 1) -> 18446744073709551615
libc_printf("%llx\n", -(1ULL << 63)) -> 8000000000000000
 tfp_printf("%llx\n", -(1ULL << 63)) -> 8000000000000000
libc_printf("%llx\n", (1ULL << 64) - 1) -> ffffffffffffffff
 tfp_printf("%llx\n", (1ULL << 64) - 1) -> ffffffffffffffff
libc_printf("d1=%.1x\n", 0xd1) -> d1=d1
 tfp_printf("d1=%.1x\n", 0xd1) -> d1=d1
libc_printf("d1=%4.1x\n", 0xd1) -> d1=  d1
 tfp_printf("d1=%4.1x\n", 0xd1) -> d1=00d1
libc_printf("d1=%4.x\n", 0xd1) -> d1=  d1
 tfp_printf("d1=%4.x\n", 0xd1) -> d1=00d1
libc_printf("a=%zd\n", sizeof(blah)) -> a=256
 tfp_printf("a=%zd\n", sizeof(blah)) -> a=256
libc_printf("a=%zu\n", sizeof(blah)) -> a=256
 tfp_printf("a=%zu\n", sizeof(blah)) -> a=256
libc_printf("a=%zi\n", sizeof(blah)) -> a=256
 tfp_printf("a=%zi\n", sizeof(blah)) -> a=256
libc_printf("a=0x%zx\n", sizeof(blah)) -> a=0x100
 tfp_printf("a=0x%zx\n", sizeof(blah)) -> a=0x100
libc_printf("Adddress of main: %p\n", main) -> Adddress of main: 0x8048350
 tfp_printf("Adddress of main: %p\n", main) -> Adddress of main: 0x8048350
libc_printf("Adddress of stack variable: %p\n", &in_stack) -> Adddress of stack variable: 0xbeee4a1c
 tfp_printf("Adddress of stack variable: %p\n", &in_stack) -> Adddress of stack variable: 0xbeee4a1c
==31026== 
==31026== HEAP SUMMARY:
==31026==     in use at exit: 0 bytes in 0 blocks
==31026==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==31026== 
==31026== All heap blocks were freed -- no leaks are possible
==31026== 
==31026== For counts of detected and suppressed errors, rerun with: -v
==31026== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
+ for prg in test/printf test/sprintf
+ valgrind test/sprintf
==31027== Memcheck, a memory error detector
==31027== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==31027== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==31027== Command: test/sprintf
==31027== 
==31027== Warning: client switching stacks?  SP change: 0xbeb46880 --> 0x4022f9c
==31027==          to suppress, use: --max-stackframe=1162725148 or greater
==31027== Warning: client switching stacks?  SP change: 0x4022f9c --> 0xbeb46880
==31027==          to suppress, use: --max-stackframe=1162725148 or greater
Fun with sprintf and %!
libc_snprintf("d1=%016llx d2=%016lx d3=%02x d4=%02X 42=%03d", (long long unsigned)0xd1, (long unsigned)0xd2, 0xd3, 0xd4, 42) -> 58: d1=00000000000000d1 d2=00000000000000d2 d3=d3 d4=D4 42=042
 tfp_snprintf("d1=%016llx d2=%016lx d3=%02x d4=%02X 42=%03d", (long long unsigned)0xd1, (long unsigned)0xd2, 0xd3, 0xd4, 42) -> 58: d1=00000000000000d1 d2=00000000000000d2 d3=d3 d4=D4 42=042
libc_snprintf("d1=%04x d2=%06x d3=%08x %%100", 0xd1, 0xd2, 0xd3) -> 34: d1=00d1 d2=0000d2 d3=000000d3 %100
 tfp_snprintf("d1=%04x d2=%06x d3=%08x %%100", 0xd1, 0xd2, 0xd3) -> 34: d1=00d1 d2=0000d2 d3=000000d3 %100
libc_snprintf("|%-14s| |%-16s| d2=%2x |%-30s|", "str14", "str16", 0xd2, "12345678901234567890123456789012345") -> 79: |str14         | |str16           | d2=d2 |12345678901234567890123456789012345|
 tfp_snprintf("|%-14s| |%-16s| d2=%2x |%-30s|", "str14", "str16", 0xd2, "12345678901234567890123456789012345") -> 79: |str14         | |str16           | d2=d2 |12345678901234567890123456789012345|
libc_snprintf("|%4s|", "string4") -> 9: |string4|
 tfp_snprintf("|%4s|", "string4") -> 9: |string4|
libc_snprintf("|%-4s|", "string4") -> 9: |string4|
 tfp_snprintf("|%-4s|", "string4") -> 9: |string4|
libc_snprintf("42=%3d d1=%4.4x |%4s| d2=%8.8x", 42, 0xd1, "string4", 0xd2) -> 36: 42= 42 d1=00d1 |string4| d2=000000d2
 tfp_snprintf("42=%3d d1=%4.4x |%4s| d2=%8.8x", 42, 0xd1, "string4", 0xd2) -> 36: 42= 42 d1=00d1 |string4| d2=000000d2
libc_snprintf("42=%3d d1=%4.4x |%-4s| d2=%8.8x", 42, 0xd1, "string4", 0xd2) -> 36: 42= 42 d1=00d1 |string4| d2=000000d2
 tfp_snprintf("42=%3d d1=%4.4x |%-4s| d2=%8.8x", 42, 0xd1, "string4", 0xd2) -> 36: 42= 42 d1=00d1 |string4| d2=000000d2
libc_snprintf("84=%d 21=%ds |%s| |%sOK| d1=%x d2=%#x", 84, 21, "hello", "fine", 0xd1, 0xd2) -> 43: 84=84 21=21s |hello| |fineOK| d1=d1 d2=0xd2
 tfp_snprintf("84=%d 21=%ds |%s| |%sOK| d1=%x d2=%#x", 84, 21, "hello", "fine", 0xd1, 0xd2) -> 43: 84=84 21=21s |hello| |fineOK| d1=d1 d2=0xd2
libc_snprintf("%lld", -(1ULL << 63)) -> 20: -9223372036854775808
 tfp_snprintf("%lld", -(1ULL << 63)) -> 2: -0
sprintf: test/sprintf.c:53: main: Assertion `sz1 == sz2' failed.
==31027== 
==31027== HEAP SUMMARY:
==31027==     in use at exit: 2,048 bytes in 2 blocks
==31027==   total heap usage: 52 allocs, 50 frees, 18,870 bytes allocated
==31027== 
==31027== LEAK SUMMARY:
==31027==    definitely lost: 0 bytes in 0 blocks
==31027==    indirectly lost: 0 bytes in 0 blocks
==31027==      possibly lost: 0 bytes in 0 blocks
==31027==    still reachable: 2,048 bytes in 2 blocks
==31027==         suppressed: 0 bytes in 0 blocks
==31027== Rerun with --leak-check=full to see details of leaked memory
==31027== 
==31027== For counts of detected and suppressed errors, rerun with: -v
==31027== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
/bin/sh: line 1: 31027 Aborted                 valgrind $prg
+ exit 134
Makefile:13: recipe for target 'runtests' failed
make: *** [runtests] Error 134
$ file test/*printf 
test/printf:  ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=4c13777bc870c07092c9f420051e002bea8824ec, not stripped
test/sprintf: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=0253abdd506a917571d9ea39f15a34c7800ecfd3, not stripped

I finally got a gcc 4.9 working, only 64b unfortunately. But I confirm there is a problem here too. For my tests, this is the following in ulli2a which doesn't work with -O3 and num = ULLONG_MAX:
while (num / d >= p->base)
d *= p->base;
It's as if gcc was thinking that this is negative (ie. demotion from unsigned to signed), and hence < p->base, so we end up with d == 1 instead of 100...000.

I tried various casts, gcc sticks to its "optimizations".

As a bad hack, replacing the while with:
while ( ((num / d) >= p->base) && (num >= d*p->base) )
"fixes" the issue for the extreme cases.

But IMHO this is not the correct fix, as it could have several corner cases.

The assembly suggests that gcc is inlining these functions and specializing them of base 10, 16, etc. This might cause our problems.

Will need to work on this more. For now, this looks like a nasty issue with signed/unsigned handling by gcc. I am wondering if there is a bug opened on gcc for this.

Could you please confirm that d == 1 at the end of the 1st while with %llu ULLONG_MAX?

could you please take a log at pull request #9 and let me know if this fixes the issue for you? If it does, please merge.

Fixed by #9
Great work! Thanks!