zstd
Opened this issue · 27 comments
- https://github.com/facebook/zstd/ at 7543085013db1a20a848d166e5931edc49e3cc2f
- `compiledb` succeeds
- Fails as Goblint lacks support for some Intel SIMD things defined in `xmmintrin.h`:
/usr/lib/gcc/x86_64-linux-gnu/7/include/xmmintrin.h:120: Unimplemented: doInit: unexpected NEXT_INIT for float __attribute__((__vector_size__(16),__may_alias__))
- Ignore intrinsics and build with `make zstd`:

```diff
diff --git a/programs/Makefile b/programs/Makefile
index 16763e49..8ff545b8 100644
--- a/programs/Makefile
+++ b/programs/Makefile
@@ -138,7 +138,7 @@ allVariants: zstd zstd-compress zstd-decompress zstd-small zstd-frugal zstd-nole
 zstd : CPPFLAGS += $(THREAD_CPP) $(ZLIBCPP) $(LZMACPP) $(LZ4CPP)
 zstd : LDFLAGS += $(THREAD_LD) $(DEBUGFLAGS_LD)
 zstd : LDLIBS += $(ZLIBLD) $(LZMALD) $(LZ4LD)
-zstd : CPPFLAGS += -DZSTD_LEGACY_SUPPORT=$(ZSTD_LEGACY_SUPPORT)
+zstd : CPPFLAGS += -DZSTD_LEGACY_SUPPORT=$(ZSTD_LEGACY_SUPPORT) -DZSTD_NO_INTRINSICS
 ifneq (,$(filter Windows%,$(OS)))
 zstd : $(RES_FILE)
 endif
```
- Need to comment out the contents of Goblint's `includes/assert.h` for the parsing to succeed?
time ./goblint ../../bench-repos/zstd --disable ana.base.context.non-ptr --disable sem.unknown_function.spawn --set ana.thread.domain plain --enable exp.earlyglobs --set ana.base.privatization none --enable dbg.print_dead_code --enable justcil --disable include_stdlib --set cppflags[+] -DZSTD_NO_INTRINSICS -v &> zstd.cil.c
real 0m5,836s
user 0m5,495s
sys 0m0,308s
So parsing and everything is also non-horrible here. The program does contain threads.
With ./goblint zstd.cil.c --disable ana.base.context.non-ptr --disable sem.unknown_function.spawn --set ana.thread.domain plain --enable exp.earlyglobs --set ana.base.privatization none --enable dbg.print_dead_code &> zstd.cil.out -v
Summary for all memory locations:
safe: 2388
vulnerable: 14
unsafe: 16
-------------------
total: 2418
vars = 170735 evals = 432968
Timings:
TOTAL 224.473 s
parse 0.544 s
convert to CIL 0.694 s
analysis 223.237 s
global_inits 1.085 s
solving 221.102 s
S.Dom.equal 0.712 s
postsolver 52.939 s
warn_global 0.131 s
Timing used
Memory statistics: total=296341.32MB, max=934.02MB, minor=296304.66MB, major=2087.03MB, promoted=2050.37MB
minor collections=141306 major collections=28 compactions=0
Found dead code on 3427 lines (including 3087 in uncalled functions)!
Total lines (logical LoC): 21415
Ah, and it requires goblint/analyzer#557 again
It works non-incrementally, but when I try to run it (on master) incrementally (without any changes to the code) I get
Fatal error: exception Invalid_argument("List.iter2")
Raised at Stdlib.invalid_arg in file "stdlib.ml", line 30, characters 20-45
Called from UpdateCil.update_ids.reset_fun in file "src/incremental/updateCil.ml", line 46, characters 4-70
Called from Stdlib__list.iter in file "list.ml", line 110, characters 12-15
Called from UpdateCil.update_ids in file "src/incremental/updateCil.ml", line 126, characters 2-43
Called from Maingoblint.diff_and_rename in file "src/maingoblint.ml", line 457, characters 22-92
Called from Maingoblint.main in file "src/maingoblint.ml", line 509, characters 110-130
Called from Stdlib.at_exit.new_exit in file "stdlib.ml", line 554, characters 59-63
Called from Stdlib.do_at_exit in file "stdlib.ml" (inlined), line 560, characters 20-61
Called from Std_exit in file "std_exit.ml", line 18, characters 8-20
@stilscher said she'd take a look. May be related to goblint/analyzer#574
The stacktrace indeed looks very similar to the issue I accidentally happened upon: goblint/analyzer#542 (comment). Although it actually doesn't stem from a differing number of statements, but rather of locals (`List.iter2` raises `Invalid_argument` when the two lists have different lengths):
List.iter2 (fun l o_l -> l.vid <- o_l.vid) f.slocals old_f.slocals;
We managed to get the incremental analysis to work with the additional option `--set cppflags[+] -D_FORTIFY_SOURCE=0`.
The problem was multiple functions with the name `realpath` in the CIL file. The comparison and updating of ids for the incremental analysis assumes that the functions in the CIL file have unique names. The `realpath` definition can be found in the `/usr/include/x86_64-linux-gnu/bits/stdlib.h` header, which is only included when the fortify level is greater than 0. We have not looked into why multiple `realpath` functions exist in the CIL file. It might be related to the merging errors I found in #8. However, analyzing with the above-mentioned flag could be a reasonable approach for now.
But there are also duplicate global variables in the CIL file that one needs to look at.
A not too rigorous configuration that worked well for an analysis (took 5-6min for the non-incremental run) is:
./goblint ../test-repos/zstd/ --disable sem.unknown_function.spawn --enable exp.earlyglobs --set ana.base.privatization none --enable incremental.save --enable dbg.print_dead_code --set cppflags[+] -DZSTD_NO_INTRINSICS --set cppflags[+] -D_FORTIFY_SOURCE=0 &> zstd.cil.out -v
Additionally adding intervals led to a steadily increasing number of contexts and I aborted after ~30min.
In the incremental run (without any changes) a large amount of time was spent on the postsolving. So it would be interesting to see how long the incremental analysis (without and with changes) takes when including the incremental postsolving from @sim642 that is currently on the interactive branch.
You might also want to use `-DGOBLINT_NO_BSEARCH` to remove some merging conflicts.
If we want to push the runtime a bit higher, we could also try:
- enabling more int domains (e.g. congruence or enums)
- enabling intervals while not including them in contexts
Also, instead of deleting the contents of `assert.h`, it suffices to set `-DGOBLINT_NO_ASSERT` as an additional preprocessor flag.
After bumping goblint-cil, we now have zero CIL warnings when working with this repo 💪
For the incremental run on zstd, even on a program without any changes, the solving was reported to take around 3 seconds out of ~8-12 seconds total, which is noteworthy because ideally the solving should take ~0 seconds for unchanged code.
Yesterday @stilscher, @michael-schwarz and I looked further into this. It turns out that most of this incremental solving time, over 2.7 seconds, was spent on the hashcons-relifting. If one disables hashconsing, the overall runtime for a non-incremental run is somewhat degraded, but not too much (roughly from 131 seconds to 138 seconds). There was a significant effect on memory usage though, IIRC. @stilscher has the precise numbers.
There is no easy fix to get rid of the relifting in the non-server incremental mode, as weak hashtables used for hashconsing cannot be marshaled (at least not directly). For the server mode, one could probably get away with skipping the relifting in the solver, but having to run experiments in server mode might make things more complicated.
One might just prefer to use the incremental analysis without hash-consing.
Here are the numbers that we collected for a non-incremental run with hashconsing:
TOTAL 131.208 s
parse 0.497 s
convert to CIL 0.683 s
analysis 130.029 s
createCFG 0.502 s
global_inits 0.671 s
solving 126.195 s
vs 0.001 s
Sol'.solve 125.932 s
relift 0.000 s
S.Dom.equal 0.414 s
postsolver 29.456 s
split_solution 0.263 s
warn_global 0.063 s
access 0.050 s
Timing used
Memory statistics: total=497409.75MB, max=1878.66MB, minor=497216.06MB, major=3929.75MB, promoted=3736.05MB
minor collections=237121 major collections=28 compactions=0
and without hashconsing
TOTAL 138.896 s
parse 0.535 s
convert to CIL 0.698 s
analysis 137.664 s
createCFG 0.499 s
global_inits 0.076 s
solving 129.479 s
vs 0.001 s
Sol'.solve 128.823 s
relift 0.000 s
S.Dom.equal 4.364 s
postsolver 41.818 s
split_solution 0.657 s
warn_global 0.086 s
access 0.068 s
Timing used
Memory statistics: total=486846.38MB, max=2484.53MB, minor=486616.93MB, major=4642.06MB, promoted=4412.62MB
minor collections=232065 major collections=28 compactions=0
I tried adding `"ZSTD_customMalloc","ZSTD_customCalloc","POOL_create"` as malloc wrappers in the hope of getting the number of race warnings down, but that did not help.
If merging of inlines is enabled again, aren't all the "is used for two distinct globals" warnings just due to the silly way the zstd Makefile uses relative paths? If you read the warnings, they're all about `mem.h`, but referenced in different ways:
../lib/common/mem.h
../lib/legacy/../common/mem.h
../lib/dictBuilder/../common/mem.h
../lib/decompress/../common/mem.h
../lib/compress/../common/mem.h
Yesterday I looked into the data races Goblint finds in zstd. Besides the valid data race on `threadLimit` pointed out here, there are two kinds of spurious races we still find if we additionally activate symb_locks (but not var_eq!):
- In `POOL_create_advanced` there are accesses to a freshly allocated struct before the threads are created in a loop. A freshness/thread-locality/escape analysis should hopefully be able to rule these out.
- Accesses in functions like `POOL_add_internal`, which are called from functions that always call them with a lock held (`POOL_add`). Ideally we'd have a symbolic lock held there, but var_eq is required to preserve the symbolic lock from the outer function, as the pointers get passed in via arguments to the inner function.
Unfortunately, also enabling var_eq causes the analysis to not terminate (aborted after 40 minutes):
runtime: 00:40:23.881
vars: 2497599, evals: 3116757
max updates: 32 for var node 4469 "ret = (FIO_prefs_t *)((FIO_prefs_t *)tmp);" on programs/fileio.c:229:5-229:71
|rho|=2497599
|called|=325
|stable|=2465598
|infl|=2497070
|wpoint|=232
|side_dep|=0
|side_infl|=0
Found 260457 contexts for 1193 functions. Top 5 functions:
24901 contexts for entry state of MEM_readST___17 (172569) on lib/common/mem.h:212:1-212:115
18684 contexts for entry state of MEM_isLittleEndian___18 (172565) on lib/common/mem.h:158:1-176:1
18583 contexts for entry state of MEM_read32___18 (172567) on lib/common/mem.h:210:1-210:110
17048 contexts for entry state of MEM_64bits___18 (172564) on lib/common/mem.h:156:1-156:95
11366 contexts for entry state of ZSTD_NbCommonBytes___4 (172783) on lib/compress/zstd_compress_internal.h:698:1-792:1
Memory statistics: total=5314342.41MB, max=7600.24MB, minor=5313863.33MB, major=40634.39MB, promoted=40155.32MB
minor collections=2533967 major collections=70 compactions=0
With just symb_locks, it terminates in just 5 minutes:
runtime: 00:04:56.073
vars: 258037, evals: 797082
max updates: 57 for var node 2417 "flNb ++;" on programs/zstdcli.c:1249:9-1256:9
|rho|=258037
|called|=124
|stable|=236877
|infl|=257600
|wpoint|=1858
|side_dep|=0
|side_infl|=0
Found 12771 contexts for 1709 functions. Top 5 functions:
64 contexts for entry state of longCommandWArg (169760) on programs/zstdcli.c:428:1-434:1
58 contexts for entry state of FSE_readNCount_bmi2 (24917) on lib/common/entropy_common.c:234:1-245:1
58 contexts for entry state of FSE_readNCount_body (170219) on lib/common/entropy_common.c:69:1-215:1
58 contexts for entry state of FSE_readNCount_body_default (170220) on lib/common/entropy_common.c:218:1-223:1
57 contexts for entry state of BITv05_reloadDStream (175604) on lib/legacy/zstd_v05.c:857:1-884:1
Memory statistics: total=838726.42MB, max=1074.12MB, minor=838637.35MB, major=4296.13MB, promoted=4207.05MB
minor collections=399930 major collections=37 compactions=0
Notably, var_eq causes an extreme number of contexts to appear compared to without it, so that's probably the source of non-termination. I haven't yet tried looking into what variable equalities end up collecting into those contexts.
The issue is avoided using `--set ana.ctx_insens[+] var_eq`. Then the analysis terminates reasonably and the previously problematic inner accesses get the required symbolic lock.
By the way, the bigger-benchmarks confs don't declare `ZSTD_customMalloc` and `ZSTD_customCalloc` as malloc wrappers, so all the allocations are joined into a single constraint unknown with a supertop value. So any pointers written to allocated structs are just forgotten, and the functions they point to wouldn't get called later when dereferenced. This could have big implications for soundness...
Using goblint/analyzer#690, it's actually possible to get rid of all the spurious races now.
After all the unsoundness saga and the subsequent precision improvement attempt, it turns out that zstd has some intricate reasons for race-freedom, which Goblint is far from being able to handle: goblint/analyzer#707 (comment).
I am getting this error when trying to run with efficiency.py: Fatal error: exception Failure("No suitable function to start from.")
This is the config file I am using: https://github.com/TimOrtel/analyzer/blob/175cdf05c2f4209503bf02e81ae7a06259e89e73/conf/min_incr_zstd.json
Am I missing some kind of config option?
The most likely culprit would be that the compilation database is not generated correctly. Have you checked the contents of the `compile_commands.json`?
Compile commands is empty: []
prepare.log has the following lines:
==> building with threading support
==> building zstd with .gz compression support
==> no liblzma, building zstd without .xz/.lzma support
==> no liblz4, building zstd without .lz4 support
LINK obj/conf_....../zstd
zstd build completed
You probably need to do `make clean` and recreate the compilation database. If `make` has everything cached and does nothing, then the compilation database will not contain anything.
This cannot be the case, because the script efficiency.py clones zstd from git on every execution. Therefore make is executed every time in a fresh zstd folder.
I think @stilscher is the one best equipped to help you here, she did the benchmarking for zstd for our last submission.
I was not able to reproduce the problem. I ran `python3 efficiency.py /absolute/path/to/the/analyzer 1` on the branch of your fork that contains the scripts (I fixed the make error by substituting `Seq.exists` in `detectRenamedFunctionsRecursive.ml`, because it does not exist in the included module). `prepare.log` does not contain any error messages, a compilation database is created, and a main method is found and analyzed.
Which compiledb version are you using?
Perhaps I am using the wrong compiledb. I installed mine through pip. The version is 0.10.1. Is there another compiledb tool?
Not that I know of. I am using the same version, also installed through pip.
Are you sure you are on the correct branch? I don't get this make error. Perhaps we also have different opam switch configurations?