Performance problem in `type_compatible`
1396510211 opened this issue · 14 comments
I ran into a serious performance problem while working on a benchmark.
The test project: https://github.com/mirror/make
Expected behavior
Given the small size of the project, the analysis was expected to finish quickly.
Actual behavior
The analysis of this project had been running for an hour and was still not finished.
Troubleshooting Guide
I tried versions 0.1.0 through 0.7.0, and all of them had the same problem.
I've also tried cclyzer-souffle, which does not have this performance problem.
We look forward to your reply.
Can you provide more information about your experiment? What context-sensitivity setting did you use when invoking cclyzer++? Can you share the bitcode file?
Sorry for the unclear question. Unfortunately, I can't share the bitcode file due to security regulations.
Experiment
1. Generate the bitcode file. gllvm (https://github.com/SRI-CSL/gllvm) is used to produce it:
get-bc make
2. Generate facts. Signatures are disabled and context-sensitivity is set to insensitive:
factgen-exe --out-dir <fact-dir> --context-sensitivity insensitive make.bc
3. Synthesize and compile the analysis with souffle:
souffle --generate=subset.cpp datalog/subset.project
souffle-compile.py subset.cpp
4. Run the analysis:
subset --facts <fact-dir> --output <output-dir> -j <n>
The slowest relations (souffleprof rel output):
----- Relation Table -----
TOT_T NREC_T REC_T COPY_T LOAD_T SAVE_T TUPLES READS TUP/s ID NAME
29m .000s 29m 31s .000s .000s 58.2M 2.65B 32.7K R336 type_compatible_up_to_arg
22m .017s 22m .575s .000s .000s 1.42M 5.21B 1.07K R335 type_compatible
10m 10m .000s .000s .000s .000s 639K 6.44K 1.02K R210 pointer_index
49s 49s .000s .000s .000s .033s 13.7K 0 276 R308 subset_lift.alloc_subregion_ctx
49s 49s .000s .000s .000s .094s 44.4K 0 896 R305 subset_lift.alloc_contains_ctx
47s 47s .000s .000s .000s .105s 52.3K 0 1.10K R307 subset_lift.alloc_must_alias_ctx
44s 44s .000s .000s .000s .000s 10.0K 0 226 R66 class_type_constructor
30s 30s .000s .000s .000s .104s 53.1K 0 1.76K R306 subset_lift.alloc_may_alias_ctx
16s .000s 16s .044s .000s .000s 138K 6.20M 8.57K R337 type_compatible_up_to_field
9.72s .000s 9.71s .007s .000s .000s 36.3K 38.5K 3.73K R303 subset_gep.gep_points_to
9.29s .000s 9.29s .005s .000s .000s 23.6K 979K 2.53K R302 subset_gep.gep_indexes_from
5.98s .000s 5.05s .938s .000s .000s 6.44M 7.33M 1.07M R300 subset_gep._gep_type_compatible
4.29s .000s 4.29s .002s .000s .000s 11.2K 11.2K 2.61K R296 subset_cplusplus.is_init_by_ctor
2.61s .000s 2.60s .003s .000s .000s 23.4K 23.5K 8.99K R131 gep_constant_expr_points_to
2.35s .000s 2.35s .004s .000s .000s 50.9K 96.1K 21.6K R43 allocation_size
2.35s .000s 2.33s .012s .000s .000s 45.3K 408M 19.2K R326 subset_subobjects._alloc_subregion
1.93s .000s 1.93s .001s .000s .000s 17.0K 17.0K 8.82K R333 subset_subobjects.index_in_bounds
1.69s .000s 1.68s .013s .000s .000s 45.1K 3.04M 26.6K R249 static_subobjects._alloc_subregion
1.66s .000s 1.66s .001s .000s .000s 17.0K 17.0K 10.1K R254 static_subobjects.index_in_bounds
1.55s .000s 1.55s .000s .000s .000s 6 633 3 R284 subset.type_indication.ty_indication
The slowest rule, which produced an enormous relation of 58,294,980 tuples:
type_compatible_up_to_arg(?from,?to,(i+1)) :- type_compatible_up_to_arg(?from,?to,i), func_type_param(?from,i,?param1), func_type_param(?to,i,?param2), type_compatible(?param1,?param2).
Src locator: type-compatibility.dl [49:1-53:38]
----- Rule Versions Table -----
TOT_T NREC_T REC_T TUPLES VER
21m .000s 21m 3.37M
---------------------------------------------
20m .000s 20m 2.26M 1
@new_type_compatible_up_to_arg(?from,?to,(i+1)) :- type_compatible_up_to_arg(?from,?to,i), func_type_param(?from,i,?param1), func_type_param(?to,i,?param2), @delta_type_compatible(?param1,
FREQ RELSIZE ATOM
0 58294980 type_compatible_up_to_arg
0 26458 func_type_param
79s .000s 79s 1.10M 0
@new_type_compatible_up_to_arg(?from,?to,(i+1)) :- @delta_type_compatible_up_to_arg(?from,?to,i), func_type_param(?from,i,?param1), func_type_param(?to,i,?param2), type_compatible(?param1,
FREQ RELSIZE ATOM
0 -- @delta_type_compatible_up_to_arg
0 26458 func_type_param
Issue Updated.
Thanks, this will be very helpful for us to debug!
I had a chance to start looking into this. My initial run is still going, but I'm seeing fairly different behavior---it seems to be spending all its time in var_points_to rules, as I'd expect. Out of curiosity, what version of souffle did you use, @1396510211 ?
I'm also curious: what's the largest parameter number you see in func_type_param.csv.gz in your facts directory?
The version I'm using is souffle master, earlier than the 2.3 release, I guess. It is worth mentioning that different projects hit different performance bottlenecks, especially C++ projects, where the situation is particularly complicated because of STL, smart pointers, and templates. The profile shown above is from an internal C++ project; the make analysis was not run long enough to produce a profile, so none is shown for it.
I'm also curious: what's the largest parameter number you see in func_type_param.csv.gz in your facts directory?
The largest parameter number is 6. We restrict functions to no more than five arguments, so with the implicit "this" parameter that makes at most six.
I suggest that the predicates vector_compatible_types and type_compatible be re-implemented as souffle functors.
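For illustration, a functor-based version might look roughly like the sketch below. The functor name and the commented-out rule are hypothetical, not actual cclyzer++ code; the compatibility check itself would have to be implemented in C++ and linked in as a user-defined functor library:

```
// Hypothetical sketch only: the recursive, argument-by-argument check would
// move into a C++ user-defined functor that returns 1 when two types are
// compatible and 0 otherwise.
.functor is_type_compatible(from: symbol, to: symbol): number

// A rule that currently joins against type_compatible would instead call the
// functor as a constraint, e.g. (placeholder rule, illustrative only):
// callgraph_edge(?callee, ?callsite) :-
//    candidate_callee(?callsite, ?callee, ?declared_ty, ?actual_ty),
//    @is_type_compatible(?declared_ty, ?actual_ty) = 1.
```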
The context-insensitive analysis of make (at least the bitcode I got using gllvm) finished for me in around 155 minutes on a single thread and doesn't seem to have triggered this behavior. I'm using souffle 2.3 as installed using apt-get.
Without a test case, I suspect this will be difficult for us to debug. I would be surprised if performance was improved all that significantly by turning type_compatible into a functor unless there was a really large number of types involved. You could try creating a new .proj file that just includes type_compatible and related helpers (and outputs them) and run just that (it may take a while if you're encountering this performance issue) and see if the results are unexpected.
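In case it helps, a stripped-down project file could look roughly like the following sketch. The exact set of includes is an assumption (check what datalog/subset.project actually pulls in); type-compatibility.dl and the output directives are the essential part:

```
// Hypothetical minimal project file: include only the declarations and the
// type-compatibility rules, then output the relations of interest.
#include "schema.dl"
// assumed name for the type/fact declarations pulled in by subset.project
#include "points-to/type-compatibility.dl"

.output type_compatible
.output type_compatible_up_to_arg
```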
As I mentioned above, the profile shown is not from the make project, but I am curious whether this time cost is really necessary for the analysis of make projects. If not, how can we improve it?
The unification-based context insensitive analysis is quicker, and it is possible that higher context-sensitivity with either analysis could also be faster (depending on the program). I'm also not sure what you mean by "make projects"---forks of make? Or do you mean projects that use make as a build system? If the latter, this shouldn't impact the analysis time in particular.
In general, large programs with significant data structures or inter-procedural dataflow of pointers will be expensive to analyze. In our limited experience, C++ programs also tend to be rather expensive to analyze.
Thinking about this a bit more, a potential problem is that type_compatible computes a subset of the Cartesian product of types. If you have lots of types (particularly structs or function types), this can get big quite fast. How many types does your program have?
Assuming that is the problem, there are a couple of potential avenues to fix it. One would be to turn type_compatible into a functor, as @nzinfo suggests. A second would be to experiment with Soufflé's magic set transformation to try to compute type_compatible in a more on-demand fashion. A third would be to do that explicitly by introducing a relation that computes which types need to be checked for compatibility and using it as a filter on type_compatible (see the _ptr_reaches_to and _ptr_reaches_to_request idiom in cclyzerpp/datalog/points-to/signatures.dl, line 239 at b182252).
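To make that third option more concrete, here is a rough sketch of the idiom; every relation name below is a placeholder rather than an actual cclyzer++ definition:

```
// Placeholder request relation: every rule that consults type_compatible also
// emits the pair of types it is about to check.
.decl type_compatible_request(from: symbol, to: symbol)

type_compatible_request(?declared_ty, ?actual_ty) :-
   candidate_callee(_, ?declared_ty, ?actual_ty).   // placeholder producer

// Each existing rule defining type_compatible would then gain the request as
// an extra conjunct, so compatibility facts are only derived for pairs that
// something actually asked about (made-up body shown for illustration):
type_compatible(?from, ?to) :-
   type_compatible_request(?from, ?to),
   existing_compatibility_conditions(?from, ?to).   // stands in for the real rule body
```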
I think the way to try the second option would be passing --magic-transform="type_compatible, type_compatible_up_to_arg" to souffle during program generation. If you give that a try and it helps, let us know and we can add the relevant pragmas to the source. That list may also need to include the relations where type_compatible is queried and/or some further helper relations.
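My understanding is that the same setting can usually also be expressed as a pragma in the Datalog source, mirroring the command-line option; the exact spelling below is an assumption, so double-check it against the Soufflé documentation for your version:

```
// Assumed pragma form mirroring the --magic-transform command-line option.
.pragma "magic-transform" "type_compatible,type_compatible_up_to_arg"
```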