Performance problem in `type_compatible`
1396510211 opened this issue · 14 comments
I ran into a serious performance problem while working on a benchmark.
The test project: https://github.com/mirror/make
Expected behavior
Given the small size of the project, the analysis was expected to finish quickly.
Actual behavior
The analysis of this project had been running for an hour and was still not finished.
Troubleshooting Guide
I tried versions 0.1.0 through 0.7.0, and all of them had the same problem.
I've also tried cclyzer-souffle, which does not have this performance problem.
We look forward to your reply.
Can you provide more information about your experiment? What context-sensitivity setting did you use when invoking cclyzer++? Can you share the bitcode file?
Sorry for the unclear question. Unfortunately, I can't share the bitcode file due to security regulations.
Experiment
1. Generate the bitcode file. gllvm (https://github.com/SRI-CSL/gllvm) is used to produce it:
get-bc make
2. Generate facts. Signatures are disabled and context-sensitivity is set to insensitive:
factgen-exe --out-dir <fact-dir> --context-sensitivity insensitive make.bc
3. Synthesize and compile the analysis with souffle:
souffle --generate=subset.cpp datalog/subset.project
souffle-compile.py subset.cpp
4. Run the analysis:
subset --facts <fact-dir> --output <output-dir> -j <n>
The slowest relations (souffleprof rel output):
----- Relation Table -----
TOT_T NREC_T REC_T COPY_T LOAD_T SAVE_T TUPLES READS TUP/s ID NAME
29m .000s 29m 31s .000s .000s 58.2M 2.65B 32.7K R336 type_compatible_up_to_arg
22m .017s 22m .575s .000s .000s 1.42M 5.21B 1.07K R335 type_compatible
10m 10m .000s .000s .000s .000s 639K 6.44K 1.02K R210 pointer_index
49s 49s .000s .000s .000s .033s 13.7K 0 276 R308 subset_lift.alloc_subregion_ctx
49s 49s .000s .000s .000s .094s 44.4K 0 896 R305 subset_lift.alloc_contains_ctx
47s 47s .000s .000s .000s .105s 52.3K 0 1.10K R307 subset_lift.alloc_must_alias_ctx
44s 44s .000s .000s .000s .000s 10.0K 0 226 R66 class_type_constructor
30s 30s .000s .000s .000s .104s 53.1K 0 1.76K R306 subset_lift.alloc_may_alias_ctx
16s .000s 16s .044s .000s .000s 138K 6.20M 8.57K R337 type_compatible_up_to_field
9.72s .000s 9.71s .007s .000s .000s 36.3K 38.5K 3.73K R303 subset_gep.gep_points_to
9.29s .000s 9.29s .005s .000s .000s 23.6K 979K 2.53K R302 subset_gep.gep_indexes_from
5.98s .000s 5.05s .938s .000s .000s 6.44M 7.33M 1.07M R300 subset_gep._gep_type_compatible
4.29s .000s 4.29s .002s .000s .000s 11.2K 11.2K 2.61K R296 subset_cplusplus.is_init_by_ctor
2.61s .000s 2.60s .003s .000s .000s 23.4K 23.5K 8.99K R131 gep_constant_expr_points_to
2.35s .000s 2.35s .004s .000s .000s 50.9K 96.1K 21.6K R43 allocation_size
2.35s .000s 2.33s .012s .000s .000s 45.3K 408M 19.2K R326 subset_subobjects._alloc_subregion
1.93s .000s 1.93s .001s .000s .000s 17.0K 17.0K 8.82K R333 subset_subobjects.index_in_bounds
1.69s .000s 1.68s .013s .000s .000s 45.1K 3.04M 26.6K R249 static_subobjects._alloc_subregion
1.66s .000s 1.66s .001s .000s .000s 17.0K 17.0K 10.1K R254 static_subobjects.index_in_bounds
1.55s .000s 1.55s .000s .000s .000s 6 633 3 R284 subset.type_indication.ty_indication
The slowest rule, which produced an enormous relation of 58,294,980 tuples:
type_compatible_up_to_arg(?from,?to,(i+1)) :- type_compatible_up_to_arg(?from,?to,i), func_type_param(?from,i,?param1), func_type_param(?to,i,?param2), type_compatible(?param1,?param2).
Src locator: type-compatibility.dl [49:1-53:38]
----- Rule Versions Table -----
TOT_T NREC_T REC_T TUPLES VER
21m .000s 21m 3.37M
---------------------------------------------
20m .000s 20m 2.26M 1
@new_type_compatible_up_to_arg(?from,?to,(i+1)) :- type_compatible_up_to_arg(?from,?to,i), func_type_param(?from,i,?param1), func_type_param(?to,i,?param2), @delta_type_compatible(?param1,
FREQ RELSIZE ATOM
0 58294980 type_compatible_up_to_arg
0 26458 func_type_param
79s .000s 79s 1.10M 0
@new_type_compatible_up_to_arg(?from,?to,(i+1)) :- @delta_type_compatible_up_to_arg(?from,?to,i), func_type_param(?from,i,?param1), func_type_param(?to,i,?param2), type_compatible(?param1,
FREQ RELSIZE ATOM
0 -- @delta_type_compatible_up_to_arg
0 26458 func_type_param
Issue Updated.
Thanks, this will be very helpful for us to debug!
I had a chance to start looking into this. My initial run is still going, but I'm seeing fairly different behavior---it seems to be spending all its time in var_points_to rules, as I'd expect. Out of curiosity, what version of souffle did you use, @1396510211 ?
I'm also curious: what's the largest parameter number you see in func_type_param.csv.gz in your facts directory?
The version I'm using is souffle master, earlier than the 2.3 release, I guess. It is worth mentioning that different projects hit different performance bottlenecks, especially C++ projects, where the situation is particularly complicated because of STL, smart pointers, and templates. The profile shown above is from an internal C++ project; the make analysis was not run long enough to produce a profile, so none is shown for it.
I'm also curious: what's the largest parameter number you see in func_type_param.csv.gz in your facts directory?
The largest parameter number is 6. We restrict functions to no more than five arguments, so with the implicit "this" parameter that makes at most six.
I suggest that the predicates vector_compatible_types and type_compatible be re-implemented as souffle functors.
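For illustration, a functor-based version might look roughly like the sketch below. The functor name and the commented-out rule are hypothetical, not actual cclyzer++ code; the compatibility check itself would have to be implemented in C++ and linked in as a user-defined functor library:

```
// Hypothetical sketch only: the recursive, argument-by-argument check would
// move into a C++ user-defined functor that returns 1 when two types are
// compatible and 0 otherwise.
.functor is_type_compatible(from: symbol, to: symbol): number

// A rule that currently joins against type_compatible would instead call the
// functor as a constraint, e.g. (placeholder rule, illustrative only):
// callgraph_edge(?callee, ?callsite) :-
//    candidate_callee(?callsite, ?callee, ?declared_ty, ?actual_ty),
//    @is_type_compatible(?declared_ty, ?actual_ty) = 1.
```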
The context-insensitive analysis of make (at least the bitcode I got using gllvm) finished for me in around 155 minutes on a single thread and doesn't seem to have triggered this behavior. I'm using souffle 2.3 as installed using apt-get.
Without a test case, I suspect this will be difficult for us to debug. I would be surprised if performance was improved all that significantly by turning type_compatible into a functor unless there was a really large number of types involved. You could try creating a new .proj file that just includes type_compatible and related helpers (and outputs them) and run just that (it may take a while if you're encountering this performance issue) and see if the results are unexpected.
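In case it helps, a stripped-down project file could look roughly like the following sketch. The exact set of includes is an assumption (check what datalog/subset.project actually pulls in); type-compatibility.dl and the output directives are the essential part:

```
// Hypothetical minimal project file: include only the declarations and the
// type-compatibility rules, then output the relations of interest.
#include "schema.dl"
// assumed name for the type/fact declarations pulled in by subset.project
#include "points-to/type-compatibility.dl"

.output type_compatible
.output type_compatible_up_to_arg
```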
As I mentioned above, the profile shown is not from the make project, but I am curious whether this time cost is really necessary for the analysis of make projects. If not, how can we improve it?
The unification-based context insensitive analysis is quicker, and it is possible that higher context-sensitivity with either analysis could also be faster (depending on the program). I'm also not sure what you mean by "make projects"---forks of make? Or do you mean projects that use make as a build system? If the latter, this shouldn't impact the analysis time in particular.
In general, large programs with significant data structures or inter-procedural dataflow of pointers will be expensive to analyze. In our limited experience, C++ programs also tend to be rather expensive to analyze.
Thinking about this a bit more, a potential problem is that type_compatible computes a subset of the Cartesian product of types. If you have lots of types (particularly structs or function types), this can get big quite fast. How many types does your program have?
Assuming that is the problem, there are a couple of potential avenues to fix it. One would be to turn type_compatible into a functor, as @nzinfo suggests. A second would be to experiment with Soufflé's magic set transformation to try to compute type_compatible in a more on-demand fashion. A third would be to do that explicitly by introducing a relation that computes which types need to be checked for compatibility and using it as a filter on type_compatible (see the _ptr_reaches_to and _ptr_reaches_to_request idiom in cclyzerpp/datalog/points-to/signatures.dl, line 239 at b182252).
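To make that third option more concrete, here is a rough sketch of the idiom; every relation name below is a placeholder rather than an actual cclyzer++ definition:

```
// Placeholder request relation: every rule that consults type_compatible also
// emits the pair of types it is about to check.
.decl type_compatible_request(from: symbol, to: symbol)

type_compatible_request(?declared_ty, ?actual_ty) :-
   candidate_callee(_, ?declared_ty, ?actual_ty).   // placeholder producer

// Each existing rule defining type_compatible would then gain the request as
// an extra conjunct, so compatibility facts are only derived for pairs that
// something actually asked about (made-up body shown for illustration):
type_compatible(?from, ?to) :-
   type_compatible_request(?from, ?to),
   existing_compatibility_conditions(?from, ?to).   // stands in for the real rule body
```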
I think the way to try the second option would be passing --magic-transform="type_compatible, type_compatible_up_to_arg" to souffle during program generation. If you give that a try and it helps, let us know and we can add the relevant pragmas to the source. That list may also need to include the relations where type_compatible is queried and/or some further helper relations.
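My understanding is that the same setting can usually also be expressed as a pragma in the Datalog source, mirroring the command-line option; the exact spelling below is an assumption, so double-check it against the Soufflé documentation for your version:

```
// Assumed pragma form mirroring the --magic-transform command-line option.
.pragma "magic-transform" "type_compatible,type_compatible_up_to_arg"
```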