Preprocessing LLVM IR?

Question

adrianherrera opened this issue 2 years ago · 3 comments

Hi again! 👋

The cclyzer++ docs do a great job explaining unsupported language features. Given this, is there a recommended set of passes that should be run over the LLVM IR before running cclyzer++? For example, the clam abstract interpreter provides the clam-pp tool to get the IR into a state amenable to clam's abstract interpreter. Is it necessary/preferable to do something similar before running the fact-generator/static analysis? For example, one of the unsupported features is exception handling (e.g., via the resume and landingpad instructions). However, if you preprocess the IR with LLVM's LowerInvoke pass, these instructions go away.

Is this a useful feature? Are there any performance benefits to be gained from this?
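For concreteness, the kind of preprocessing step described above could be scripted roughly as follows. This is only a sketch under assumptions: the helper names (build_opt_command, preprocess) are hypothetical, the pass name is spelled lowerinvoke here but newer LLVM releases may spell it lower-invoke (check opt --print-passes for your version), and simplifycfg is included on the assumption that it cleans up blocks left unreachable once invokes become calls.

```python
import subprocess
from typing import List, Sequence


def build_opt_command(
    input_ir: str,
    output_ir: str,
    passes: Sequence[str] = ("lowerinvoke", "simplifycfg"),
    opt: str = "opt",
) -> List[str]:
    """Build an `opt` invocation that runs the given passes over an IR file.

    The default pass names are assumptions; adjust them to match the
    spelling used by the installed LLVM version.
    """
    pipeline = ",".join(passes)
    return [opt, f"-passes={pipeline}", input_ir, "-o", output_ir]


def preprocess(input_ir: str, output_ir: str, **kwargs) -> None:
    """Run the preprocessing pipeline; raises CalledProcessError on failure."""
    subprocess.run(build_opt_command(input_ir, output_ir, **kwargs), check=True)


# Example (requires `opt` on PATH):
#   preprocess("program.bc", "program.pp.bc")
```

The preprocessed file (program.pp.bc in the example) would then be the input to the fact generator instead of the original bitcode.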

Answer 1 · 2023-03-14T13:47:15.000Z

I hadn't really thought about it! You can pass -fno-vectorize to Clang to avoid some of the vectorization issues; we should recommend that in the docs if we don't already. I hadn't heard of LowerInvoke, but it sounds useful! I'm amenable to adding that to the docs as well.
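As a rough illustration of the -fno-vectorize suggestion, a build script might assemble the Clang invocation like this. The helper name build_clang_command and the file names are hypothetical. Note that -fno-vectorize only turns off the loop vectorizer; whether also passing -fno-slp-vectorize helps for a given program is an assumption worth testing, so it is left as an optional extra flag.

```python
from typing import List, Sequence


def build_clang_command(
    source: str,
    output_bc: str,
    extra_flags: Sequence[str] = (),
    clang: str = "clang",
) -> List[str]:
    """Build a Clang invocation that emits LLVM bitcode without loop vectorization."""
    return [
        clang,
        "-emit-llvm",      # produce LLVM IR rather than native object code
        "-c",              # compile only, do not link
        "-fno-vectorize",  # disable the loop vectorizer
        *extra_flags,      # e.g. ["-fno-slp-vectorize"] (optional, an assumption)
        source,
        "-o",
        output_bc,
    ]


# Example: pass the resulting list to subprocess.run(cmd, check=True)
#   cmd = build_clang_command("program.c", "program.bc")
```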

Lastly, I'll say that the effect of the current sources of unsoundness depends a lot on what exactly you care about analyzing. For a lot of our use-cases, exceptional data-flows aren't critical (hence, we haven't prioritized fixing them). That being said, I'm of course not satisfied as long as there's obvious unsoundness.

Answer 2 · 2023-03-14T14:02:20.000Z

> Are there any performance benefits to be gained from this?

In general, the more unsound the analysis, the faster it goes 😄 So there are probably performance detriments.

Answer 3 · 2023-03-14T22:10:53.000Z

Thanks @langston-barrett!

> In general, the more unsound the analysis, the faster it goes 😄 So there are probably performance detriments

Hehe yeah ok makes sense.

I just submitted a PR to include some discussion on how to reduce the impact of these unsupported code constructs. Let me know what you think!