typetools/checker-framework

Line numbers for delomboked code

mernst opened this issue ยท 16 comments

Code that Lombok processes is not necessarily legal Java code. (Lombok does various transformations and writes out legal Java bytecodes.) To run the Checker Framework on code that Lombok processes, it is necessary to run the Checker Framework on the output of the delombok tool. Then, the Checker Framework error messages refer not to the user-supplied code, but to temporary directories with modified versions of what the programmer wrote. It is a hassle for programmers to determine where to fix the problems in their code. This hassle is a deal-breaker for some programmers, making them not want to use the Checker Framework.

A solution to this problem would be useful for all static analysis tools, not just the Checker Framework.

A possible solution is:

  • Create a mapping of line numbers from the delomboked code to the developer's source code.
  • Write a tool that post-processes static analysis output: it changes errors/warnings that refer to the delomboked code into errors/warnings that refer to the programmer-written code. Perhaps the postprocessed output would still contain the original warning in the delomboked source code, so the programmer can use it if necessary.

Here are two approaches to create the mapping:

  • Modify the delombok tool. It already has methods endPos() and lineEndPos() that might have some of the needed information. Currently they seem to be used primarily to make sure that comments are copied to the delomboked file.
  • Do textual processing of the original source code and/or the debug information in the .class file (which has correct line numbers), plus of the delomboked source code.

@mernst wrt. to To run the Checker Framework on code that Lombok processes, it is necessary to run the Checker Framework on the output of the delombok tool.... actually, I observe in my project (http://github.com/VirtusLab/git-machete-intellij-plugin/) that it's NOT necessary to run delombok tool for CF to see the output of Lombok annotation processor, including getters/setters, c'tors etc... or am I missing something?

Please take a look at https://github.com/VirtusLab/git-machete-intellij-plugin/blob/develop/build.gradle#L157, this is basically the differences that I observed b/w running CF on delomboked code vs just on the compilation trees produced by Lombok's annotation processor.

I'm not sure though how it's even possible that CF's annotation processor always seems to run after Lombok's annotation processor in our project... is it guaranteed somehow, or is it just by chance in or project (but possibly not in others)?

Also, #3356 is related to exactly this problem - thanks guys for addressing that! Note that it's easy to turn off @SuppressWarnings("all") generation for delomboked code with some flag to delombok, but NOT for the "on the fly" mode where Lombok's anno processor is not dumped to the disk before CF runs.

This is what I experienced, too. I'm using the Nullness and Optional Checkers (v3.16.0) on a codebase of about 10000 LOC, and the code contains several lombok (v1.18.20) annotations (@UtilityClass, @Slf4j, @*ArgsConstructor, @Data, @Value, @Getter, @Jacksonized, @SuperBuilder, @EqualsAndHashcCode, @ToString, @NonNull). There's also mapstruct (v1.4.2.Final) present, and the checkers print warnings about the mapstruct-generated code as well.

I'm building with maven-compiler-plugin, supplying the annotation processors in the following order:

<annotationProcessor>org.checkerframework.checker.nullness.NullnessChecker</annotationProcessor>
<annotationProcessor>org.checkerframework.checker.optional.OptionalChecker</annotationProcessor>
<annotationProcessor>org.mapstruct.ap.MappingProcessor</annotationProcessor>
<annotationProcessor>lombok.launch.AnnotationProcessorHider$AnnotationProcessor</annotationProcessor>

I compared these results with delomboking first. The only difference I observed was the line numbers. Most importantly, I did not see any missing warnings when not delomboking.

As a lombok contributor, I'm interested in cases where this setup does not work. Do we have a example for that? Maybe there is something lombok can do differently to improve the interplay with the Checker Framework?

I've experienced false negatives related to interaction between Lombok and the Checker Framework before. Unfortunately, I don't have a clean example (all of this is based on my recollection from ~2+ years ago, so things may have changed), but I'm just going to lay out what I remember/can recover from my old notes here. If you want to do a deeper dive, I should be able to put together a working example.

As I recall, the issue is that most Checker Framework checkers rely on the presence of javac Trees for checked code in their AnnotatedTypeFactorys. Code generated at compile time by Lombok sometimes doesn't have an associated Tree (only an Element), so a typechecker that relies on the presence of a Tree might fail to properly assign a type to the generated code. My notes aren't clear on exactly when this happens, either :(

My notes do point to some old, orphaned commits that were the experiments that convinced me that delomboking was necessary for the CF to remain sound. The last of those commits is kelloggm/object-construction-checker@daeec9c; the others are its parents kelloggm/object-construction-checker@a6b1d96, kelloggm/object-construction-checker@552348c, and kelloggm/object-construction-checker@60522e5. (All of these commits are part of a now-deprecated repository that became the Checker Framework's Called Methods Checker). I suspect that it would be possible to construct an example of a false negative caused by the interaction between Lombok and the CF from those for the modern Called Methods Checker, with a few hours of work; if you're interested in investigating this from the Lombok side, I'd be happy to do it.

As I have several @SuperBuilders in my project, I'll give it a try myself first by playing around with the Called Methods Checker. If I'm unable to find anything odd, it'd be great if you could try to construct an example.

@janrieke I'd also be interested to hear about your experience running the Called Methods Checker on Lombok builder code without de-lomboking. If you run into issues please let us know!

I just compared the results of the CalledMethodsChecker, but still found no differences between running it directly in combination with lombok 1.18.20, and delomboking first. It may well be that lombok now generates better or more complete AST elements; there have been some improvements in lombok in this regard recently.

However, I found some false positives with @SuperBuilder; the same code works for @Builder, so it's probably something in org.checkerframework.checker.calledmethods.builder.LombokSupport. I'll create a new issue for that when I found out more details.

So if you could construct an example that does not work directly, I'll try investigating it from the lombok side.

With a few small modifications, the old test I mentioned does still work with the current version of the Called Methods Checker. I opened a draft PR that should allow the problem to be reproduced: #4826. In particular, the CMC should issue a warning about the first test() method, because the required setters weren't called on the builder. You can compare this test to the version of the test that is delombok'd first: https://github.com/typetools/checker-framework/blob/master/checker/tests/calledmethods-lombok/LombokBuilderExample.java, which is run as part of ./gradlew CalledMethodsLombokTest by the CF's build system.

It's worth noting that the Called Methods Checker is the CF checker that's most aware of Lombok - it has special logic to understand what Lombok will do with an @Builder annotation, and it's possible that that is what is broken here. I can try to produce a similar example for the Nullness Checker (or some other checker), to help be sure that is not the issue. When I find one, I will post it in this thread as well.

I've pushed a similar example that uses the Nullness Checker to demonstrate the unsoundness to the same branch I mentioned in my previous comment (you should be able to see it looking at the same diff in #4826).

@janrieke

So if you could construct an example that does not work directly, I'll try investigating it from the lombok side.

A ping in case you missed it. @kelloggm created a test case reproduces the problem. We would greatly appreciate insight from the lombok side that would improve the integration between the Checker Framework and lombok.

No, I didn't miss it, just too much work and then vacation ๐Ÿ˜€
I hope to find some time next week.

Vacation is a great excuse. Thanks for the update, and thanks in advance for your investigation.

I haven't identified the exact cause of the issue, yet. However, the general problem seems to be that lombok may not have done all its modifications to the AST when the CF starts. Depending on what work has been done in which AP round (of both lombok and CF), this may or may not result in a correct CF analysis.

There was a similar issue in connection with mapstruct, where mapstruct could see some, but not all of lombok's generated code (see mapstruct/mapstruct#510). That problem was solved by changes both in lombok and in mapstruct:

  • mapstruct defers some of its processing until lombok has finished all of its work. (Some of those changes are here.)
  • lombok added a new lombok-mapstruct-binding AP, implementing a special mapstruct interface to let mapstruct know whether it has finished.

I'm getting more and more convinced this is probably the only way towards interoperability without delomboking.

Apologies for the spam, but this thread seems to be the de-facto site for discussion of Checker Framework / Lombok interaction.


I have done a lot of investigation over the past few weeks, and I think the Checker Framework and Lombok actually can be run together in a single invocation of javac. This avoids the need for delombok and provides correct line numbers.

I have put together a demo here --> https://github.com/Calvin-L/checker-framework-plus-lombok

The crux of the approach is to run the Checker Framework processors first. The demo project readme gives more information about why the order matters and why the Checker Framework has to be listed first.

Very interesting @Calvin-L! It'd be cool if in fact CF and Lombok could be run together.

I wonder if the interaction flakiness would be reduced further if at some point CF integrated into javac as a plugin rather than an annotation processor, like Error Prone does. With a complicated build configuration, it may not be so easy to ensure that in the end the Checker Framework processors end up sitting before Lombok in the processor path.