bazelbuild/bazel

Unless "header.h" is declared in `srcs`, target won't be rebuilt on change (with no warning)

Closed this issue · 20 comments

Given a simple test:

cc_test(
  name = "foo",
  srcs = ["foo.cpp"],
)

... where foo.cpp is:

#include "dummy.h"
...
void main(void) ...

Changing dummy.h will not result in a rebuild of foo (even when sandboxing is enabled) unless dummy.h is also declared in srcs. Unfortunately, Bazel not issue a warning if dummy.h is not included in srcs, allowing for silent build corruption.

Is this expected? It seems to go against the spirit of Bazel. System includes are specifically exempted from undeclared inclusion warnings in CROSSTOOL (see cxx_builtin_include_directory), so I'd expect a warning for undeclared local headers. Similarly, I expect that Bazel notes all the files the compiler opens, not just system includes.

Is this on Linux or Mac?

On Monday, 21 September 2015, kayasoze notifications@github.com wrote:

Given a simple test:

cc_test(
name = "foo",
srcs = ["foo.cpp"],
)

... where foo.cpp is:

#include "dummy.h"
...
void main(void) ...

Changing dummy.h will not result in a rebuild of foo (even when
sandboxing is enabled) unless dummy.h is also declared in srcs.
Unfortunately, Bazel not issue a warning if dummy.h is not included in
srcs, allowing for silent build corruption.

Is this expected? It seems to go against the spirit of Bazel. System
includes are specifically exempted from declaration warnings in CROSSTOOL,
so I'd expect local sources not to be. Similarly, I expect that Bazel
watches all the files the compiler opens.


Reply to this email directly or view it on GitHub
#473.

It occurs on both.

Well, it's true that you need to put the file in srcs (or hdrs). What's not working is that this exact case should be caught by either sandboxing or hdrs_check (that last one is unfortunately not set to strict by default, which it ought to be).

I would expect that even with hdrs_check = "loose" incremental rebuilds would occur correctly when an undeclared header changes. This assumes that Bazel monitors all header files used by the compiler, either via sandboxing or strace() or FUSE, and stores dependent header information in its own database. This is the approach taken by other tools like ElectricAccelerator (http://electric-cloud.com/products/electricaccelerator), but perhaps the overhead is too great, and this is why hdrs_check="loose" exists in the first place. If this is true, please consider adding a very large warning to the documentation that hdrs_check = "loose" breaks incremental rebuilds.

However, in practice, it turns out that neither a warning nor a correct rebuild occur even with hdrs_check = "strict"

I traced this down to Constants.HARD_DISABLE_CC_INCLUDE_SCANNING being set to true. Changing that to false means Bazel rebuilds correctly with hdrs_check = "loose" and gives an error with hdrs_check = "strict" as expected. Does anybody know why C++ include scanning is disabled (possibly only in the open source version)?

It's disabled in the open source version because we don't have an include
scanner implementation. I still don't understand why it doesn't rebuild
correctly, or doesn't give an error if the input file is undeclared.

On Mon, Oct 5, 2015 at 5:42 PM, bsilver8192 notifications@github.com
wrote:

I traced this down to Constants.HARD_DISABLE_CC_INCLUDE_SCANNING being set
to true. Changing that to false means Bazel rebuilds correctly with hdrs_check
= "loose" and gives an error with hdrs_check = "strict" as expected. Does
anybody know why C++ include scanning is disabled (possibly only in the
open source version)?


Reply to this email directly or view it on GitHub
#473 (comment).

I'm not sure what you mean by "include scanner implementation". There is code in the open source version which examines the .d files from the compiler, but is disabled because of the setting of that constant.
When Constants.HARD_DISABLE_CC_INCLUDE_SCANNING is set to false, DependencySet is used to parse the .d files from the C/C++ compiler. CppCompileAction.validateInclusions then checks to make sure all of the included files are declared as inputs with hdrs_check = "strict" or adds them as inputs at the next rebuild with hdrs_check = "loose".

Forcing Constants.HARD_DISABLE_CC_INCLUDE_SCANNING to false makes Bazel correct again, at least in my limited testing.

In my opinion, this should be a release blocker, and a unit test added to cover this case. If a build tool can't do incremental builds correctly, it is dangerous to use. As a Xoogler, I pushed hard for adoption of Bazel within my organization, but when coworkers found this bug, it made me look like... a bit of a jackass. While C++ modules may be the holy grail, C++ dependency scanning should still work for non-modularized codebases.

This seems like a pretty serious correctness issue. Any update on whether this is as trivial to fix as flipping that constant or if that causes other subtle problems?

This might be a regression too. I remember Bazel catching this error before (with --hdrs_check=strict), but I see no way the current version can ever do that.

It is indeed a regression, created by this commit by philwo@: dd16491

That change also includes some confusing code reformatting, but all it does is turn off this feature according to Constants.HARD_DISABLE_CC_INCLUDE_SCANNING, so likely flipping that bit makes Bazel as correct as it was before.

I could imagine sandboxing eventually replacing legacy include scanners, but it doesn't seem like Bazel is there yet (at least for C++ code); sandboxing is not yet supported on all platforms, to boot.

Good catch. I think writing @philwo will notify him so maybe he can comment on why that got disabled.

I'll have a look at this issue tomorrow. You are absolutely right, in that the following conditions should hold, no matter whether you use the standalone or the sandboxed strategy:

  • When header files listed in the srcs or hdrs of a target have changed, Bazel should rebuild the target.
  • When you forget to add header files to srcs / hdrs that are required by the target, the build should fail. (In case the sandbox is used, it will fail, because the unlisted files are simply not there, in case the sandbox is not used, it will fail, because Bazel does an after-the-fact check which include files were accessed [--hdrs_check=strict].)

There should also be a test that catches if it doesn't work as just described, of course. Commit dd16491 was not intended to affect the two above mentioned conditions or the behavior of hdrs_check. It merely disables a feature that is not available in Bazel. If it caused the regression you've seen, this is unintended and will be fixed.

AFAIK we have no plans to remove hdrs_check, as it's a helpful check on systems where sandboxing is not available. I'll also ping a colleague tomorrow to make sure the user manual is up to date regarding hdrs_check.

Thanks for taking a look at this.

When using --hdrs_check=loose or --hdrs_check=warn, it looks like Bazel tries to handle incremental rebuilds correctly even when header files are missed from srcs and hdrs. In my limited testing (with HARD_DISABLE_CC_INCLUDE_SCANNING flipped), that seems to work. I don't currently want to use that, but it seems like a potentially useful (and somewhat surprising) feature which people might use if it's documented.

Related question this brings up: is C++ compilation supposed to use the sandbox? My experience has been that even on systems which do support sandboxing, C++ compilation done by Bazel is never sandboxed. I did my testing related to this issue on a system which uses the sandbox, and Bazel happily compiled C++ files which include headers not declared as inputs in any way.

I can confirm that with Bazel HEAD, neither hdrs_check nor sandboxing work for C++ compilation and links. Luckily, according to my tests, (re-)builds are still correct when all inputs are specified in srcs / hdrs, but when they're not, Bazel currently does not alert you to this fact and does not rebuild targets if non-specified files changed.

As this is a serious regression & possible correctness issue, I'll fix this ASAP. I'll keep you updated. Hopefully I have a full fix & tests for this by tomorrow.

Please don't rely on hdrs_check=loose or warn, even if it seems to work - these should eventually go away. The "auto detection" behavior will certainly not work with sandboxing or any upcoming remote execution strategy and should really be avoided.

So, I now have a fix for this that:

  • Makes C++ compilation finally work with sandboxing.
  • Makes hdrs_check work again for when you can't / don't want to use sandboxing.
  • Adds tests for everything.

There's one remaining issue that I have to fix tomorrow morning, then it goes to code review and then live. Will keep you updated.

Update: I've submitted the temporary workaround of flipping the HARD_DISABLE_CC_INCLUDE_SCANNING constant to false again, while the more complex fix that really fixes the root cause is in review. This makes hdrs_check work again, while sandboxing will still not work (it silently falls back to non-sandboxed execution). The change should get pushed to the repository later today.

This bug still exists, but in more restricted form. It reproduces only when:

  1. Sandboxing is disabled (or building on a non-sandbox-supported platform, like OS X)
  2. Include files are inside a directory referenced with includes, e.g. includes = ["."]

Could you please reopen?

Sadly, I cannot enable sandboxed builds because of #420 (and because Mac OS X).

Use of cc_inc_library() results in correct incremental compilation, and so is a resonable workaround for includes = ["."]; however, it will not result in the reporting of undeclared includes.

The followup issue is tracked in #1162.