commercialhaskell/stack

Respect TemplateHaskell addDependentFile dependency changes

Closed this issue · 21 comments

For example, when building stack itself.

  • Make a local change
  • stack build
  • gitrev writes the version as "dirty"
  • commit
  • stack build

What should happen:

  • stack recognizes that the dependency changed and rebuilds accordingly

What currently happens:

  • stack does not recognize that the dependency changed

I just did some research on this, and the way to handle this is to call ghc --show-iface on the generate .hi files, and glean relevant information from them.

Here's my intended approach to working on this.

When formulating a build plan (a specific arch, ghc, lts are selected)

  1. locate the place(s) where "get source files for this local package'" is calculated
  2. locate .hi files (if present) for the .hs/.lhs files already discovered
  3. calculate additional file dependencies
    • 3.1 for TH, it's just file names, relative to that package's root (I don't believe there's such a thing as extra-source-dirs?)
    • 3.2 for module dependencies not already accounted for, look for these in any of the relevant hs-source-dirs
  4. print warnings about these additional dependencies if they are not already accounted for in the cabal file, suggest adding them to the cabal file with copy/pasteable output. (extra-source-files for 3.1, other-modules for 3.2)
  5. return the augmented list of source files

As I understand it, the build cache should smoothly take it from there, as long as we tell it about these extra files.

Some nuances and thoughts regarding:

  • locating .hi files:
    getInstallationRoot gets us to .stack-work/install/$arch/$lts/$ghc/.
    From there we need to drill down into lib/$arch-ghc-$ghc/$package-$packageVer/,
    and the .hi files for that package are in module hierarchy locations from there.
  • dependencies of dependencies
    Steps 2 thru 5 may need to be repeated until no new file dependencies are discovered.
    Limit N repeats for sanity?
  • .hi files absent prior to first build
    The build process should work correctly anyways since it is a clean build.
    I'm guessing subtle bugs may creep in here.
    The warnings, as outlined above, won't trigger unless we add a post-build check.
  • first time build cache
    The first build cache won't include these extra files, which may lead to extra rebuilding.
    This is a second reason to add a post-build check: it can update the build cache at that time.
  • build cache details
    We might need some extra way of marking a module as "needs rebuild due to dirty file dependency". I'm not quite sure what the implementation details are here.

I might stub out the "module dependencies" part (3.2) if it is tricky.

One shortcoming of this approach is that it only works for a library target. I can't find the .hi file for the main-is in an executable target.

Never mind. The .hi file appears to be in .stack-work/dist/$arch/Cabal-$cabalLibVer/build/$package/$package-tmp/. I'm not sure how reliably persistent this is, but it is mildly annoying that it is in a different location.

Package files are gotten in packageDescFiles.

Other notes:

  • beware of circular dependencies
  • the "repeat" procedure only needs to be calculated on the new files, not the old ones

We need to also consider performance here. Having GHC dump the contents of
a hi file may be non trivial, in which case caching that information when
the files haven't changed is a good idea.

On Thu, Jul 16, 2015, 11:01 AM Dan Burton notifications@github.com wrote:

Other notes:

  • beware of circular dependencies
  • the "repeat" procedure only needs to be calculated on the new files,
    not the old ones


Reply to this email directly or view it on GitHub
#105 (comment)
.

Discussion in PR #734.

This still seems to be happening for multiple-package projects.

I'm still experiencing this issue with a trivial single-executable project. I've uploaded a simple reproduction.

The example doesn't work correctly on my machine, if the correct stack behavior is to build twice and thus produce two different hashes. @snoyberg, you may have run it wrong, I think the author accidentally committed thing to the repo :)

You're right, I was getting confused by the repo. Nonetheless, the bug isn't in Stack, it's in gitrev, which seems to be mishandling whitespace. I'll open up an issue. (This is also why we generally ask for minimal reproducing test cases, to ensure that the bug is really in Stack and not some other library.)

My apologies for the poor quality of the repro. The gitrev PR fixes this issue for me.

nh2 commented

Possibly relevant for people who read this issue: acfoltzer/gitrev#17

Document the fact that addDependentFile on .git/index can lead to unnecessary recompilation

nh2 commented

I'm thinking this whole git logic might sit much better in a buildHook or a similar user hook.

But a question @snoyberg / @DanBurton:

If such a hook was used, would that still work with stack's own recompilation avoidance? Because if stack doesn't run things like buildHook in all cases, then putting the logic there wouldn't work for stack.

No idea, I've never really understood how user hooks are triggered in Cabal. But I'd stay far away from a solution that uses any kind of custom Setup.hs file. I'd rather just recompile every module every time I hit build than that (only half exaggerating).

nh2 commented

No idea, I've never really understood how user hooks are triggered in Cabal.

As far as I can tell, buildHook and preprocessors are run on very build, so that's fine. My question was more about Stack's extra recompilation avoidance in multi-project setups, where (IIRC) it doesn't even configure when it notices (via mtime) that nothing has changed.

I'd rather just recompile every module every time

@snoyberg This is literally what's happening for me when using only TH and no hook, but I'm having a hard time with it because it just makes iterating so slow. So right now I'm in favour of the hook.

nh2 commented

@mgsloan Has confirmed in chat with me that indeed stack's own recompilation avoidance will prevent hooks from running if files on disk didn't change. That means that right now, using only a buildHook won't work with stack, it will not be run if no file tracked by stack is changed.

nh2 commented

More discussion:

@mgsloan Right now when using gitrev and its TH instead of a hook, stack knows when it has to recompile because it tracks the addDependentFile ".git/index" done by gitrev. Are there other ways to get a file into the list of files watched by stack beyond addDependentFile?

Because I think the solution would be to put .git/index into this list, but not via addDependentFile, so that only stack knows about that file but not GHC. Then when a commit is made and index changes, stack will know that it has to rebuild, it will run the Setup.hs hooks, and the hook will correctly discover whether or not an actual commit was made.

@mgsloan said:

Yeah I suppose that would work. Perhaps extra-source-files would do the trick.