snowleopard/hadrian

Get rid of ghc-cabal and package-data.mk

snowleopard opened this issue · 67 comments

A large part of the build system is dedicated to dancing around package-data.mk files containing package-related data (such as package name, version, dependencies, etc.). These files are generated by utils/ghc-cabal program, which also needs to be built in a non-trivial way. See Rules/Data.hs and Oracles/PackageData.hs in particular.

Our long term plan is to eliminate this and get rid of ghc-cabal and package-data.mk files altogether. See Rules/Cabal.hs where Distribution.Package is used instead of ghc-cabal to extract package dependencies directly from .cabal files.
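For illustration, extracting dependencies directly with the Cabal library can be sketched as follows (a minimal sketch, assuming a Cabal version with the parsec-based parser, i.e. >= 2.0; it only looks at the library component and ignores conditional sections):

```haskell
import Distribution.PackageDescription (condLibrary)
import Distribution.PackageDescription.Parsec (parseGenericPackageDescriptionMaybe)
import Distribution.Types.CondTree (condTreeConstraints)
import Distribution.Types.Dependency (Dependency)
import qualified Data.ByteString.Char8 as BS

-- Extract the library's build-depends entries from the contents of a
-- .cabal file, ignoring conditional sections for simplicity.
libraryDeps :: BS.ByteString -> [Dependency]
libraryDeps src =
    case parseGenericPackageDescriptionMaybe src of
        Nothing  -> []
        Just gpd -> maybe [] condTreeConstraints (condLibrary gpd)
```

This avoids both building ghc-cabal and parsing package-data.mk: the build system reads the .cabal file itself.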

A potential solution will need to be carefully thought through and discussed. This is the thread to do this.

Let me formulate the rationale behind this more clearly:

  • Separation of build artefacts from sources: ghc-cabal creates package-data.mk files in a fixed location inside a package directory. This prevents us from moving build artefacts outside the source tree #113.
  • Performance: ghc-cabal always runs expensive configure scripts, significantly affecting the performance of the whole build system. On top of that, parallel invocations of ghc-cabal are currently broken.
  • Complexity: Having a standalone ghc-cabal program built in a non-trivial way and package-data.mk files that need to be parsed significantly adds to the complexity of the build system and hence makes it more difficult to understand and maintain. Right now this is undoubtedly the most complicated and unreliable part of the build system.

Another example of accessing the information stored in *.cabal files directly: cf825fe

My understanding is that we would basically absorb ghc-cabal into the Shake-based build system, instead of shelling out to it?

@angerman Yes, that's the intention.

This is now being attempted in the ghc-cabal branch. I merged @bgamari's PR #108 and am now trying to compile. Something is wrong with deriving Generic.

@ndmitchell @angerman I'd appreciate your help with ghc-cabal branch if you've got time, as I won't be able to contribute much in the next 24h. The goal is to compile it and make sure it doesn't break the build. I'll then take over and use it to finalise #113.

I'm not at a machine with MSYS/Mingw in the next 24 hours, so I'm not much use I'm afraid.

I'll give this a try.

Alright, I've tried to give this a go, but I think this is not possible; I believe we run into this: https://ghc.haskell.org/trac/ghc/ticket/10514. It seems we simply cannot derive Generic for GADTs.
That is, I couldn't get even this to compile:

data X a where
    A :: X Bool
    B :: X [String]

using deriving or standalone deriving.

  • deriving errors out with:

    Can't make a derived instance of ‘Generic (X a)’:
      Constructor ‘A’ has existentials or constraints in its type
      Constructor ‘B’ has existentials or constraints in its type
      Possible fix: use a standalone deriving declaration instead
    In the data declaration for ‘X’
    
  • standalone deriving errors out with:

    Can't make a derived instance of ‘Generic (X a)’:
      A must be a vanilla data constructor, and
      B must be a vanilla data constructor
    In the stand-alone deriving instance for
      ‘Generic a => Generic (X a)’
    

While I do not see Generic explicitly in @bgamari #108, even deriving instance Hashable (X a) wants Generic (X a).

Can't you just write your own Hashable by hand? It shouldn't be too difficult.

Did I ever mention I have no idea what I'm doing? ;-)

We also need Binary. Let me see if I manage to build Hashable. There is however https://hackage.haskell.org/package/instant-hashable

You can do deriving Show on a GADT, and then you can just make Hashable by doing hashWithSalt salt x = hashWithSalt salt $ show x. For Binary you need to undergo a bit more pain, and do it by hand.

This is what we have in Way.hs. Does this help?

-- Instances for storing in the Shake database
instance Binary Way where
    put = put . show
    get = fmap read get

instance Hashable Way where
    hashWithSalt salt = hashWithSalt salt . show

instance NFData Way where
    rnf (Way s) = s `seq` ()

@ndmitchell thanks!

@snowleopard if we had Read, that would work.

Hmm. We could of course just go ahead and implement Read for each and every single one on our own ...

@snowleopard, if you're going to do that, you might as well do: instance NFData Way where rnf = rnf . show, which again is entirely relying on read/show. Not sure how hard it is to do Read for a GADT, or if it's even possible with the given type signature.

You can (kind of) read a GADT, but the type index must be existentially hidden:

data Hidden :: (* -> *) -> * where
  Hide :: gadt index -> Hidden gadt

So if you have

data MyGadt :: * -> * where
  C0 :: MyGadt ()
  C1 :: a -> MyGadt a

You can write a Read instance on Hidden MyGadt.
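A concrete sketch of that trick, written by hand (names are illustrative; the payload of C1 is fixed to Int here, since a fully polymorphic payload cannot be parsed back):

```haskell
{-# LANGUAGE GADTs, KindSignatures #-}
import Text.Read (readMaybe)

-- The GADT index is existentially hidden behind Hide, so a single
-- parser can return whichever constructor (and index) the input denotes.
data MyGadt :: * -> * where
    C0 :: MyGadt ()
    C1 :: Int -> MyGadt Int

data Hidden :: (* -> *) -> * where
    Hide :: gadt index -> Hidden gadt

showMyGadt :: MyGadt a -> String
showMyGadt C0     = "C0"
showMyGadt (C1 n) = "C1 " ++ show n

-- A hand-rolled Read-like parser on the wrapper.
readHidden :: String -> Maybe (Hidden MyGadt)
readHidden s = case words s of
    ["C0"]    -> Just (Hide C0)
    ["C1", n] -> Hide . C1 <$> readMaybe n
    _         -> Nothing
```

Pattern matching on the result (e.g. on Hide (C1 n)) recovers the index locally, which is what makes this usable despite the existential.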


I wonder if the GADTs are an essential part of this patch, or could be removed? They seem to be causing a lot of problems.

I wonder if the GADTs are an essential part of this patch, or could be removed? They seem to be causing a lot of problems.

Well. I must admit, I've hit a wall trying to get Binary to work.

I am pretty sure GADTs can be dropped at the cost of losing some type-safety.

But I hoped we'd be able to keep GADTs because they can be handy in other oracles too.

I guess it's all up to GADTs and Oracles compatibility. Which in turn means you need Binary, Hashable and NFData instances. Which are almost free if you have Generic. Yet for GADTs we don't have Generic, and we just opened the floodgates :-)

It looks like even @ndmitchell, the Oracle Master, is not sure how to cross oracles with GADTs.

So, here is a proposal: we simplify this branch by getting rid of GADTs but plan to bring them back at some point in future to improve type-safety.

Thus, we'll just oracle (pkg, stage) -> pkgData?

Yes, to the first approximation. Then we'll have to oracle separate fields to avoid unnecessary complete rebuilds when only one field has changed.
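A per-field oracle might look something like this in modern Shake (the key type and the fabricated result are hypothetical; a real implementation would read the field from the package's .cabal file):

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving, TypeFamilies #-}
import Development.Shake
import Development.Shake.Classes

-- Hypothetical key: one oracle query per (package, stage, field), so a
-- change to a single field only invalidates the rules that asked for
-- that field, instead of forcing a complete rebuild.
newtype PkgDataKey = PkgDataKey (String, Int, String)
    deriving (Show, Eq, Hashable, Binary, NFData)

type instance RuleResult PkgDataKey = String

pkgDataRules :: Rules (PkgDataKey -> Action String)
pkgDataRules = addOracle $ \(PkgDataKey (pkg, stage, field)) ->
    -- Fabricated value for illustration; a real oracle would parse
    -- the package's .cabal file here.
    return (pkg ++ "-stage" ++ show stage ++ "-" ++ field)
```

Replacing the tuple key with a GADT indexed by the field type would restore the type safety discussed above, once the instance problems are solved.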

Ah, that was the motivation for the GADT. I didn't see that. But then I seem to have missed something about the effects of oracles. So the idea was to oracle each field for each package and stage separately, to get fine-grained oracles, and to use GADTs to enforce type safety. 💡

I can try to give this another shot tomorrow, and check what the ghc-cabal branch looks like by then.

Thanks! I'll do some experiments in the meanwhile.

Alright, so it doesn't yet compile, but I think I'm making some headway and should get it to compile soon.

It compiles! But it does not yet fully work!
I'm getting as far as

Linking /Users/angerman/Projects/Haskell/ghc/shake-build/.shake/build ...
Reading shake-build/cfg/system.config...
/----------------------------------------\
| Copy file                              |
|     input: utils/hsc2hs/template-hsc.h |
| => output: inplace/lib/template-hsc.h  |
\----------------------------------------/
/---------------------------------\
| Copy file                       |
|     input: settings             |
| => output: inplace/lib/settings |
\---------------------------------/
/--------------------------------------------------------------------\
| Copy file                                                          |
|     input: includes/dist-derivedconstants/header/platformConstants |
| => output: inplace/lib/platformConstants                           |
\--------------------------------------------------------------------/
shakeArgsWith   0.002s    2%  =                        
Function shake  0.041s   45%  =====================    
Database read   0.001s    0%                           
With database   0.000s    0%                           
Running rules   0.048s   51%  =========================
Total           0.092s  100%                           
Error when running Shake build system:
* inplace/bin/hp2ps
TODO

See here: https://github.com/snowleopard/shaking-up-ghc/compare/ghc-cabal

@snowleopard should I turn this into a PR? So we can discuss it in the PR? (I think we can still all commit to ghc-cabal then :)

@angerman Great, thanks a lot! Yes, please turn this into a PR for ghc-cabal branch.

Alright there we go. To be quite honest, I'm not sure I have any idea what I did ;)

Anyway, when trying to fix hp2ps by replacing its | pkg == hp2ps = error "TODO" stub like so:

    | pkg == hp2ps = do
        includes <- interpretPartial target $ fromDiffExpr includesArgs
        pure $ emptyPackageData
            { pdCSources = [ "AreaBelow.c", "Curves.c", "Error.c", "Main.c"
                           , "Reorder.c", "TopTwenty.c", "AuxFile.c"
                           , "Deviation.c", "HpFile.c", "Marks.c", "Scale.c"
                           , "TraceElement.c", "Axes.c", "Dimensions.c", "Key.c"
                           , "PsFile.c", "Shade.c", "Utilities.c" ]
            , pdDepExtraLibs = ["m"]
            , pdCcArgs = includes
            }

in Internals.hs

I ran into the issue of having no idea yet how to translate PROGNAME.

A few things to note:

  • the build file was altered to include the Cabal library from the ghc tree (assuming ghc is one level up from shaking-up-ghc). This is needed, as the Cabal library has changed quite a bit.
  • I have not yet deleted Data.hs, as we might want to keep it around for reference on how to build the non-cabalized tools.
  • I have not yet dropped ghc-cabal.

Any comments on the PR are welcome, let me know how bad I did :)

@angerman Well, if it compiles you did much better than I, that I can tell already 👍 I will review the code tonight.

the build file was altered to include the Cabal library from the ghc tree (assuming ghc is one level up from shaking-up-ghc). This is needed, as the Cabal library has changed quite a bit.

This feels awkward, but I see the point. Is the plan to revert to using the bootstrapping Cabal library at some point in future or will we always depend on the in-tree Cabal?

P.S: Oh my, we created a monster! 👹

[  1 of 168] Compiling
[...]

This feels awkward, but I see the point. Is the plan to revert to using the bootstrapping Cabal library at some point in future or will we always depend on the in-tree Cabal?

Ideally, yes. But I couldn't be bothered trying to revert the componentId changes back to package and library id. I tried to use the most recent Cabal from Hackage, but even that was missing some of the functions. As ghc has Cabal as a submodule, ghc-cabal has to stay in sync with the external Cabal library. I hope we can stop using the in-tree Cabal once code like this:

display (localCompatPackageKey lib)

works with stock cabal.

P.S: Oh my, we created a monster! 👹

Well, we only shifted the compilation into the build system instead of into the ghc build itself. It does mean, though, that you build the build system once and ghc multiple times, so I'd say it's especially time-saving when building ghc multiple times. For building the build system + ghc, I assume there's not much difference yet. (Except for not having to deal with package-data.mk anymore.)

I'll see if I can move over more of the missing libs today.

Now hitting the following problem:

Error when running Shake build system:
* inplace/bin/ghctags.exe
Run the 'configure' command first.

I guess we are missing the behaviour that previously ran configure when executing ghc-cabal.

Well, we only shifted the compilation into the build system instead of into the ghc build itself.

Yes, sure. It's just impressive to see so many files being compiled. I never thought we'd hit a 3-digit number...

@snowleopard - I've written a 500+ Shake build system before - you have a while to go yet :)

Debugging this is really painful. @ndmitchell is shake missing a putDiagnostic? I assume I'm just doing it wrong, as putWhen is not exported.

@ndmitchell We might need to import the rest of GHC to catch up ;-)

@angerman I typically simply use putBuild or putSuccess for debug printing. Why do you need a special function?

@ndmitchell We might need to import the rest of GHC to catch up ;-)

I think we are on the right path here. Nirvana is near.

@angerman I typically simply use putBuild or putSuccess for debug printing. Why do you need a special function?

So we can leave it in, and have it only show for --trace.

  -V, --verbose, --trace      Print tracing information.

I think it might be helpful if we don't keep adding trace statements and then removing them again, only for the next person to add them back in :)

@angerman OK, I see your point. I didn't want to pollute the code with tracing statements and always cleaned up my debug prints afterwards. However, we may indeed consider adding proper tracing to the build system. I believe this calls for a separate ticket to discuss possible designs.

@snowleopard Right. I'll open an issue. The core idea was that make supports tracing as well, and it might even be useful from time to time.

@angerman: You definitely don't want diagnostic on most of the time, it's insanely verbose. There is chatty above Loud, which might be an option - your call. Adding your own putChatty seems reasonable.

@ndmitchell that's why I was wondering why putWhen was not exported from Development.Shake.Core, as that would allow using the same output methods.

@ndmitchell ohh and of course, I assume tracing would only be on if there is a serious issue with the build system, when trying to figure out what goes wrong where.

E.g. when one would run make -j8 -k --debug=b --debug=m TRACE=1.

I suggest you turn on your tracing at Chatty level. Adding putChatty is plausible, but I suggest you add it now - the whole putLoud functions seem to be causing confusion in Shake already so I may rethink them a bit.

In response to @snowleopard's thoughts at the top, I will observe (without really applying it to this instance) that all build systems are full of gunk that should be removed. To quote from http://neilmitchell.blogspot.co.uk/2014/07/converting-make-to-shake.html:

However, resist the temptation to "clean up" these pieces during the conversion

Not saying it should apply here, but be careful, convert and then refactor is sometimes easier than both at once.

@ndmitchell I agree. At the same time I think we can say that we already did the conversion step successfully: we copied the exact process with ghc-cabal and package-data.mk files from the old build system and it works. But now users are demanding the next step, refactoring, as several other issues are converging onto this one.

I've changed the milestone to tree-tremble. It looks like we won't settle this issue before first-shake, as I'd like to release it this week.

Hey Cabal dev here. Are there things we can do to help? I think this is definitely a place where we can take changes to the Cabal library to make your life easier.

Hey @ezyang!

Sure, any help would be great. Basically, we'd like to integrate Cabal into Hadrian in a more natural way: as a Haskell library, instead of calling the ghc-cabal executable. That would greatly simplify Hadrian, get rid of the stringly-typed, oracle-ified interface to ghc-cabal, and allow us to optimise (re)builds, e.g. when there is no need to re-run configure scripts. If you'd like to try this I'd be happy to help make sense of the Hadrian codebase.

I feel like we may have had this discussion before, but if you access Cabal directly as a library, how are you going to make sure Hadrian gets built against the most recent (bootstrap) version of Cabal? Assuming you have the correct version of Cabal, I think you just want a Shake rule for the configure step (not quite just configure though because we also need to run ./configure scripts), and then read out the things off of LocalBuildInfo. If you have a story for bootstrapping then I can probably do this part.

A more ambitious thing to do is completely Shake-ify Cabal's build system, and then slot in those rules into Hadrian. Probably not now!

how are you going to make sure Hadrian gets built against the most recent (bootstrap) version of Cabal?

We already use the Cabal library (see src/Rules/Cabal.hs) to determine GHC package dependencies, and we build Hadrian using the in-tree Cabal by passing -i../libraries/Cabal/Cabal. So I think this is a solved problem already. Have a look at the *.sh, cabal.project, hadrian.cabal and stack.yaml files.

I think you just want a Shake rule for the configure step (not quite just configure though because we also need to run ./configure scripts), and then read out the things off of LocalBuildInfo.

Yep, this sounds right.
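A rough sketch of such a rule, assuming both shake and the Cabal library are available; the runghc Setup.hs invocation, the dist directory layout, and the rule's target file are illustrative only:

```haskell
import Development.Shake
import Development.Shake.FilePath
import Distribution.Simple.Configure (getPersistBuildConfig)

-- Running a package's configure step produces dist/setup-config, which
-- we then read back as a LocalBuildInfo via the Cabal library.
configureRule :: FilePath -> Rules ()
configureRule pkgDir =
    (pkgDir </> "dist/setup-config") %> \_ -> do
        -- Cabal's autoconfUserHooks will run ./configure here if the
        -- package needs it.
        cmd_ (Cwd pkgDir) "runghc Setup.hs configure"
        -- LocalBuildInfo is now persisted on disk; read it back.
        lbi <- liftIO $ getPersistBuildConfig (pkgDir </> "dist")
        -- A real rule would now extract the needed package metadata
        -- from lbi; we just confirm it was read.
        liftIO $ putStrLn ("configured: " ++ show (length (show lbi) > 0))
```

Because the configure step is an ordinary Shake rule, it only reruns when its inputs change, which addresses the performance point above.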

So, I recently circled back to thinking about this problem.

The stated reason why we can't eliminate stage0/stage1/stage2 directories is that ghc-cabal hard-codes where it places package-data.mk files. But I don't see any reason why we couldn't add a new flag to ghc-cabal to output the data, e.g., to stdout, where we could slurp it up directly, solving point (1).

I reviewed the PR at #166 (is there more significant code?) and I did not see any reason to believe that issues (2) or (3) could be addressed without significantly expanding the scope of the Cabal changes. Indeed, for example, see #170 for an example of where you MUST modify the Cabal library code, as it makes assumptions about the current working directory which you cannot easily manage in process. Another example is the calls to ./configure: that is managed by autoconfUserHooks and if you want to do caching you are going to have to drill down into this code.

To be clear, it would be very nice if we could rewrite Cabal's build system (in the Cabal library) in Shake. It's something that I've had my eye on for a while. But in my eye, it makes sense to shave that yak first, and then integrate those rules with Hadrian (we'll have to work a bit to make sure the two build systems compose.)

@ezyang Thanks for your input!

But I don't see any reason why we couldn't add a new flag to ghc-cabal to output the data, e.g., to stdout, where we could slurp it up directly, solving point (1).

Yes, something like this is indeed possible, although ghc-cabal outputs not only package-data.mk files but also some autogen stuff. So, the new flag to ghc-cabal could specify the output directory for created files. Note though that running configure scripts in-tree causes other files to be built in-tree, which is still undesirable.

I reviewed the PR at #166 (is there more significant code?)

Yes, I believe this is the biggest effort on this.

Indeed, for example, see #170 for an example of where you MUST modify the Cabal library code, as it makes assumptions about the current working directory which you cannot easily manage in process. Another example is the calls to ./configure: that is managed by autoconfUserHooks and if you want to do caching you are going to have to drill down into this code.

Oh, I didn't realise that. So, you are saying the only way we can solve this issue is to change the Cabal library itself? That's definitely not what we had in mind.

To be clear, it would be very nice if we could rewrite Cabal's build system (in the Cabal library) in Shake.

I am a bit confused here. In this ticket we do not want to pursue this goal. All we want to do is to teach Hadrian to extract package metadata from .cabal files automatically, and perhaps occasionally run configure scripts of some of the packages that require that. We do not want to build anything using Cabal library, all building is (or rather should be) taken care of by Hadrian!

I am a bit confused here.

@snowleopard the idea is rather than Hadrian scraping text files, it could link Cabal directly and use it to generate shake rules---the same shake Cabal it itself would use. Very elegant!

Also see haskell/cabal#4047. The plan there would teach cabal-install about different GHCs for purposes of cross-compiling, but the same logic is equally applicable to bootstrapping. A Shaked-up Cabal naturally leads to a Shaked-up cabal-install, yielding the same slick integration as described above. [I believe it's also a goal to factor out the solver from cabal-install as its own library.]

the idea is rather than Hadrian scraping text files, it could link Cabal directly and use it to generate shake rules---the same shake Cabal it itself would use. Very elegant!

@Ericson2314 I see, this would be great and elegant, but how soon is it going to happen? Do we want to depend on this future Cabal feature in this project? My intuitive answer is No.

I think you (and @ezyang) are saying that it just doesn't make sense to invest time in solving this issue until we have a better Cabal library, which does sound reasonable. Perhaps, we shouldn't bother indeed.

This issue is not on the critical path for Hadrian to replace the old build system, so I'm fine if we decide not to pursue it until it becomes trivial to resolve with a shakified Cabal.

I'm not in a position to propose whether a stopgap is worth it; I just wanted to point out that eventually things can be really, really slick.

Do we want to depend on this future Cabal feature in this project? My intuitive answer is No.

Do you mean depend on linking Cabal before Cabal is rewritten with Shake? I agree that would be silly. I assume from the rest of your post that you would like to use it once Cabal is so rewritten.

Do you mean depend on linking Cabal before Cabal is rewritten with Shake?

@Ericson2314 Sorry, I didn't express myself well. I meant we don't want to wait for the shakified Cabal before releasing Hadrian as the main build system of GHC, i.e. we don't want to depend on it just yet.

I assume from the rest of your post that you would like to use it once Cabal is so rewritten.

Yes, it sounds promising.

All we want to do is to teach Hadrian to extract package metadata from .cabal files automatically, and perhaps occasionally run configure scripts of some of the packages that require that.

Right. And unfortunately, to get the metadata, we have to run the equivalent of a ./Setup configure step, which is a big blob of code which, among other things, assumes that the CWD is the same directory as the package.

I see, this would be great and elegant, but how soon is it going to happen? Do we want to depend on this future Cabal feature in this project? My intuitive answer is No.

Yes, it is not going to happen for a while, so don't block on it. But on the other hand, I don't think it is an insurmountably complex project (ignoring Custom builds, for the moment.)

I might make an attempt at this over Christmas break. One big question I have, though, is how to setup any such build system so that it can be integrated with Hadrian. Hadrian has its own DSL going on for assembling command line calls and I wonder if that should be factored into a library that cabal-shake could use.

One big question I have, though, is how to setup any such build system so that it can be integrated with Hadrian. Hadrian has its own DSL going on for assembling command line calls and I wonder if that should be factored into a library that cabal-shake could use.

@ezyang Ah, yes! We do want to factor out the generic DSL part from Hadrian into a separate library. I didn't prioritise this but I can look into it soon (December-ish) if you would like to use it.

It's not that urgent on my end; what I will probably do if I actually attack this problem is to just copy paste some of the low level combinators into my implementation, with the idea of dropping it for the library later. ;)

This idea came up again in #395.

I have confirmed with @dcoutts that lib:Cabal indeed assumes that the cwd is the root of the package being built. It doesn't sound like this is going to change in the near future.

Done in #531 🎉