haskell/hie-bios

Proposal: better multi component support (including `show-build-info`)

pepeiborra opened this issue · 12 comments

The ghcide README says:

There is a known issue where if you have three components, such that A depends on B which depends on C then if you load A and C into the session but not B then under certain situations you can get strange errors about a type coming from two different places. See this repo for a simple reproduction of the bug.

This is because of the way multicomponent support works in ghcide. Loading A and C creates a fake package AC which results from the union of the flags and targets in A and C. So AC depends on B because A depends on B, and transitively on C because B depends on C. As a result the types in C are loaded twice, once for AC and once for B, and even if they have the same names they are not the same types, which leads to the weird errors.

To fix this, we need to load B in the session, obtaining ABC. The user can do this by opening any B file in their editor, but ideally this would be done automatically by ghcide. Unfortunately, ghcide has no way of knowing about this!

My proposal is to extend getComponentOptions to accept the set of components already loaded and produce a set of component options:

getCompilerOptions :: FilePath -> [LoadedComponent] -> Cradle a -> IO (CradleLoadResult [ComponentOptions])

This would change the example in the ghcide README as follows:

  1. The user opens A.hs from A in the editor,
  2. ghcide asks the cradle about A.hs letting it know that the current set of components is []
  3. hie-bios returns the component options for the set [A]
  4. ghcide creates and loads the component A
  5. The user opens C.hs from C in the editor
  6. ghcide asks the cradle about C.hs letting it know that the current set of components is [A]
  7. hie-bios returns the component options for the set [B,C]
  8. ghcide creates and loads the component ABC

To implement the new getCompilerOptions:

  • the Stack and Cabal cradles will need to interrogate the build system to find out component dependencies
  • multi cradles will need to interrogate all the single cradles, which will probably be ridiculously expensive
  • Bios cradles will need to extend the protocol with the program to pass the information of the loaded components

This proposal is very much up for discussion, and note that I have not prototyped any of this so I have no idea of whether it will work.

@OliverMadine I think this issue summarises the idea succinctly.

Changing hie-bios' API is relatively trivial, no real deal here.

The main challenge is to extract the required information from cabal and stack. The information we need is basically, given a source file, we need to know how we can compile the given file (ideally the whole component the file belongs to). For multi-component support, we need the information for all components at the same time, e.g. what are the components and what GHC options do we need to compile them.

For cabal, it was envisioned to have a command show-build-info that can give us all the required build information. The original PR has been partially merged into Cabal (the library) and for our use-case, we would need an interface for cabal-install (the executable).
I am working on rebasing PR #6241 here: https://github.com/fendor/cabal/tree/cabal-sbi-expirement, but it is currently buggy.
It would be great to get the PR into shape again and get it merged.

For stack, we need a similar interface. There were several discussions but afaict there exists no real prior work (cc @jneira whether this is correct).

Since implementing proper commands into stack and cabal is a lot of work, there are additional stepping stones (temporary workarounds):
Instead of commands, use the available information from hie.yaml to extract component information. E.g. if you have a multi-cradle we could just assume it lists all available components. The given information should be sufficient to obtain for each component the compilation options.
With tools such as implicit-hie and cabal-hie, we can generate multi-component hie.yaml files in HLS directly.

For stack, there is also the sub-command stack ide targets, which lists all available targets for stack projects. When we have all targets, we can query for each target/component the options. Unfortunately, there is nothing comparable for cabal.


Generally, I am rather opposed to any temporary hack (as hie-bios is already a huge hack) and would prefer proper solutions in the build-tools, e.g. commands for cabal and stack. However, since we don't have a solution for years now, (and a rename feature in HLS would be awesome) I don't hold too tightly onto this ideal.

Feel free to ping me here or on IRC about unclear statements and details.

cc @wz1000 did I forget something?

@fendor
Thanks for the clear write up, I don't have any questions so far.
Sorry for the late reply.

@chrisdone

I think it is better to discuss the json format here than in haskell/haskell-language-server#1.

Currently, the build-info per component produced by cabal looks like this:

{
  "type": "lib" | "exe" | "bench" | "test" | "flib",
  "name": "String",
  "unit-id": "String",
  "compiler-args": ["String"],
  "modules": ["String"],
  "src-files": ["String"],
  "hs-src-dirs": ["String"],
  "src-dir": "String",
  "cabal-file": "String"
}

That is enough for compiling a component.
Additionally, it would be nice if we had a way to query for common metadata such as project ghc version, etc...

Currently, we have:

{
  "cabal-version": "String", /* lib:Cabal version */
  "compiler": {
    "flavour": "ghc" | "ghcjs",
    "compiler-id": "String", /* normalized, ghc-8.10.2 means ghc with version 8.10.2 */
    "path": "String"
  },
  "project-root": "String",
  "components": ["ComponentInfo"]
}

My question to you would be: do you think this file format makes sense or is anything missing? E.g. it might make sense to have component dependencies, such as package.yaml or stack.yaml.

No such library exists yet.

@fendor that should be a lightweight library, do you think it could be created and used in the actual cabal pr, to being able to start make progress in the stack side or it would be better to have cabal support first?

Since s-b-i resides in lib:Cabal, I don't think we can easily add another dependency, and exe:cabal does not really care about the format.

However, there are decoders in the cabal testsuite, I'll factor them out in another library and link the repository here again.

nice, i think such a library could help to consumers of cabal and stack s-b-i output too so its existence worths

Factored out the json encoders/decoders: https://github.com/fendor/cabal-build-info/

Additionally, added some type safety. Interface up for discussion!

Awesome, thanks @fendor. The types with the JSON interface look great. So I can update my stack PR to depend on this library and generate these types to stdout.

I’ll have to check more closely when I’m back at computer.

As for a temporary solution to this issue, I'm not sure that I can do anything better than just loading all of the components listed in the hie.yaml.

This seems relatively simple to do by changing some of the code involving 'selecting a cradle' from the multi-cradle.

I have this working, but it is slow for big projects, so it may be best that we wait for a full solution?

cc @fendor, @pepeiborra

I think this will always be slow, since cabal build is very slow. In my opinion totally okay if it is a bit slow, since we hope that follow-up startup are quicker.