Add config flag to control generation of .ghc.environment files
hvr opened this issue · 124 comments
Some people are excited about .ghc.environment files, while others are mildly annoyed. I think there's probable cause to make this configurable on cabal's end.
To this end, I suggest (modulo bikeshed) to implement a flag, e.g. --pkg-environment-scope=LEVEL, where LEVEL can be one of
1. all (default & current behaviour): generate ghc env files with all transitive dependencies of project's (non-qualified) goals
2. build-target (e.g. :pkg:Cabal or lib:Cabal or Cabal:test:parser-tests): generate ghc env files containing only the stated goal's /direct/ dependencies
3. - (or none or off or disable?): disable generation of any ghc env files
This flag would also be persistable via cabal.project.(local).
Note: Item 2. is not the short-term goal of this feature request! I've included the 2. item mostly to motivate why it makes sense to design this to be a flag that's more than merely a boolean --{disable,enable}-package-environment-files flag.
Btw, a dual feature-request (allowing to opt-out from interpreting .ghc.environment files from GHC's side) has been filed at GHC #13753 -- however, that one will be too late for GHC 8.0.2 users.
Current status:
- Starting with GHC 8.4.3, ghc is more verbose when picking up pkg envs (see http://git.haskell.org/ghc.git/commit/00049e2dce93b1e468c3fde3287371eb988aafdc)
- Starting with GHC 8.4.4, ghc supports the special "-" environment, which inhibits the default lookup logic (see http://git.haskell.org/ghc.git/commit/8f3c149d94814e4f278b08c562f06fc257eb3c43)
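For readers on GHC 8.4.4 or later, the opt-out described in the status list can be wrapped in a small shell function. This is only a sketch: the name ghc_noenv is hypothetical, and it assumes the "-" environment is selected through GHC's -package-env flag.

```shell
# Hypothetical helper: run ghc while skipping the implicit
# .ghc.environment.* lookup.
# Assumption: GHC >= 8.4.4, where "-package-env -" selects the special
# "-" environment.
ghc_noenv() {
    ghc -package-env - "$@"
}
```

It would be called like plain ghc, e.g. ghc_noenv --make Main.hs, with every argument forwarded unchanged.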
/cc @RyanGlScott
So, I've recently been working in the Python/Anaconda world, where they have scripts like source activate myenv which kick you into an alternate shell where, when you call python, you get the environment (interpreter, packages, etc) corresponding to the particular environment you are working on. Letting this setting be done on a per shell basis seems better to me than tying it to a file path location? (Yes, I know this is totally asking GHC environments to be redone, but it seems to me like it would solve the root cause of the problem you're describing above, as well as other problems.)
Tbh, I have been working with Python's virtualenvs for quite some time, and they were somewhat unsatisfying to me. I prefer the git-style approach which has been copied by many tools which are $CWD sensitive. I don't want to have to remember and having to explicitly perform redundant keystrokes to enter an environment other than simply cding into a project, and have my tool follow the DWIM principle. If we require to explicitly enter environments, we still have cabal exec or cabal repl; but that's a different paradigm (which has its uses) from what GHC environment files provide. Since GHC environments started working in cabal head, I've reduced my use of cabal new-repl to quite a lot less.
This ticket is primarily to help those who don't subscribe to that paradigm, and want to opt-out.
I just spent a significant chunk of time trying to get a project to build and run tests, using stack. I build the project using both cabal master and stack, to check that it works for both.
Having deleted the dist* directories and built with stack, the tests failed, complaining about a missing package db in ./dist-newstyle.
After re-installing everything haskell related, I eventually tracked it down to this file being silently generated, and then being picked up by other tools.
At this stage I would prefer to opt in for this generation, or at least get something in the output to say that the file has been generated, so that I can know that a potential landmine has been planted.
This is bound to trip up others as well, as the existence and use of this file is not well documented.
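One way to find out whether such a file is in play is to look for it from the shell. The function below is a minimal sketch, not an existing tool: the name find_ghc_env is hypothetical, and it assumes that walking from the current directory up through its parents approximates GHC's own lookup (a later comment in this thread notes that GHC searches parent directories).

```shell
# Hypothetical helper: print the first .ghc.environment.* file found in the
# given directory (default: current directory) or in any of its parents.
# Assumption: this upward walk approximates GHC's implicit lookup.
find_ghc_env() {
    local dir f
    dir="$(cd "${1:-.}" && pwd)" || return 2
    while :; do
        for f in "$dir"/.ghc.environment.*; do
            # An unmatched glob stays literal, so test that the file exists.
            if [ -f "$f" ]; then
                printf '%s\n' "$f"
                return 0
            fi
        done
        [ "$dir" = "/" ] && return 1
        dir="$(dirname "$dir")"
    done
}
```

Running find_ghc_env inside a project prints the path of the file that would be picked up, or returns a non-zero status if none is found.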
I agree that this state of affairs is not great. @hvr, can we perhaps change this setting to be opt-in?
Can we get stack to turn off parsing of .ghc.environment files with -hide-all-packages?
/cc @mgsloan
fwiw nix also breaks in some cases due to .ghc.environment.* files so I don't think that a workaround in stack is the right solution here.
As per https://www.reddit.com/r/haskell/comments/8iyvoo/psa_for_cabal_22_new_users_regarding/ I think that the weight of opinion is that these files should not be auto-generated, but only by opt-in.
If we add this flag as a global flag that can be in ~/.cabal/config and off by default then that would help. Then if people wanted such a config, they could run cabal new-configure --package-environment-scope=ALL.
I think that would improve the situation for a whole bunch of workflows...
@gbaz quite honestly, I'm strongly against defaulting it off, as then nobody would even use what I consider one of the biggest improvements over the old flawed user pkg-db model, since almost nobody would know about it ;-(
Before changing anything, I'd rather want to know exactly what workflows we're talking about that people are having problems with, as I suspect the issues are of a different nature...
Fwiw, there's already patches up at
to address some of the concerns with the UI
Either way, we should have the flag so it can be changed. Lots of people don't want it on, and at the minimum they should have the option. We can sort out where the actual weight of opinion lies regarding defaults in a broader discussion.
To expand: my personal belief is that it is currently unexpected behavior for any new-* command to produce any changes to anything outside of dist-newstyle and the store (except for install, which obviously installs), which is what this does. So it violates least-surprise to run e.g. new-test and then suddenly find your invocation of ghci doesn't do what you expected.
Moreover, this affects more than just ghci. I often invoke runghc ../haskell-ci/make_travis_2.hs from within a project to generate a .travis.yml file, but have it break due to some .ghc.environment file polluting my package database with different packages than the ones that haskell-ci expects. I end up having to explicitly navigate to another directory to avoid .ghc.environment files from breaking the script.
Ironically, despite the stated goal of having GHC commands "just work" regardless of the current working directory, in practice they have the exact opposite effect. I have to be hyper-aware of which directory I'm in before I invoke runghc now, since picking the wrong directory can ruin any Haskell script I may wish to run.
The opt-out flag in ghc implemented in one of the PRs linked above solves that use case. My main concern with having only that flag is the case where some command invokes ghc for you, which makes it a pain to thread the flag through.
@gbaz First off, I consider the current behaviour the intuitive one; if I'm in a project, I expect all invocations to be operating in that context and not some other random pkg environment which has nothing to do with the project I'm in; so I'm definitely not confused at all that ghci throws you in the current CWD -- I'd rather be annoyed if this wasn't the case, and always had to remember doing some magical invocation in order to place myself into that context; and fwiw, that's also how other tools like Git operate; For tools intended to be used by humans on a shell interactively, this is a very common and established UI idiom.
Otoh, programs which invoke ghc for you will have to stop making any assumption about the implicit package environment; same goes for programs which invoke git for you. And I haven't seen anybody making any drama out of tools like Git behaving this way.
Ryan brings up the issue of runghc; Personally I've moved to using runghc in the same style I use ghci and ghc -- expecting it to be CWD sensitive. runghc-based scripting is unfortunately a weakly defined interface which operates in the old paradigm of having a stateful global+user package db in scope, which was rather fragile to use (and which new-build was designed to get rid of). It had a lot of issues already with old-build (its integration with cabal sandboxes was also extremely sub-optimal; and which package environments elegantly address), and for new-build we may want to redesign how scripting is done, c.f. #3843 and/or find other ways to integrate runghc with profiles. And as for haskell-ci; that script is about to drop its original goal of being a project-less single-module runghc-friendly script, as it's becoming too cumbersome to maintain such a huge monolithic module that way (besides that shebang incompatibility on some platforms...) and the need arises to start using libraries which are not GHC boot libs. So for that you'll have to switch instead to something like having a wrapper script make-travis-yml in your $PATH like I have e.g.:
#!/bin/bash
# Run the haskell-ci executable named after this script, forwarding all arguments.
exec cabal --project-file="${HOME}/Haskell/projs/haskell-ci/cabal.project" new-run "exe:$(basename "$0")" -- "$@"
@hvr I've personally experienced many cases of users coming to me asking why their project doesn't build, only to find it's because of these files that they didn't notice because their names begin with a dot. This even broke brittany for me, a code formatter that doesn't even invoke GHC as a process or require any packaging information, which took me a nontrivial amount of time to figure out. I think people's expectation of GHC is that changes to its behavior are not to be made implicitly. This is the expectation for virtually all compilers, i.e. C compilers require every bit of configuration to be done on the command line. Every time I want to deal with package databases, I expect cabal to be involved.
Furthermore, this isn't even a particularly effective solution to the problem it's supposedly fixing, because these files can and will go stale. You have to run a cabal command anyway to update it, and who would choose to have to keep track of that in their head?
It's safe to say that in my experience, this is extremely unintuitive, and that seems to be the consensus in this thread. Having this be opt-out would still leave me with virtually just as many confused newcomers asking for help.
> require every bit of configuration to be done on the command line
> ...
> Every time I want to deal with package databases, I expect cabal to be involved.
I honestly doubt you even understand how utterly unusable ghc would be on the shell if we did that (and also how much GHC already doesn't meet your unrealistic ideal), and how frustratingly annoying new-build would be for those that want to use ghc/ghci w/o having to remember to setup env-vars, passing tedious cli flags, or invoking cabal. This would be even worse than the inconsistent situation we had before new-build. We're still in phase of paradigm shift, moving away from the old-build paradigm, into the new-build paradigm. This requires some new approaches, and we're still figuring out how to optimise the UI. But you're effectively suggesting throwing the baby out with the bathwater before it even had a chance, rather than trying to improve the UI to provide a next-gen seamlessly integrated and convenient shell experience for ghc. I'm dogfooding this feature as much as I can, and I'm quite happy with it; and now the time has come that we need more constructive feedback from more users. Therefore I'm more interested to hear how we can explore and improve this new paradigm, while new-build is still prefixed by new-* and can more easily justify radical UI changes.
I don't think it's considered a high priority among most people that they be able to invoke ghci rather than cabal new-repl. It doesn't seem like a particularly important ability to me.
Obviously GHC isn't only configurable on the CLI. I understand that it wouldn't be realistic to uphold this standard 100% (in fact it's quite annoying that you can't configure more of the C compiler outside the CLI). But I think it's an important principle to keep in mind, and this seems like a particularly egregious deviation.
All in all, I'm willing to accept that others do consider this valuable, so you'll get no uproar from me if this goes unchanged. But it would be my preference that cabal didn't generate these files implicitly.
If we don't generate them implicitly, perhaps it could be a cabal subcommand to generate them? Users very often look at help text, so having a subcommand makes them very likely to be aware of the existence of this feature.
To be fair, there is an important semantic difference between invoking cabal new-repl and invoking ghci in the presence of a .ghc.environment file. The former will always rebuild the entirety of the current project in interpreted mode, whereas the latter will simply pop you into a shell with the project already built.
I want to make it clear that I find both features extremely useful, and I'm not advocating that either one of these be removed. On the contrary, I would like to be more explicit about which ones I'm using at a given time. Currently, the latter is quite easy to use by accident, which can have unintended consequences. Luckily, some progress is being made towards making this less likely, which is nice.
As I've stated (and others have stated) before, my preference would be to go from opt-out to opt-in. But that suggestion seems to have fallen on deaf ears, so I'm not going to pursue this much further.
I too would appreciate a dedicated subcommand that simply generates a .ghc.environment file and nothing else. AFAICT, the only way to do so at the moment is through a side effect of a different cabal new- command, which seems suboptimal to me.
> appreciate a dedicated subcommand that simply generates a .ghc.environment file and nothing else
In order for a project's generated .ghc.env file to be consistent, you need to have performed a cabal new-build operation; it makes no sense to generate a .ghc.env file w/o having constructed a build-plan and having populated the referenced libraries into the nix-style stores; furthermore, in order to keep it consistent, we need to hook into every action which may result in the build-plan being changed, in order to have it regenerated. So having merely a one-off way to generate these environments for a project would be very bad UI-wise, as you'd run very easily into inconsistent states; once you go package environments, you need to keep doing it.
OK. In that case, it would be nice if the cabal new-build --help output mentioned that it does this (or better yet, provided a flag which toggles the generation of .ghc.environment files on/off). Currently, this information is not very discoverable (except by accident when things break).
How do we reach consensus from here? It seems like we've only got one testimonial in favor of opt-out, vs several in favor of opt-in.
> I honestly doubt you even understand how utterly unusable ghc would be on the shell if we did that, and how frustratingly annoying new-build would be for those that want to use ghc/ghci w/o having to remember to setup env-vars or invoking cabal. This would be even worse than the situation we had before new-build.
Indeed it is clearly possible that I too don't understand this. And I may misunderstand what either of you meant by "configurability on the commandline/of the CLI". Could we just spell things out? What input does GHC(i)'s behaviour depend on?
From the top of my head I can think of
1. commandline args,
2. environment bindings such as PATH, GHC_PACKAGE_PATH, HOME
3. INSTALLPATH/lib/ghc-x.x.x/settings
4. the installation ("global") package db
5. .ghci files in cwd and in $HOME
6. .ghc.environment.* files
Is this list complete? Does a complete version exist (in the ghc user-guide)? I just searched for "ghc.environment" (and also just "environment") and could not find anything, yet this seems highly relevant for general reproducibility.
But even if the environment files are just an addition to some long list, it appears to me like most other inputs are either
- explicit (1)
- very common and expectable (2, excluding perhaps GHC_PACKAGE_PATH)
- mostly constant/an implementation detail (3, 4)
- explicitly created by the user (5)
and in general I'd say 1-5 are all either explicit or change only when the user (knowingly) fiddles with stuff. An implicit, frequently changing, new input is rather different imo.
> Furthermore, this isn't even a particularly effective solution to the problem it's supposedly fixing, because these files can and will go stale. You have to run a cabal command anyway to update it, and who would choose to have to keep track of that in their head?
> To be fair, there is an important semantic difference between invoking cabal new-repl and invoking ghci in the presence of a .ghc.environment file. The former will always rebuild the entirety of the current project in interpreted mode, whereas the latter will simply pop you into a shell with the project already built.
Granted both ghci behaviours are useful, this is hardly a reason to not just have two versions of cabal (new-)repl (or a flag or whatever). This path would avoid implicits, could properly handle the staleness case, at the "cost" of requiring the user to invoke cabal for a task that in my view seems to fall under cabal responsibility, not GHC. So I don't see the downside here, apart from the annoying aspect of having to re-think the UI around this.
(GHC is not supposed to know about projects in any way, and I would not expect it to be "clever" in this direction. cabal also (indirectly) invokes gcc, but I would never expect to be able to call gcc and automagically get any haskell-specific behaviour.)
> How do we reach consensus from here? It seems like we've only got one testimonial in favor of opt-out, vs several in favor of opt-in.
This discussion is very clarifying about the different attitudes involved. I think the basic question is not about cabal per-se but rather how ghc should interact with in-path "config" files of various sorts (currently mainly environment files, but also ghci files, and in the future potentially others). Some people expect ghc to behave the same no matter what directory it is run in, and some people expect it to operate in a context-sensitive fashion.
Ideally we could have a solution that satisfies everybody, and ultimately I think that solution is about ghc primarily, and only secondarily about cabal. It is really just cabal auto-generating these env files that highlights the different expectations about ghc user experience.
My suggestion would be a plan and discussion regarding ghc, perhaps ultimately resulting in a ghc proposal, to clarify the principles of how the compiler interacts with various configuration files. Which is to say, if people who don't like ghc picking up env files could configure ghc to not do that, then the question of cabal generating them or not would not arise.
I have a draft proposal about dot-ghc-files (https://github.com/gbaz/ghc-proposals/blob/patch-2/proposals/0000-dot-ghc-file.rst) that I did not yet submit, in part because I hadn't worked out the interactions or relationship with env files. I can try to work up a different version that takes this discussion into account, and we could proceed from there...
> This discussion is very clarifying about the different attitudes involved. I think the basic question is not about cabal per-se but rather how ghc should interact with in-path "config" files of various sorts (currently mainly environment files, but also ghci files, and in the future potentially others). Some people expect ghc to behave the same no matter what directory it is run in, and some people expect it to operate in a context-sensitive fashion.
This has also been the gist I've taken from this, though with a slightly different conclusion.
I almost always prefer non-magic over magic, and the environment files fall clearly into the magic category. @hvr's GHC patch to make it more explicit that they are used in ghci, improves the situation, but it only addresses the symptoms, not the magic.
I'd much rather see
$ cabal ghci # ghci with environment files
$ cabal ghc # ghc with environment files
$ ghci # stock ghci, no env files
$ ghc # stock ghc, no env files.
This would (at least to me) read like:
$ <context> <cmd>
If I want to run ghci in the cabal context, I'd run cabal ghci.
On the implementation side, I'd suggest that ghc learns a flag -use-env-file and
cabal ghci would just invoke ghci -use-env-file.
There already is a -use-env-file flag, named --package-env. Cf the docs.
Having cabal and other build-tools pass this flag, and ghc never auto-pick up anything seems a valid bikeshed color. But so too does a global ghc option as to whether ghc should auto-pick things up, set in a potentially minimal global ghc config. That way everyone can choose their preferred workflow, instead of needing to slug it out over the one "true way" :-)
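The "build tool passes the environment explicitly" option can be sketched as a shell wrapper. This is a hypothetical illustration, not how cabal behaves: the function name ghci_in_env is invented, the environment file path is supplied by the caller, and the flag is written as -package-env FILE on the assumption that this single-dash spelling is the one GHC accepts.

```shell
# Hypothetical wrapper in the spirit of the proposed "cabal ghci": hand the
# environment file to ghci explicitly instead of relying on implicit lookup.
# Assumption: GHC accepts "-package-env FILE".
ghci_in_env() {
    if [ "$#" -lt 1 ]; then
        echo "usage: ghci_in_env ENV_FILE [GHCI_ARGS...]" >&2
        return 2
    fi
    local env_file="$1"
    shift
    if [ ! -f "$env_file" ]; then
        echo "ghci_in_env: no environment file at $env_file" >&2
        return 1
    fi
    # Remaining arguments are forwarded to ghci untouched.
    ghci -package-env "$env_file" "$@"
}
```

With a wrapper like this, a bare ghci would stay "stock" as long as implicit lookup is disabled, which is the split between the two workflows discussed here.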
@gbaz I'm not opposed to the idea that this is a GHC problem, and that it be configurable in a global GHC config. But FWIW, I do consider it faulty to autogenerate files that are guaranteed to be made stale once the .cabal file is changed. As is, if you want to make a change in the cabal file, you have to remember to re-run cabal new-build before you can rely on the env file. That's a pretty large surface area of new user-error cases. I suppose that's really a different issue, but the solution to it (@angerman's proposal) solves both that problem and the original problem of confusion caused by unknown autogenerated files. It seems far more elegant that you need to invoke cabal to reach your environment. And as long as cabal-install isn't autogenerating these files, I'm not personally bothered either way by whether GHC picks them up automatically.
Regardless, whatever the implementation detail, is it at least agreed that the consensus (w.r.t. unexpected behavior) is on some form of configurability, favoring opt-in rather than opt-out?
@ElvishJerricco but all files cabal generates are stale once a file is changed. Everything in dist-newstyle too for example :-)
It seems to me that one proposed path forward would be to A) create cabal ghc and cabal ghci commands that always explicitly pass local config files, B) always generate local config files, C) change ghc to be configurable on if it picks up local config files or not, with either default being fine by me.
One tweak could be to put generated local config files into dist-newstyle instead, since that 1) maintains the idea that unless you install stuff, only dist-newstyle and the store are affected by cabal commands and 2) means that ghc won't see the generated files in its path automatically, since they won't be in the path, typically :-)
Along with the "default flow" question is herbert's suggestion that unless we make this feature the default path, people won't use it, which would be unfortunate. I.e. he is arguing that we get this important feature more visibility by making it the default approach. I'm less sure of that -- in particular, if it is useful enough, I think that people will seek it out. And in general as we roll out new cool features we'll want good documentation and also blog posts explaining and summarizing how to do things. I don't see a way around user education in general.
However, I disagree with @ElvishJerricco on the idea that we can say what a "consensus" is at this point. The representative sample of opinion in this thread is very small, and necessarily biased towards those who have taken an interest in this issue because it has affected their particular workflow. (And this thread is composed of power-users with particular and idiosyncratic workflows, honestly speaking). So a straw poll here really can't help us much with thinking about what the average user mainly will want, imho. My take with kicking this to a ghc rather than cabal configuration question is first that it really is that, and second, we have a general process for trying to solicit wider feedback in such a case, a ghc steering-committee to kick thorny issues to, etc. So process-wise it seems a better setting to resolve this as well.
> @ElvishJerricco but all files cabal generates are stale once a file is changed. Everything in dist-newstyle too for example :-)
I don't know about that. Is a file really stale if it's guaranteed to be updated the next time you try to use it? For the most part, there's not a lot of reason to use anything inside dist-newstyle unless it's via the cabal command that will update it. The env files, however, are not updated by the tool that uses it.
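The staleness concern can be made concrete with a rough check from the shell. This is a sketch under stated assumptions: env_is_stale is a hypothetical name, and comparing modification times is only a heuristic, since other inputs (cabal.project, the package index) can also change the build plan. It relies on the -nt file test, which bash and most sh implementations provide.

```shell
# Hypothetical check: succeed (exit 0) when the environment file is missing
# or older than the .cabal file it was derived from.
# Assumption: modification times are a good-enough proxy for staleness.
env_is_stale() {
    local env_file="$1" cabal_file="$2"
    [ -f "$env_file" ] || return 0      # a missing file counts as stale
    [ "$cabal_file" -nt "$env_file" ]   # stale if the .cabal file is newer
}
```

A caller could use it as env_is_stale ENV_FILE CABAL_FILE && cabal new-build to refresh the file before invoking ghci directly.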
> My take with kicking this to a ghc rather than cabal configuration question is first that it really is that, and second, we have a general process for trying to solicit wider feedback in such a case, a ghc steering-committee to kick thorny issues to, etc. So process-wise it seems a better setting to resolve this as well.
The problem with that approach is that in the best case we will get this for GHC 8.6 (and it's certainly possible that the deadline for that will be missed as well) while GHC picks up environment files starting from 8.0.
I also don't think that a global GHC option really solves this: Let's say you are in favor of environment files in general, then there is still a good reason to want to stop a build tool from generating them, namely the case where you switch between different build tools and want to prevent one of them from generating environment files. This is certainly not a particularly common use case but it's one that I use quite heavily and ideally one that would be supported.
Regardless of what we take away in terms of a consensus on a default (fwiw I'd be in favor of opt-in), I think this thread shows pretty clearly that different people prefer different workflows and it seems like an option in cabal-install to control the generation of env files could enable these workflows relatively quickly while we can still debate the default and whether there also should be a way to control this via a GHC setting. So I think it would be a shame if the (afaik uncontroversial) feature of adding an option to cabal-install would be blocked by more wide-reaching and controversial discussions.
@cocreature I don't think there's any debate that it should at least be configurable in cabal-install. The two remaining questions seem to be:
- Should the use of env files be opt in or opt out, regardless of how that is achieved? (Almost everyone is saying opt in.)
- If opt in, then how do we make it opt in? Via making the cabal-install config (which will exist either way) opt in? Or via making the GHC feature opt in?
The config flag for opting out of this should be easy to implement, patches welcome.
Note: There may not even be ghci, runghc and other ...ghc... commands in $PATH.
For instance, I installed cabal-install out of stack, and it is no secret that stack does not expose any single version of ghc, but manages them internally. cabal commands somehow manage to live with this, but I can only possibly run cabal ghci or cabal new-repl, never ghci.
Even if we disregard the existence of stack, once there is one way to obtain such an unusual installation, I assume there may be other ways as well.
Therefore, we should avoid informing the design of cabal-install by the assumption that ...ghc... commands are directly available to the human operator.
How about this:
- new-build, new-repl, etc. stop generating package environment files as a side-effect of their primary behaviour. This is an improvement on the status quo because it's impossible to be caught out by the unexpected generation of a package environment file by an operation that is not obviously connected to package environment files.
- A new command, new-environment, is added to generate a package environment file corresponding to the environment from the last new-style command run. This is an improvement on the status quo because sometimes you do just need to generate one of these files without re-building anything; further, new-build re-runs the solver and hence the previous environment could be unrecoverable if the only way to generate it is via new-build.
I think this would be a strict improvement: when you need to inspect things using the package environment at the command line, you are able to do so reliably; when you don't need to, you are in no danger of being caught out. This is like the 'opt-in' idea from above, but I think it's potentially important to be able to get the package environment from the previous command reliably, which the current situation achieves.
@quasicomputational as far as I'm concerned it would not be an improvement at all for my workflow. I'm pretty happy that new-build generates a ghc env file as a side-effect. If it didn't, it would in fact be a strict regression for me, as I'd have to keep remembering to invoke a weird new-environment command... which would be even more inconvenient than using cabal new-repl and would also risk getting stale/invalid every time new-build or new-repl is invoked. Thus, not really a good option IMO.
@kindaro I'm not convinced that your argument is sound to be honest, as its premise is IMO very artificial and doesn't match the common ways cabal-install is used (and yes, we should definitely disregard the existence of stack for the purpose of this discussion as it bears no relevance to it and would only risk to become a useless distraction): cabal in its default config is designed to pick up any executable called ghc that happens to be in $PATH. If there isn't you have to supply one yourself via the --with-compiler option. But that's not too important, as the main point here is that cabal UI is heavily designed around the idea that ghc et al are directly available to the "operator" and cabal autoconfigures itself based on the discovered environment that's available to the operator; most importantly, cabal does not manage compiler installation -- it's the other way around. This happens to be one of the things that I've always liked about cabal (and dislike strongly of tools which force you to always go via them to manage & access the compiler thus violating concern separation).
For me, the current behavior (generating the .ghc.environment.* files by default) is preferable. For context, I develop applications and libraries. The generation of .ghc.environment.* files makes the ghc and ghci commands useful. Without these environment files, these two commands are worthless to me. The only contexts in which I've ever seen them be useful are:
- I'm working in a nix shell that somehow rigs everything up so that ghc/ghci can see all the deps.
- I'm working on a library that has no deps other than base (this is rare).
Environment files make them usable with nearly everything.
I agree with @andrewthad; I find environment files to be extremely useful and it would be a shame to lose the convenience of having them available as a side-effect of a build.
@andrewthad @bgamari What if you could just do cabal ghc or cabal ghci, as @angerman suggests? (EDIT: In fact, it seems cabal new-exec ghc does the trick already) The difference between invoking GHC directly and invoking it through a cabal sub command seems entirely negligible to me.
In fact, I would expect to have to use cabal to get things managed by cabal. To me, the fact that bare GHC commands don't provide bare behavior drastically violates the principle of least surprise. The number of people who have wasted non-negligible amounts of time tracking down issues caused by this is an indicator of the surprise factor. I'll again cite my brittany example; simply running brittany with one of these files present in any parent directory is likely to fail. Perhaps this wouldn't bother me so much if GHC didn't search parent directories...
That said, I'd be happy if there were a way to turn this behavior off (at least manually) in ~/.cabal/config.
> In fact, I would expect to have to use cabal to get things managed by cabal. To me, the fact that bare GHC commands don't provide bare behavior drastically violates the principle of least surprise.
I find this argument convincing. What if by default the way to run ghc or ghci in current project's context would be via cabal exec [ghc,ghci] and then there was a setting in ~/.cabal/config to enable implicit environment file generation?
@ElvishJerricco It's not just an issue of typing cabal ghci vs ghci. How are flags and arguments supposed to be passed to the cabal ghci you propose? Do I need to put a -- in front of everything or do I have a --ghc-arguments='...' with some escaping rules inside the quotes? Currently, everything that wraps ghc in this way has additional rules for passing flags and arguments that have to be remembered. I've always encountered confusion using stack or cabal to exec ghc. Having one command and one source of documentation where all the flags are in one place is less confusing.
My preference is for cabal to continue generating ghc environment files as it has been and to have ghc/ghci continue to pick them up by default. But, I certainly agree that having more configuration options (at the global level) and more warning messages is going to be helpful no matter which way the defaults go.
Placing -- in front of arguments is pretty standard, not just among haskell tooling. It would be very unsurprising.
How are flags and arguments supposed to be passed to the cabal ghci you propose?
I think that if we went that route it'd be something like cabal exec ghci -- -foo -bar --baz because of the current limitations of the command-line parser.
@23Skidoo FWIW, exactly what you just typed works today (EDIT: clarification: even after you delete the .ghc.environment file). I would propose having direct subcommands to avoid having to type exec.
Not 100% convinced that saving a few keystrokes is reason enough to add new top-level commands, especially if it's possible to enable implicit environments. I'm +1 on adding support for command aliases though, but that needs a rewrite of the command-line parser.
I would expect to have to use cabal to get things managed by cabal
In that case, you're basically saying that ghc and ghci shall only ever be able to access things in your global package db only. That's not very practical.
Or whatever I explicitly ask it to access, yea. What's not practical about that?
Because this is about ergonomics: When you're in a shell and have to type stuff interactively, you don't want to hurt your fingers having to type long-winded invocations repeatedly and fill your terminal with redundant noise. You want DWIM optimised heuristics. And that includes the ability to manage the default environment that a bare invocation of ghci or ghc throw you into in a stateful manner; or to follow the CWD-sensitive paradigm that tooling like Git follows when you're inside a project folder. Otoh, the cases where you want to be explicit is usually when you're scripting because there you are typically in an editor and usually don't mind the extra keystrokes needed, as you write the script once but execute it multiple times and it amortizes. I consider being forced to use cabal exec ghci -- ... terrible ergonomics and bad UX.
It's not just about ergonomics. It's also about semantics. Though I agree that ergonomics are important, they almost always take a backseat to semantics IMO. The semantics here are not good IMO. Cabal is injecting itself as a dependency of GHC with this behavior; i.e. you must understand a cabal feature (that's not well documented, or at all obvious, I might add) to understand how to use GHC and why brittany just randomly crashed on you. Cabal is making choices for you about how to use GHC outside of cabal, which IMO means it's leaving its jurisdiction.
If the problem is just how many keys you must press to run GHC the way you like, I'd much rather fix the problem by providing cabal ghc shorthands, perhaps that don't need -- and eagerly pass remaining args to GHC. You still have to add cabal to the line, but that's much more minimal than these semantic decisions, IMO.
CWD sensitivity makes sense with Git because Git is literally a tool for managing your CWD's state. It makes sense for cabal because the metadata provided by the .cabal file is essential to its operation, and scoped to a particular directory, plus the .cabal file is not a hidden dot file and is an obvious center of truth that the user themselves maintains. It makes sense for a variety of tools for a variety of reasons, but GHC by and large isn't one of them.
I'd point out that GHC natively supports
- (legacy) user package db (which is $HOME sensitive) (becomes obsolete with cabal new-build)
- package environment files (which have a $HOME and a $CWD sensitive part) (which is the cabal new-build replacement for the user package db)
- .ghci files (which have a $HOME and $CWD sensitive component)
which of those GHC features would you consider legitimate features of GHC and which should be removed in your opinion (if any)? And also, who should be in charge of creating/modifying those files/folders?
I'd point out that I deliberately chose the phrasing "by and large," rather than "absolutely." Obviously I know there are even aspects of GHC that can be CWD sensitive (such as default output file locations). My point was much more about the semantics that we experience because of these implicitly generated and accepted environment files.
@hvr has a good point about implicit environments being a logical extension of the user package DB concept to the new-build model. This does make sense once you get used to it.
which of those GHC features would you consider legitimate features of GHC and which should be removed in your opinion (if any)? And also, who should be in charge of creating/modifying those files/folders?
I already gave a detailed analysis of this in my last comment. If you disagree with that analysis or I misunderstand your question, please clarify.
I'll just repeat it to make it extra clear: ghc.environment files are different from other sorts of implicit input because they introduce a new type of non-local state which changes frequently and implicitly (as a side-effect of an invocation whose obvious purpose is not to change ghc behaviour).
The comparison to .git falls flat, because I don't use any tools that non-verbosely, implicitly, as a side effect, change any of the .git implicit inputs. The way I use git, its (implicit) state changes only when I call git or when I manually muck around any relevant files. Or, for the case of CWD, when I use cd/pushd/popd whatever, all of which have a simple, clear purpose that is well documented and really hard to invoke unknowingly.
If you want to save a few keystrokes, please just use an alias that turns "cghc foo bar" into "cabal ghc -- foo bar" or something similar. This serves the same purpose without introducing surprising, hard to debug behaviour.
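The alias idea can be sketched as a pair of shell functions. This is only an illustration, assuming the cabal exec ghc -- ... form mentioned earlier in the thread (substitute new-exec where that is the project-aware command); the names cghc and cghci and the CABAL override variable are hypothetical and exist only to make the wrapper easy to try out.

```shell
# Hypothetical wrappers: forward every argument to GHC/GHCi through cabal,
# so the project's packages are visible without an implicit environment file.
# CABAL is an illustrative override; it defaults to the real `cabal` binary.
cghc() {
  "${CABAL:-cabal}" exec ghc -- "$@"
}

cghci() {
  "${CABAL:-cabal}" exec ghci -- "$@"
}
```

With these in a shell profile, cghc -O2 Main.hs expands to cabal exec ghc -- -O2 Main.hs, so no extra quoting rules need to be remembered.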
Otoh, the cases where you want to be explicit is usually when you're scripting because there you are typically in an editor and usually don't mind the extra keystrokes needed, as you write the script once but execute it multiple times and it amortizes.
But you can "script" your commandline experience just like everything else, so this whole argument/differentiation escapes me. I don't like surprises, regardless of context.
@23Skidoo Are logical extensions of supposedly legacy features considered valuable?
Merely saying that it's nothing new if you look at it this way.
I guess it's not new for people who are used to installing stuff in the user package DB. But I think avoiding the user package DB is common practice, and these environment files are imposing the same behavior on people who avoid it.
For what it's worth, I found this issue after running into mysterious nix-based build failures that turned out to be due to the .ghc.environment... files being auto-generated when I went into nix-shell --command 'cabal new-repl'.
For now good-old cabal repl and/or rming the file for nix is fine but it would be nice if this were configurable.
Sure, you can ask everyone to script their environment to have ghc and ghci be some magic scripts which do the DWIM behaviour -- but what does this achieve other than complicating everyone's life? You'd still end up with the commands ghc and ghci doing something special, which tooling that naively assumes a different behaviour would likely choke on just as well.
And don't forget, this is about making the default experience on the shell convenient which is traditionally not deterministic; everything you do on the shell is traditionally stateful and obviously $CWD & $PATH sensitive. So to me this is a totally expected behaviour and it is a pragmatic design choice in the interest of DWIM and convenience.
If this breaks tooling it's merely because said tooling didn't take precautions to invoke ghc and ghci in such a way that sets up the environment the way it needs it to be; if a tool naively calls ghc and ghci w/o specifying anything else to shield against configuration files leaking in, in the past it already would have picked up a lot of contextual state which may or may not be in conflict with the tool's assumptions about the environment.
Btw, we just taught cabal new-install how to manipulate the stateful default package environment of GHC (which was a showstopper for making new-build default, as this corresponds to the old user pkg-db in the nix-style paradigm which we were lacking) and this also allows generating custom $CWD-sensitive package environments. GHC supports these things, and cabal does too; in any case, such files can always be left behind, so it's the tooling's responsibility to become aware of these features and shield/protect itself against them if it requires a more pristine ghc environment to operate in.
To summarise, don't require other tooling to create some "safe space" to protect your tool, instead make your tool more resilient/robust against undesired influences by being explicit about what it requires/expects about the environment.
@hvr I feel like we're talking past each other here. Reading that response, it feels like I'm reading the same thing for the umpteenth time; i.e. it does not feel like you have actually addressed any of the arguments made against that position.
This issue really needs to be resolved. I haven't read every comment on this thread but the environment files have broken far more than they have helped me.
I don't see the issue with a cabal command to enter an environment which has the package database; this is how the whole python and nix ecosystem works. It is what people expect.
From my perspective, once you have this automatic magic configuration there is NO option for users to get back to explicit behaviour as many desire. However, given the explicit behaviour, users who want a more automatic experience can use tooling such as direnv in order to automatically enter the correct shells.
To summarise, don't require other tooling to create some "safe space" to protect your tool, instead make your tool more resilient/robust against undesired influences by being explicit about what it requires/expects about the environment.
I will point out that
- The addition of package environment files as input to the compiler behaviour in ghc-8.0 was not mentioned in the release notes[1], nor in the migration guide[2].
- The fact that package environment files have an influence on the semantics of the GHC API is not documented in the API (and similarly nothing was mentioned in the release notes nor the migration guide).
- The fact that package environment files affect the behaviour of GHC is to this date not mentioned in the man-page, nor in the --help output of ghc. It is "documented" in the users guide, if you happen to stumble upon the corresponding chapter[3]. Of course the users guide does not address the GHC API.
- The fact that environment files are produced, as a side-effect, by cabal is not mentioned in the cabal commandline help, nor its man-page, nor its user's guide, nor was the change announced in the changelog [4].
It is not my intention to call out the relevant maintainers for this; it is an understandable oversight.
However, hvr, I strongly suggest you ensure that the tools and APIs under your control are explicit about their semantics in their documentation, before making suggestions regarding resilience and robustness to others.
if a tool naively calls ghc and ghci w/o specifying anything else to shield against configuration files leaking in
Then, hvr, I expect an explanation as to how downstream tools are supposed to "be explicit" in a situation where one tool starts writing to a file and another starts changing its semantics based on that file. For all I can know, the next cabal version might start writing some new stuff into a ".we.love.implicit.inputs" file and the next GHC (API) version might start breaking randomly when it finds that file. There is no way to be proactively resilient against this. What does "being explicit" even translate to, here? Do you mean "rm -rf -- ./.*" to clear "environment files" in CWD? I really am at a loss what you could possibly mean.
And if we forget about the "proactively"/future-compatibility aspect of this, I still simply don't know what I am supposed to do in the current situation. How does one set up the environment appropriately? Should I delete the ".ghc.environment" file? Should I overwrite it in some fashion? Where is this documented? And how do you expect users to know this while it is not documented? And how do you expect users to keep track of which tools change which environment state in what way?
Every user of any interface naturally is "naive", because they don't know the thing as well as the authors of said interface. Indeed, that users (and by extension, the tools written by those users) are naive is a perfect reason for why not breaking expectations and implicit contracts is so important. So thanks for bringing up the phrase.
To summarize: Stop ignoring the users, don't make vague hints at potential solutions, and make sure that you are in a position to tell others what to do.
[1] https://downloads.haskell.org/~ghc/8.0.2/docs/html/users_guide/8.0.1-notes.html
[2] https://ghc.haskell.org/trac/ghc/wiki/Migration/8.0
[3] https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/packages.html#package-environments
[4] https://hackage.haskell.org/package/cabal-install-2.2.0.0/changelog
It is IMO much more unexpected that cabal-install commands alter the behaviour of ostensibly outside programs -- namely GHC and any tooling using it -- by default. @hvr It is more a problem of configuration files unexpectedly leaking out, than of them leaking in.
Just a data point: I found implicit environments convenient in practice, because they made it easier to add out of the box ghcid support to a project I work on. However, they also broke a helper script in the same project.
@23Skidoo Were you able to fix said helper script? What did the solution look like? Is there any related information that might help others running into similar problems?
I just added rm -f .ghc.environment.* to the beginning of the script. When we'll move to GHC 8.6, I'll change it to use -package-env -, as described in https://ghc.haskell.org/trac/ghc/ticket/13753. Alternatively, the new shebang feature in cabal 2.4 could also be used to solve this.
rm -f .ghc.environment.* is unfortunately insufficient in general, as there may be a package environment file in a parent directory (which easily may have been created by cabal-install).
Also,
When we'll move to GHC 8.6 ..
From the linked ticket:
milestone: 8.8.1
[bgamari:] These will not be addressed in GHC 8.6.
It refers to the second part of the ticket. That patch is in 8.6, as you can see here: ghc/ghc@8f3c149.
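To illustrate the earlier point that rm -f .ghc.environment.* in the current directory is not enough, here is a minimal sketch of a helper that lists environment files in a directory and in all of its ancestors. The function name find_ghc_env_files is made up for this example, and the sketch does not try to replicate GHC's exact lookup rules (architecture/version matching, any special-casing of particular directories); it only shows where candidate files live.

```shell
# Hypothetical helper: print every .ghc.environment.* file found in the
# given directory (default: current directory) and each parent, nearest first.
find_ghc_env_files() {
  dir=$(cd "${1:-.}" && pwd) || return 1
  while :; do
    for f in "$dir"/.ghc.environment.*; do
      # An unmatched glob stays literal, so check that the file really exists.
      if [ -f "$f" ]; then
        printf '%s\n' "$f"
      fi
    done
    if [ "$dir" = "/" ]; then
      break
    fi
    dir=$(dirname "$dir")
  done
  return 0
}
```

A script that wants a clean slate could inspect this output before deciding what to delete, instead of blindly removing files outside its own project directory.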
Another data point:
I use env files a lot and I think they are really useful in a number of situations (I'll probably expand this in a blog post or two), and should definitely get publicized more, since few users know about them.
But, this is not enough to enable them by default IMO. They do cause problems sometimes, and the frustration when discovering the cause slightly outweighs the advantage of knowing they exist.
I think we all agree there should be a flag to enable/disable their generation though.
@23Skidoo I see. Am I correct in that the "master" version of the ghc users-guide is the most up-to-date one? Because it does not appear to reflect this addition. If so, might be worth to ping the relevant maintainer.
Yep, looks like that patch didn't update the manual.
Global variables, side effects and dynamic typing are all really useful in a number of situations.
I kinda like @alanz's idea too. Easy to implement and it'd add another colorful section to my bash prompt \o/
rm -f .ghc.environment.* is unfortunately insufficient in general, as there may be a package environment file in a parent directory (which easily may have been created by cabal-install).
Another thing you can do to ignore the implicit .ghc.environment. file is to use -hide-all-packages -package foo -package bar -package .... This is documented here. In fact, I've just converted my script to do this.
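A small sketch of how a script might assemble that flag list from its dependencies. The helper name explicit_package_flags is hypothetical; -hide-all-packages and -package are the GHC flags named in the comment above. The sketch only builds the flags and does not itself verify that a given GHC version ignores the environment file when they are passed.

```shell
# Hypothetical helper: print "-hide-all-packages" followed by one
# "-package NAME" pair for each package name given as an argument.
explicit_package_flags() {
  printf '%s' '-hide-all-packages'
  for pkg in "$@"; do
    printf ' -package %s' "$pkg"
  done
  printf '\n'
}
```

In a script this could be used as ghc $(explicit_package_flags base containers) Main.hs, leaving the substitution unquoted so that each flag becomes its own argument.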
@23Skidoo I see you found the documentation which I already had linked above. It is sad that I need to point out how to work around bugs in cabal to the very maintainer of cabal.
I have opened https://ghc.haskell.org/trac/ghc/ticket/15513 and https://ghc.haskell.org/trac/ghc/ticket/15541 to track some of the documentation deficiencies.
If anyone wants to open tickets about corresponding documentation of cabal-install, feel free.
Unfortunately, the automatic generation of the .ghc.environment file is the cause of problems for me, as well.
It is really unfortunate that they are being created automatically.
I switch, with nix, between different versions of ghc (and everything below) and this not only breaks the repl, but full builds.
For me, it would be much better, if they are only created if asked explicitly, the current behaviour is way too "surprising" for me.
I just wanted to say that I find the new implicit magic use of environment files by GHC a very bad idea, too. A few people already mentioned Python, so let's take a look how things work there:
The new officially sanctioned way to use Python + virtual environments is pipenv. It generates the actual, transitive list of packages to use (Pipfile.lock) from the description of packages your project needs (Pipfile). If you substitute "environment file" for "Pipfile.lock" and ".cabal file" for "Pipfile", you see the similarities with our problem at hand.
The crucial thing is: Python is blissfully unaware of any virtual environment unless you tell it explicitly that you want to use it. This is IMHO the only sensible way to do this, otherwise hell would break loose, given e.g. the many tools on a normal Ubuntu installation written in Python which would be broken in basically any virtual environment (unless you are extremely lucky). The same breakage happens already when you use runghc / ghci / tools using the GHC API.
If you want to use your Python virtual environment, you can do:
- Run pipenv shell, which will search upwards in the directory hierarchy for a virtual environment and will spawn a new shell in which Python will use that virtual env.
- Run pipenv run foobar, which is basically the same, but just for a single command foobar.
- Use direnv plus a suitable simple .envrc to switch into/out of the virtual env when you enter/leave the corresponding directory.
- etc. etc.
The main point is: If you want to pick up a virtual environment, you have to be explicit, and you have a choice of dead easy, well-documented, and widely known alternatives to do so, depending on your workflow. This works extremely well in practice, including tooling in editors etc. I totally fail to see why GHC/cabal stubbornly ignore the painful lessons learned in the Python world and repeat the same mistakes...
@choener It isn't clear to me how switching between ghc versions with nix could interact poorly with env files? In particular, as per https://downloads.haskell.org/~ghc/master/users-guide/packages.html#package-environments the searchpath for env files is indexed by ghc version, so different versions of ghc shouldn't end up sharing the same env files?
@gbaz , sorry for the confusion; that is cabal not ghc. I've been playing with different versions of things lately.
Unfortunately for me that increases the likelihood that I'm hit by this, since I have many local derivations using nixos packageSourceOverrides.
Ok, so how does switching between different cabal versions cause a problem with this then? :-)
The problem is that once the environment file has been created by a sufficiently new cabal, it interferes with building as set up by packageSourceOverrides.
@svenpanne has given a beautiful explanation, why having these files is not good, and I see the fallout from their creation in local builds with local dependencies.
One other thought, occasioned by this blog post's (https://hexagoxel.de/postsforpublish/posts/2018-08-25-ghc-pkg-env-misfeature.html) point that:
"Invalid and even just outdated package environment files make ghc abort."
Perhaps we could change ghc such that if there is an error occasioned by an invalid command in the course of processing an env file then ghc could specifically point to the env file as inducing the problem? (Or did we patch ghc to do this already and I forgot?).
A big problem with the behavior seems to be just that when it goes wrong people get confused, so I suspect that better error messages will go a long way.
@choener ok, so there isn't a distinct problem switching between ghcs or cabals at all, just the known complaint about interactions when using a sufficiently new cabal such that env files are generated at all. I'm not trying to be dismissive, I just want to be sure that people understand what the issues are or are not.
I really do think that a combination of a flag which nix haskell integration can set to turn off env generation combined with better error reporting as I suggested above should make this workflow much improved.
@gbaz Yes!
I wanted to give another voice towards making this opt-in.
In particular since it seems, that currently you can not even opt-out (as far as I understand the ProjectOrchestration.hs).
This in turn makes running cabal-install 2.2 rather frustrating, since I regularly break large builds just by doing cabal new-repl in a dependent package.
I assume that this is partly due to the way how packageSourceOverrides works in nixos, and partially due to the way how I develop -- since I "never" use ghci, but only cabal new-repl and friends.
I agree exactly with @svenpanne's clear comment #4542 (comment)
Especially:
I totally fail to see why GHC/cabal stubbornly ignore the painful lessons learned in the Python world and repeat the same mistakes...
Any discussion of how nix works is not really relevant, it is just an example of the files breaking things. Of course it can be patched to ignore these files, and of course the GHC API can be changed to fix @lspitzner's problems but they do not fix the root cause of the issue.
@gbaz There is https://phabricator.haskell.org/D4689, included in 8.6.
From my perspective, this is no solution, on the contrary: It is another effect added to an operation that should be pure. Might be acceptable for UI, but for API it is really sad that you call some function and it spams to the host process's stderr.
@gbaz There is https://phabricator.haskell.org/D4689, included in 8.6.
Does that solution make the breakage obvious (with default options) when using plain ghc? That was the situation in which I was bitten by this issue some days ago.
I think there are three factors that contribute to making this a misfeature: some tools create the .ghc.environment file automatically and (I think) silently, other tools automatically pick it up, and the file is hidden(!). Removing any of these factors would help the situation, but removing all of them would be best, I think. cabal ghc or cabal ghci is not so much more typing, and is much clearer to me.
Just chiming in as yet another person who wasted about half a day on mysterious breakage until I found this. I understand why it's convenient, but I agree with the others in this thread that subtly breaking users using "normal" compiler workflows is pretty bad.
(I tried to write this comment politely, but I feel like I should mention that this experience has actually made me very cross)
@michaelpj your comment makes me think... Is it worth noting that, without trying to measure the ratio of positive / negative reactions to this feature, the magnitude of any negative reaction has consistently seemed much larger than the magnitude of any positive reaction? i.e. this feature seems either mildly convenient or massively frustrating, depending on who you ask. Regardless of whether you like the feature, are we willing to risk such frustration for the sake of minor convenience, just from a user-friendliness standpoint?
The problem is that the comments here are from people that have been bitten by the rough edges. Discussion on an issue tracker is always dominated by people who have encountered issues prompting action -- there's no way to make any sort of judgment on that. We just need to fix the known and real outstanding issues, sooner rather than later, and then we can assess what problems (if any) remain.
We just need to fix the known and real outstanding issues, sooner rather than later, and then we can assess what problems (if any) remain.
Where I come from, when a feature is rolled out and causes breakage to a large number of users the response is a quick revert or gating it behind a flag until you can make a safe rollout plan. Somehow "just fixing the issues" always happens later rather than sooner.
(I also agree with the arguments in this thread that the feature seems strange to me. However, I don't mind that so long as I am not broken.)
I guess I'll come back to new-build in another six months.
Indeed new-build is still experimental and only with the recent completion of the last gsoc is it nearing feature-complete. As with any open source project, patches and contributions are very welcome, but we're getting there with all this stuff, slowly but surely.
@gbaz There's always going to be unknown information; we can't let that just completely stall the decision making process. This issue has been in a limbo state (read: de facto siding with current behavior) for quite some time now. We've seen several people come directly to this thread with the intention to display their support (edit: support of the feature, that is). I think that's at least something to go on.
@ElvishJerricco I don't understand why you think this is in a limbo state. There is no endorsement of the current behavior. There is a change to two elements in GHC already -- improving warning messages to indicate use of env files and adding an opt-out flag to GHC directly (https://ghc.haskell.org/trac/ghc/ticket/13753). I think that ghc messages could be beefed up further in the case of errors -- but that really should be tracked in the GHC tracker.
Meanwhile, this issue tracks the ability to configure generation of the files or not -- that feature has not been implemented, and needs to be implemented. That's not limbo. That's "awaiting implementation."
The only dispute/question is if the better default behavior is to generate these files or not. Which is necessarily a question of choosing between use-cases. Since I think that the main issue with these files is in the presence of nix, and since nix already overrides many cabal settings, I think it is fine to have nix override the setting here as well.
(Yes there is also an issue with other tooling and the GHC api -- but there is a GHC tracker ticket for that too, as far as I know)
In any case, I think we've spent way more time discussing this improvement than making it, and I guess I'll try to cook up a patch in the next week or so, given that nobody who has chimed in on this thread with strong opinions has offered to make any patch themselves.
Since I think that the main issue with these files is in the presence of nix, and since nix already overrides many cabal settings, I think it is fine to have nix override the setting here as well.
As I've stated before, Nix does not require any cabal settings to use cabal with a nix-shell-provided GHC-with-packages. But now, because of this feature, it does, and that means using CABAL_CONFIG to override stuff. But that's not a general solution, as it prevents users from getting settings that they actually do want from ~/.cabal/config. So they actually have to have a project-local setting, which means Nix can't make this the default behavior.
Furthermore, I don't think Nix is the main reason whatsoever. Sure, it's the main reason I discovered the issue, but the principle of it alone bothers me, and there have been reports from several other tools in several other use cases that this breaks. Sure, maybe there's an argument that the GHC API just needs a flag for it or something, but as long as this is the default behavior, people will frequently find themselves experiencing this kind of breakage. It is a perplexing situation when you encounter these problems for the first time, and it requires every tool author to have experienced that perplexity and anticipate it, or else their tool is likely to be broken.
That said, you're right about the limbo state. I apologize for disregarding the work that was done on GHC; that was not my intention. I was attempting to address specifically the decision about what the default behavior should be (edit: and whether that default would be changed in cabal or in GHC; about which I'm indifferent)
I think the main dispute is about whether or not GHC should pick up environment files by default. If they are generated by cabal or not is much less important. And the GHC ticket mentioned above is not the right way to fix the current state of affairs.
I'm very disappointed how this misfeature crept into GHC without any real discussion. The strange thing is: Tiny details are usually discussed on an epic scale, but something which heavily affects tons of people just went in. And yes, 99% of this discussion should be on the GHC issue tracker or on some mailing list...
@svenpanne A discussion on glasgow-haskell-users sounds like a good idea. These files exist to be generated, and if cabal has this or that default setting, it hardly matters. If any tooling takes advantage of this ghc feature, similar issues will arise, and I think the main improvements to be made to make them work well will all tend to be on the GHC side.
@ElvishJerricco I'll try to sort out some of the questions I have about how the nix stuff works with configs in chat so as not to further clutter the thread.
Hmm... actually, now that ghc's use of env files can be turned off by an env variable (by the ticket I linked) then nix can just set that and ignore messing with the config for this issue (though it still needs to mess with it in the new-build case for some other stuff, I think, at least for the time being).
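A minimal sketch of that environment-variable approach, under two assumptions that the comment above does not spell out: that the variable in question is GHC_ENVIRONMENT, and that setting it to - disables the lookup in a GHC that includes the opt-out patch. The wrapper name ghc_noenv and the GHC override variable are hypothetical and only there for illustration.

```shell
# Hypothetical wrapper: run GHC with package-environment lookup disabled
# for this one invocation only. Assumes GHC_ENVIRONMENT=- is honoured by
# the GHC in use. GHC is an illustrative override; it defaults to `ghc`.
ghc_noenv() {
  GHC_ENVIRONMENT=- "${GHC:-ghc}" "$@"
}
```

A Nix shell hook or a tool that shells out to GHC could export the variable once instead of wrapping each call; the per-invocation form above keeps the setting from leaking into the rest of the session.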
What if this config setting was made ternary instead of binary, enabling the feature by default only for versions of GHC that have a warning about env files and the -package-env - patch? I don't like the inconsistency, but on the other hand env files are not supported by all GHC versions anyway.
@23Skidoo As maintainer, would you be fine with merging a PR that reverts this feature (i.e. that removes the code that creates package environment files?)
This is in spirit of what @gbaz mentioned:
We just need to fix the known and real outstanding issues, sooner rather than later, and then we can assess what problems (if any) remain.
And indeed, if any people are still in favour of this feature, then by no means, they should be free to:
- implement it in a way that does not break users, or
- convince, in a public discussion, a majority of the respected members of this community that there is a sufficient advantage to breaking users, and that there is no non-breaking alternative to reaching this advantage.
But this can happen after fixing the real, outstanding issue, correct? Any discussion of how the config setting might look like, similarly, could be done afterwards.
I would like to point out that cabal officially supports ghc-8.0 through 8.4 (and a bit more, probably). This means that any "fixes" to this issue that might make it into ghc-8.6 are irrelevant.
No, we shouldn't remove it, but we should definitely make it configurable.
Also, to make this clear: I specifically did not say "remove the feature". I said "revert it until the ones in favour implement it in an acceptable manner". Glossing over the difference is an unfair move.
