Fallback to previous devshell upon failure is unclear

Question

Fallback to previous devshell upon failure is unclear

amarshall opened this issue 7 months ago · 6 comments

Since #420 (I believe), if loading the Nix devshell fails, the previous cached one is used. While I do not necessarily disagree with this behavior, I think that it should be much more clear that this is occuring, as currently one might see errors but think “well it loaded anyway, so those errors may not matter”, especially coupled with the fact that Nix does have non-fatal errors. Or the user might just miss those errors in a sea of other output or because they are in the flow. Either way, the shell ends up non-obviously in the “stale” state.

It may also make sense for it to be configurable whether to do this (and knowable from the outside, e.g. to condition on fallback having occurred in the .envrc).

Answer 1 · 2024-04-17T14:06:18.000Z

I don't support configuring this behavior. We can discuss why if you want, but it boils down to "complex bash is hard to read and configuration increases code paths significantly". I don't really see the benefit of making this configurable if we make it obvious
I think that this is just adding an else block to the if surrounding the print-dev-env call here? I may get around to this soon, but PRs are definitely welcome.

Answer 2 · 2024-04-17T14:36:54.000Z

The current error message says "use_flake failed - Is your flake's devShell working?", but it doesn't say that it falls back to the previous devshell, which is what I think @amarshall is trying to say. I think the error message can be improved here.

Answer 3 · 2024-04-17T14:42:07.000Z

That error message only happens if there is no profile_rc to fall back to, which means that there's no fallback shell at all.

The problem is more nefarious than just that, since that failure mode is pretty clear: We couldn't evaluate your flake at all and so we failed with an error message. If we failed to evaluate your current flake, but there is a currently existing profile_rc, you'll never see that message. This means you'll implicitly end up in an old shell.

I agree that we can improve on this, but that isn't the source of the issue (and this points at why I consistently push back against additional configuration - this is complex enough that even the two people that should know the code the best get lost in it even now!)

Answer 4 · 2024-04-17T14:45:59.000Z

I agree that configuration should be a last resort. How do you feel about:

Adding a failure message in a new else as you mention
Setting some env var in the else as well (e.g. NIX_DIRENV_DID_FALLBACK or something?) to make the behavior extensible, rather than configurable. E.g. the user could [[ -n ${NIX_DIRENV_DID_FALLBACK+x} ]] && exit 1 or add additional messaging or consume in their prompt config

For some more use-case based on my experience since upgrading nix-direnv in all our projects:

Even if I kind of understand that fallback has occurred, I might step away from that shell for a while and not remember that it is in a “wrong” state as any error is far above in the scrollback out-of-view.
This is compounded by the fact that some scenarios result in transient errors in Nix itself (e.g. unavailable substituters or remote builders) such that the watch_files won’t actually change when the issue is resolved, requiring manual direnv reload that may be overlooked especially if there are several shells open for the same project.

The current error message says "use_flake failed - Is your flake's devShell working?"

I’ve never seen this error. Using nix-direnv 3.0.4, direnv 2.32.3. I just changed a flake.nix to have a syntax error and it did not show that error upon reload.

Answer 5 · 2024-04-17T14:49:34.000Z

I’ve never seen this error. Using nix-direnv 3.0.4, direnv 2.32.3. I just changed a flake.nix to have a syntax error and it did not show that error upon reload.

Agreed there should be definitely an error displayed every time and we shouldn't cache it. If this error message happen, why wouldn't than want to have the fallback to be disabled?

Answer 6 · 2024-04-17T14:59:48.000Z

I’ve never seen this error. Using nix-direnv 3.0.4, direnv 2.32.3. I just changed a flake.nix to have a syntax error and it did not show that error upon reload.

That's because your flake has a fallback profile already :)

This is the behavior we need to clarify. Right now, we wait to invalidate the old shell until a new one loads properly. If it doesn't load properly, we silently fall into the outdated environment. If we don't have an outdated environment to fall into, we can't fall into any new environment and that is where the "is your flake's devShell working?" message comes from.

I think this particular case is pretty clear. "We tried to do a thing and couldn't do it and we couldn't fall back to the old thing because it didn't exist". The ambiguous case you're asking for a resolution to is "We tried to do the thing and couldn't, but we did find that we had an old iteration. We loaded that iteration but didn't tell you anything about doing that other than the nix messages that got spit out above".

Setting some env var in the else as well

Sure. That's both trivial to set up and trivial to consume in a user specific manner (not sure how one might handle that in a "global" way - say, in ~/.config/direnv/direnvrc or something, though? But that doesn't mean we shouldn't do it)