commercialhaskell/stack

Is it expected that data files do not get installed to where bin is?


For example, the happy package on Hackage contains resource files such as HappyTemplate-arrays-coerce in its share folder.

Stack installs the happy executable in ~/.local/bin but leaves the resources in ~/.stack/snapshots. If I remove the snapshots to free disk space, happy breaks.

Wondering if there can be a solution.

Yes, this is a known issue — the same problem affects data files (e.g. with pandoc). I intend to add this to the wishlist — a brief summary description would be helpful.

I suspect that the solution to this is related to #4243, which should also be added to the wishlist. Do let me know what you think. And thanks for raising the issue!

(Edit: I suspect there may be a workaround — if you're interested I could ask around.)

#848 is very much related to this.

I've added that to the wishlist.

It's by far our oldest "P1: Must" issue. I wonder if the new configure-options could be used for a workaround, since it might, in theory, pass --prefix through to Cabal. No idea what the effect would be though, it might also break everything :)

I couldn't help but try this. --prefix doesn't work, but --datadir does. Here's what I did (using the latest stack-2.1 RC):

  • Unpacked the happy source code to /Users/manny/tmp/happy-1.19.11
  • Ran stack init to generate a default stack.yaml
  • Appended to stack.yaml:
configure-options:
  happy:
  - --datadir=/Users/manny/tmp/happy-data
  • Ran stack install
  • rm -rf /Users/manny/tmp/happy-1.19.11 (removed whole directory where I unpacked and built happy)
  • Confirmed that ~/.local/bin/happy (installed by stack install) still worked

Following the above, happy still works even though I removed the build directory (as long as I don't remove /Users/manny/tmp/happy-data).
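The stack.yaml step above can be scripted. This is only a sketch: the helper name datadir_snippet is made up for illustration, and the paths in the usage comment are the ones from this comment.

```shell
#!/bin/sh
# Hypothetical helper (not part of stack): print the stack.yaml fragment
# that passes --datadir to Cabal's configure step for one package.
#   $1 = package name, $2 = directory that should hold the data files
datadir_snippet() {
  pkg=$1
  dir=$2
  printf 'configure-options:\n  %s:\n  - --datadir=%s\n' "$pkg" "$dir"
}

# Usage, run inside the unpacked happy-1.19.11 directory after `stack init`:
#   datadir_snippet happy /Users/manny/tmp/happy-data >> stack.yaml
#   stack install
```

The function only prints text, so the output can be inspected before it is appended to stack.yaml.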

When I did the same except skipped the configure-options, then happy failed with "happy: /Users/manny/tmp/happy-1.19.11/HappyTemplate: openFile: does not exist (No such file or directory)".
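A related option that may be worth trying: as far as I know, the Paths_ module that Cabal generates for a package first looks at an environment variable named after the package, with a _datadir suffix, before falling back to the directory baked in at configure time. Treat that as an assumption to verify against your Cabal version. The helper name datadir_env_var and the Grammar.y input below are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: derive the environment variable name that a
# Cabal-generated Paths_<pkg> module is assumed to consult for the data dir.
# Dashes in the package name become underscores.
datadir_env_var() {
  printf '%s_datadir\n' "$(printf '%s' "$1" | tr '-' '_')"
}

# Usage (assumes the template files were copied to this directory beforehand):
#   happy_datadir=/Users/manny/tmp/happy-data happy Grammar.y
```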

> I suspect that the solution to this is related to #4243

I don't think so. All this would do is move the snapshot cache to a different location, but removing the snapshot cache would have the same effect as it does now.

Part of the reason --prefix has taken forever is that every potential solution we come up with seems to end up with some weird edge case that makes it extra difficult. But I wonder if we can do something simpler to handle this common case: an executable we want to install has some data files.

So here's my thought: what if stack install (actually just a synonym for stack build --copy-bins) checks whether the target package has any data files, and only in that case rebuilds it with a --datadir somewhere under ~/.local/share (or wherever XDG/macOS/Windows says)? Put another way, --copy-bins would imply something like configure-options: {package: '--datadir=~/.local/share/…/package-1.2.3/'} if the package has data files. I believe this would handle the most common use cases.
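To make the proposal concrete, here is a rough sketch of how the implied option could be computed. This is not existing stack behaviour. The function name is hypothetical, it only handles the XDG case (not macOS or Windows), and the subdirectory under the data root is passed in by the caller because the proposal leaves it unspecified.

```shell
#!/bin/sh
# Sketch of the proposed rule; stack does not do this today.
#   $1 = package name
#   $2 = package version
#   $3 = subdirectory under the data root (unspecified in the proposal,
#        so the caller has to choose one)
proposed_datadir_option() {
  # XDG default: fall back to ~/.local/share when XDG_DATA_HOME is unset.
  data_root=${XDG_DATA_HOME:-$HOME/.local/share}
  printf '%s\n' "--datadir=$data_root/$3/$1-$2/"
}

# Usage:
#   proposed_datadir_option package 1.2.3 some-subdir
```

Keeping the version in the path means two versions of the same tool would not overwrite each other's data files.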

One case this wouldn't cover is if one of the dependencies of the executable has data files that the executable needs. Handling that is more complicated and time-consuming because it could mean having to rebuild a lot of dependencies as well. However, I can't think of any programs off the top of my head where this would actually be required.

@snoyberg what do you think of my simplified approach above? I think it may resolve the most common pitfall people run into.

> rebuilds

What would it take to avoid that (potentially expensive) rebuild? Could we at least avoid the first build if you call stack install directly?

  • Can one use the correct prefix in the first place? If not, what about allowing a separate configure step to pick which options to use?

Just a thought: if we cannot figure out a perfect solution soon, why not use a bin folder that is managed by stack, instead of installing to an unmanaged bin folder?

The mechanism can be really simple: install binaries into, for example, ~/.stack/snapshots/happy/bin, then link them into ~/.local/bin. That way stack can check whether a command was installed by stack. And if I removed the snapshot, the link would be dead, with a clear message instead of something unpredictable (tools may or may not report missing resource files, or may just crash).
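A sketch of what the check could look like, to show how a dead link can be told apart from a working one. The function name link_status and its three result strings are made up for illustration; nothing like this exists in stack.

```shell
#!/bin/sh
# Hypothetical check: classify an entry in the bin folder.
#   $1 = path to the command, e.g. ~/.local/bin/happy
link_status() {
  link=$1
  if [ ! -L "$link" ]; then
    echo "not-managed"   # not a symlink (regular file or missing)
  elif [ -e "$link" ]; then
    echo "ok"            # symlink whose target still exists
  else
    echo "dangling"      # symlink whose target is gone, e.g. snapshot removed
  fi
}

# Usage:
#   ln -s ~/.stack/snapshots/happy/bin/happy ~/.local/bin/happy
#   link_status ~/.local/bin/happy
```

The test with -L looks at the link itself, while -e follows it, which is what makes a dangling link detectable.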

@borsboom I think an approach like that could work.

> What would it take to avoid that (potentially expensive) rebuild? Could we at least avoid the first build if you call stack install directly?

I believe this approach would, indeed, avoid the first build if you initially called stack install (or stack build --copy-bins). If you later ran stack build (without --copy-bins) it would rebuild, though.

@Magicloud, your notes here are similar to what I've posted about in https://discourse.haskell.org/t/how-to-fix-package-breakage-due-to-file-corruption-stack/2366, so maybe I've stepped on the same issue you have with happy?

I read through the comments here, and maybe I missed it, but I'm not really following: is there a short-term workaround for this while the more ideal --prefix/--datadir params are worked out? Or what is a user supposed to do if they run into this?

cc @borsboom

@ketzacoatl I think it is.