LLNL/uberenv

Support Spack's "--config-scope"

estebanpauli opened this issue · 19 comments

Currently, uberenv copies the spack configuration files for the appropriate platform from the user's source repo. This works well, so long as the needed tools are in standard locations. When this is not the case, users can use the --upstream option to chain spack installations (https://spack.readthedocs.io/en/latest/chain.html). This allows the user to specify extra packages that have been built by an external version of spack. However, there is no way to add extra packages that are in custom locations on a user's machine. For example, on my mac, the pre-installed cmake and doxygen are older than what I need. Rather than having spack build these, I downloaded them and put them in /Users//Applications/. It would be nice if I could tell spack about these while still using a project-specific scripts/spack/configs/darwin/.

Proposal: add a --config-scope option to leverage spack's equivalent feature (https://spack.readthedocs.io/en/latest/configuration.html#custom-scopes) just like --upstream is currently supported.

With the above, a user could easily have a separate configuration with tools that were installed manually rather than through spack.

As an additional option, uberenv could also support --use-user-config to use the configuration found at ~/.spack. This would be equivalent to --config-scope ~/.spack/<platform>. This would be convenient for users who use their user-specific configuration files just to keep track of custom tools that are not affected by the compiler (e.g. perl, python, cmake, doxygen, flex, bison, etc).
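To make the proposal concrete, here is a minimal sketch of how the two options could translate into a spack command line. The helper name `spack_base_command` and its parameters are hypothetical and not part of uberenv; the only real piece is Spack's global `--config-scope` flag. Mapping `<platform>` to the lowercased result of `platform.system()` is also an assumption that happens to fit darwin and linux.

```python
import os
import platform


def spack_base_command(spack_exe, config_scopes=(), use_user_config=False,
                       platform_name=None):
    """Return the argv prefix for a spack call with extra config scopes.

    Hypothetical helper: uberenv does not provide this function.
    """
    scopes = list(config_scopes)
    if use_user_config:
        # Proposed --use-user-config: shorthand for ~/.spack/<platform>.
        # Assumption: the platform directory name is e.g. "darwin" or "linux".
        name = platform_name or platform.system().lower()
        scopes.append(os.path.join(os.path.expanduser("~"), ".spack", name))
    cmd = [spack_exe]
    for scope in scopes:
        # --config-scope is a global spack option, so it goes before the subcommand.
        cmd += ["--config-scope", scope]
    return cmd


# Example: every spack call made by the script would start from this prefix.
install_cmd = spack_base_command("spack", ["/path/to/my/scope"]) + ["install", "cmake"]
```

Every spack invocation in the script would need to start from the same prefix for the extra scope to apply consistently.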

relevant: I am pretty sure user scopes are going away in spack

If you have a packages.yaml config (even in ~/.spack config) why can't you hand that dir to uberenv using the existing logic?

@cyrush because in his case, the packages.yaml is not portable but specific to his machine.

I'm hoping to have a sharable scripts/spack/configs/darwin/packages.yaml in my source repo that I use for most machines. However, I want to be able to supplement that with the paths to cmake, doxygen, and other tools that weren't built by spack and are not in standard locations. It's essentially the same use case as for supporting --upstream, but for manually-installed software instead.

Can environments help with this? Or could this be handled on the uberenv side by combining two packages.yaml files? We run into a similar thing with devtools on axom/serac, where it would be helpful to have a base file that you append to in some manner.

as @white238 mentioned --- I think exploring composing packages.yaml might provide the convenience you are looking for.

Using a combo of an upstream and user config seems like a pretty fragile road to travel.

"This works well, so long as the needed tools are in standard locations."

There is a solution now: check in per-host spack configs -- like (esteban's machine) and revision control those.
Maybe you want to avoid duplication? If so, composing packages.yaml seems like a good thing to invest in.

My initial thought was to use --upstream for this. Indeed, the goal is to share already installed configurations. So even if it is externally installed I thought that if it is installed in an upstream spack, then we could use it.

So I created a Spack instance and registered my external CMake install in it.
Then I declared CMake as a buildable: false package in the Serac darwin config.
Using --upstream to point to the first Spack instance, Uberenv creates a local instance, and if I run spack find cmake in it, it detects the external installation of CMake!

However, when I try to actually install something with it, Spack fails to find it... This inconsistency looks like a bug.

The more duct tape involved, the less chance there is for a robust success :-(

Well, not easy to design something both robust and flexible when the context is "a developer machine".

Anyways, the --upstream behavior is not a bug. The local instance expects an external install, so it does not look for the one defined in the upstream...

Like @cyrush said, being able to compose packages.yaml files seems like a good option. That's what I was hoping to get via --config-scope. If there's a better way, that would be great. Looking at what goes in a configuration (https://spack.readthedocs.io/en/latest/configuration.html), compilers.yaml also looks like something I might want to stack. Really, I can see a case for all of this. It kind of goes against what uberenv is trying to accomplish, but at the same time, it helps accomplish it by letting you mostly share the configuration in a single place and just put a few machine-specific things in a different place when needed.

@estebanpauli, config-scope is a command line argument to a spack call. As a consequence, uberenv would have to add it everywhere relevant in the script. More problematic, once outside uberenv, using the local instance of spack without this argument would result in different behavior, if not failure.

That's why merging the packages.yaml files is a more "uberenv-ready" solution. However, I think it would not be easy to make it robust, and it would basically amount to re-implementing the config scopes.

At first glance, merging packages.yaml files shouldn't be that difficult -- read them in the order specified, adding the contents of each one to a dictionary, while replacing existing keys. If you limit the merging to the top level, it should be easy. If you want to merge at lower levels, then I agree that it becomes more complicated. However, I would think that as a user, you would just want the merge to happen at the top level. That should handle the most common use cases of overriding the contents of the default packages.yaml. Anything beyond that would be harder to reason about and remember the rules for as a user.
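The top-level merge described above can be sketched as follows. This is only an illustration under assumptions: it works on dictionaries that a YAML parser has already produced, it treats each package name under the `packages:` key as the "top level", and the function name `merge_packages` and the sample entries are made up for the example.

```python
def merge_packages(configs):
    """Merge already-parsed packages.yaml contents in the given order.

    Later files replace whole package entries from earlier files;
    nothing below the package name is merged.
    """
    merged = {}
    for config in configs:
        # Tolerate empty files and files without a packages section.
        packages = (config or {}).get("packages") or {}
        for name, entry in packages.items():
            merged[name] = entry  # replace, do not deep-merge
    return {"packages": merged}


# Illustrative inputs: a shared project config and a machine-specific one.
shared = {"packages": {"cmake": {"buildable": True},
                       "doxygen": {"buildable": True}}}
local = {"packages": {"cmake": {"buildable": False}}}

result = merge_packages([shared, local])
```

Here the local cmake entry replaces the shared one entirely, while the doxygen entry from the shared file is kept unchanged.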

I hadn't thought about the use case of making later calls to spack. That would present a serious complication.

Maybe we could simply add an option to not remove Spack's user scope, e.g. an --unsafe option. I know it does not solve @white238's use case.

Since we are patching Spack, we could as well patch it to point the user config to user-defined-scope instead of ~/.spack.

OK, maybe add this to Uberenv:
spack --config-scope <my-config-scope> config get <section>
That would dump the merged configuration, which we could then use as the only configuration in the local spack. @white238 what do you think?
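A rough sketch of that idea, assuming uberenv would call spack through `subprocess`: the function names `run_spack` and `flatten_scope`, the default section list, and the destination layout are all hypothetical. The only real pieces are the `--config-scope` flag and the `spack config get <section>` command.

```python
import os
import subprocess


def run_spack(args):
    """Run a spack command and return its standard output as text."""
    return subprocess.run(args, check=True, capture_output=True, text=True).stdout


def flatten_scope(spack_exe, scope_dir, dest_dir,
                  sections=("packages", "compilers"), runner=run_spack):
    """Write the merged config of each section to dest_dir/<section>.yaml.

    The runner parameter exists so the spack call can be replaced in tests.
    """
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    for section in sections:
        # Ask spack for the merged view, including the extra scope.
        merged = runner([spack_exe, "--config-scope", scope_dir,
                         "config", "get", section])
        path = os.path.join(dest_dir, section + ".yaml")
        with open(path, "w") as handle:
            handle.write(merged)
        written.append(path)
    return written
```

Because the merged files would become the local instance's own configuration, later spack calls made outside uberenv would not need the extra argument.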

I was wondering if we could leverage Spack environment files' ability to include files, variables, and the "when" logic to solve this.

For a completely untested and contrived example:

definitions:
- cmake_path: /path/to/specific/cmake
  when: env.get("MACHINE", "") == "estebansMachine"
- cmake_path: cmake
packages:
  cmake:
    buildable: false
    path: $cmake_path

@white238 -- if folks can pave the way, that sounds awesome