sphuber/aiida-shell

Allow for relative paths in the code (or adapt prepare_code)

giovannipizzi opened this issue · 4 comments

Currently, prepare_code will reuse the code if a code with the same label and computer is found.
However, I have the following usecase.

  1. I run prepare_code("python") as I want to use it, e.g. create a virtual env. It will create a code and put the absolute path (I think this is the current behaviour?)
  2. I create a virtualenv
  3. I want now to run from inside the virtualenv. I'd like to just submit a ShellJob specifying
     metadata = {'options': 
         {
             'prepend_text': f'source {VENV_NAME}/bin/activate',
         }
    
    and python as an executable, so it finds the one in the path; but since there is already a code with that name, it will reuse it, but it will then put the "wrong" absolute path, obtained in step 1.

Of course I could create a new python code with the correct abs path, but this can become quickly annoying if I'm creating many venvs on the fly. Is there a way to specify that we want a relative path? (maybe it's currently a limitation in AiiDA?) Or to configure a new code with the same name?

Thanks for the report @giovannipizzi . The use-case of the relative path I just added last week in #73 , I just haven't released it yet since I merged it yesterday and am waiting for #70 as well. Should be releasing that soon.

That leaves the question how we can easily have multiple codes that reference the python executable but load a different virtual env. Currently, uniqueness is determined by the full label executable@computer-label. The prepare_code is just a utility function used by the launch_shell_job wrapper to automate creating a Code node. But this is all just a wrapper around a the ShellJob. You can still manually create a Code node and pass that to launch_shell_job directly.

I think for this use case, the user should simply create the desired code manually. We could add an option to prepare_code to customize the prepend_text, but then why just that and not any of the other computer/code properties? And how would the label then be determined? Just the command name as label will no longer suffice to make it unique. A random suffix? But that will make it difficult to recognize. If you have clever ideas for the UX I'd gladly consider them.

Thanks! I think that #73 solves my use case! I'll continue testing and let you know if it indeed works.

I'd like to reopen this. Still in the current code I realize that even if you pass resolve_command, if a Code with the same name already was created, that will just be used.

I tried to adapt the code as follows:

def prepare_code(command: str, computer: Computer | None = None, resolve_command: bool = True) -> AbstractCode:
    """Prepare a code for the given command and computer.

    This will automatically prepare the computer.

    :param command: The command that the code should represent. Can be the relative executable name or absolute path.
    :param computer: The computer on which the command should be run. If not defined the localhost will be used.
    :param resolve_command: Whether to resolve the command to the absolute path of the executable. If set to ``True``,
        the ``which`` command is executed on the target computer to attempt and determine the absolute path. Otherwise,
        the command is set as the ``filepath_executable`` attribute of the created ``AbstractCode`` instance.
    :return: A :class:`aiida.orm.nodes.code.abstract.AbstractCode` instance.
    :raises ValueError: If ``resolve_command=True`` and the code fails to determine the absolute path of the command.
    """
    computer = prepare_computer(computer)
    resolve_string = "" if resolve_command else "-relativepath"
    code_label = f'{command}{resolve_string}'
    full_code_label = f'{code_label}@{computer.label}'

    try:
        code: AbstractCode = load_code(full_code_label)
        resolve_idx = 0
        while code.get_executable().is_absolute() != resolve_command:
            resolve_idx += 1
            code_label = f'{command}{resolve_string}-{resolve_idx}'
            old_full_code_label = full_code_label
            full_code_label = f'{code_label}@{computer.label}'
            LOGGER.info(
                'Code `%s` has %s path instead of %s, trying a new name %s.', 
                full_code_label,
                'absolute' if code.get_executable().is_absolute() else 'relative',
                'absolute' if resolve_command else 'relative',
                old_full_code_label
            )
            code: AbstractCode = load_code(full_code_label)

    except exceptions.NotExistent as exception:
        LOGGER.info('No code exists yet for `%s`, creating it now.', code_label)

        if resolve_command:
            with computer.get_transport() as transport:
                status, stdout, stderr = transport.exec_command_wait(f'which {command}')
                executable = stdout.strip()

                if status != 0:
                    raise ValueError(
                        f'failed to determine the absolute path of the command on the computer: {stderr}'
                    ) from exception
        else:
            executable = command

        code = ShellCode(  # type: ignore[assignment]
            label=code_label, computer=computer, filepath_executable=executable, default_calc_job_plugin='core.shell'
        ).store()

    return code

I'm not making a PR as there are some outstanding questions.

  1. is it OK to have -relativepath` appended?
  2. Is it OK to append -1, -2, ... to make codes unique?
  3. now I changed the logic - the name essentially will never be the abspath if you do not specify the abspath yourself. But when doing this, I realized that with the current implementation in the main branch, you could have the following usecase:
    • you first call prepare_code('/bin/bash'); /bin/bash@localhost is created
    • you then call prepare_code('bash'). If the bash code does not exist, it will try to create it, but since by default resolve_command is True, it will eventually create another /bin/bash@localhost code, i.e. two codes with the same label
      I don't think (as I think you say) that there is a robust way to do it without opening a connection via the transport every time prepare_code is called. Therefore, I think it's best not to use /bin/bash as a path, if the user just wrote bash; better to call it bash-1. And we call it /bin/bash only if the user passed that as the parameter. This is essentially how the code above works.

If you think this is OK, feel free to adapt and reuse the code above.
Also, in this case, I would add one more test if the code is found, to ensure that the code.get_executable() is indeed what you would have set. If resolve_command is False, this is easy.

If it's False, I guess this is complex, if the user passed just the binary name and you want really to check if it's the same executable. Maybe this is something to tell users - if you care about having one executable, pass its full path. If you pass just the binary name, it will either try to get the full path, or reuse a code that "makes sense" and is already available on your AiiDA installation. One could still add an optional flag "force_check" but maybe it starts becoming too complex?

One could still add an optional flag "force_check" but maybe it starts becoming too complex?

This was exactly my point in my previous comment. Here you are just trying to deal with the different scenarios of absolute vs relative paths and already it is getting (too) complex. And this doesn't even start to take care of codes that require specific prepend texts or append texts, such as loading virtual environments.

The design of aiida-shell intentionally reuses the CalcJob interface and provides the ShellJob implementation to give full access to the normal interface. The launch_shell_job is just a simple wrapper that makes getting started even easier by not having to create the code. The main focus here is for new users. As soon as there are more experienced users with more specific requirements, they can change to directly use the ShellJob interface. Or even still use launch_shell_job but create the code themselves, because launch_shell_job accepts a Code instance for the command.

I am currently not convinced that adding all this complexity is the right way to go. Instead, maybe we can just improve the documentation a bit on using custom codes, e.g. here: https://aiida-shell.readthedocs.io/en/latest/howto.html#defining-a-pre-configured-code I could add a snippet of how to actually setup such as code, or point to the aiida-core docs.