aiidateam/aiida-core

verdi code setup does not check the remote absolute path

Closed this issue · 10 comments

When you setup a code through verdi code setup and (accidentally) provide an incorrect path to the remote binary, the command will not complain and successfully create the code. If you then try to use the code for a calculation the calculation will fail. You can now also no longer delete the code or correct it, as it now has an associated calculation and so you are forced to create a new code node, choosing a different label, because the original is now already occupied.

Since when specifying the remote absolute path, the computer is already selected, we should be able to verify whether the given path even exists and put out a warning otherwise. I don't see any impossibilities or blocking things to this approach but maybe I am missing something

I guess it should still be possible to set up codes if you don't have an internet connection, i.e. some logic is needed as well.

@sphuber I think you mentioned in #3319 that you had second thoughts about this feature request - perhaps it should be limited to setting up local codes?

Also, it would create a bit of a discrepancy in the sense that you can have local codes in your database (imported from somewhere) that don't correspond to an actual binary, while you can't set such a code up.

I guess the right way forward would be to do the check by default (even for codes on remote computers) but offer a way to disable it (e.g. via an interactive prompt that defaults to "proceed").

I definitely have second thoughts about doing this for remote codes, because adding the complexity with the transport is not worth what we will gain I think. If that transport is local though, so "remote code on localhost" it might still be useful. My original request came from the situation where I configured a code on localhost and accidentally mistyped the path. The code was then unusable and since we basically also do not provide a way to edit a code, I basically had to delete and recreate it. At the time there was not even a simple way of deleting it I think. So my thinking then was that it should be nice to prevent this annoyance by checking the path.

So maybe we should just check for local codes and remote code on localhosts?

So maybe we should just check for local codes and remote code on localhosts?

Yes, I think that is a good next step (and, as mentioned above, this check should happen during creation of the Code only - it's ok to have (imported) codes in the DB that point to non-existing paths).

P.S. Here, "on localhosts" should mean: on computers with "local" transports.
Since the computer is prompted for before the executable paths, this is possible via the click context - we just need to make sure to validate these

@options_code.REMOTE_ABS_PATH()
@options_code.FOLDER()
@options_code.REL_PATH()

if we're setting up a code on a computer with local transport.

Alternatively, this validation could be left to the Code constructor - or both!

@sphuber What would you prefer?

I'm randomly getting the chance to read this....and it's been a while that I think that would be great to have a command verdi code configure (similar to the computer one). The scope would not only be to test the code executable, but ideally could be extended by plugin developers in order to parse the version of the code for instance. To have an extra attached to the code and reporting the version would make the parser development much easier.
Just a dream though, I guess it's very difficult to implement.

@bosonie There is an old issue with a similar suggestion here #420
Note that the code version is considered "provenance-type information", i.e. it would belong into verdi code setup (verdi computer configure is for "private" connection details).

As for parsing the version number automatically: I'm not sure it's worth investing time into this, since hopefully we can soon support containers which will encapsulate not only the version but the environment as well.

I am coming back to this issue after a discussion with @louisponet who ran into the same issue of having made a typo in the code setup and having the subsequent calculation fail. Maybe instead of checking on code setup, we could check when the code is actually being used in a calcjob. As a naive implementation we could add a validator to the CalcJob instantiation that will check the inputs for a Code and perform the necessary checks. The downside is that this will add significant overhead that is essentially pointless after the code has been validated once. Since it is immutable, if it is valid once, it should be valid always (unless of course the code on the remote actually changes, such as the binary being moved or deleted). Next step would be to add some flag to the node (maybe a private extra) that marks it as valid, in which case the validation can be skipped. The question still remains what to exactly check and if it is possible to prevent most common failures.

To prevent the additional complexity of validation on the fly, maybe as a in-the-middle-of-the-road solution, we could implement verdi code validate/check that in a similar vein to verdi computer test performs some checks. Maybe this would already help catch a number of cases where people make a mistake in configuration?

+1 to have a verdi code test (I would avoid running the validation on verdi code setup as this will require opening a SSH connection, and I think it should be possible to setup a code even if I'm offline).

yes doing this in verdi code setup was already rejected for this very reason in the early parts of the discussion