Deprecating -b BIN_DIR
BoPeng opened this issue · 6 comments
Right now we have an almost hidden feature: ~/.sos/bin is placed before anything else, and -b PATH is prepended to $PATH.
This was designed to specify which command to use when there are multiple versions, but it is a lot less flexible and useful to do
sos run workflow -b /path/to/R3.3
than
sos run -r host_r3.3
when host_r3.3 is defined with
module load R/3.3
sos run ....
since module load can do a lot more than setting $PATH.
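To make the effect of prepending concrete, here is a minimal Python sketch of how a directory placed at the front of $PATH wins executable lookup. It does not reproduce SoS internals: prepend_to_path, make_fake_tool and the tool name mytool are hypothetical names used only for illustration, and a POSIX system is assumed.

```python
import os
import shutil
import stat
import tempfile


def prepend_to_path(bin_dir, path=None):
    """Return a PATH string with bin_dir placed before the existing entries."""
    if path is None:
        path = os.environ.get("PATH", "")
    return bin_dir + os.pathsep + path if path else bin_dir


def make_fake_tool(directory, name="mytool"):
    """Create a small executable file so that shutil.which can find it (POSIX)."""
    tool = os.path.join(directory, name)
    with open(tool, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(tool, os.stat(tool).st_mode | stat.S_IXUSR)
    return tool


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as system_dir, \
            tempfile.TemporaryDirectory() as bin_dir:
        make_fake_tool(system_dir)           # the "default" version
        preferred = make_fake_tool(bin_dir)  # the version we want to win
        path = prepend_to_path(bin_dir, system_dir)
        # The first matching directory in the search path is used.
        print(shutil.which("mytool", path=path) == preferred)
```

This is all that export PATH=BIN_DIR:$PATH achieves as well, which is why a mechanism that can also run module load or conda activate covers strictly more cases.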
Well ... I use -b for many of my pipelines on my desktop, where I keep the particular executables for a particular analysis separately, instead of using more formal approaches such as conda env (or your example, module load). I think it still has its appeal. However, I do also agree that users can always do export PATH=BIN_DIR:$PATH. So I don't have a very strong opinion on this matter. I don't use ~/.sos/bin though.
The extended -r host feature that is being implemented in the 1319 branch allows you to predefine a number of "running environments" for your pipeline, which include but are not limited to conda activate, export PATH, and module load. It also allows for the more ambitious plan in vatlab/sos-notebook#262, where a sandbox could be created before sos run and something could be done following the execution of sos run. It is a much more flexible option than -b, so I am inclined to deprecate -b.
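The idea of predefined running environments can be sketched as a set of named templates that wrap the actual command. The Python below is only an illustration under stated assumptions: ENV_TEMPLATES, render_env, the {command} placeholder and the environment name myenv are all hypothetical, and the real template syntax used by -r host may differ.

```python
import shlex

# Hypothetical named "running environments". Each template contains setup
# lines followed by a {command} placeholder for the wrapped command.
ENV_TEMPLATES = {
    "host_r3.3": "module load R/3.3\n{command}",
    "conda_env": "conda activate myenv\n{command}",  # myenv is a made-up name
    "bin_dir": "export PATH=BIN_DIR:$PATH\n{command}",
}


def render_env(name, command_args):
    """Render the shell script that runs command_args inside environment name."""
    # Quote each argument so the rendered script is safe to pass to a shell.
    command = " ".join(shlex.quote(arg) for arg in command_args)
    return ENV_TEMPLATES[name].format(command=command)


if __name__ == "__main__":
    print(render_env("host_r3.3", ["sos", "run", "workflow"]))
```

Because a template is free text, lines placed before or after {command} could also cover the "set up a sandbox first, clean up afterwards" case mentioned above, which a plain -b option cannot express.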
This is nice, and sure, I have no problem with getting rid of -b.
While we are at it, shall we also allow for step-specific running environments, or do we have this mechanism already? For example, using some templates we could configure a step to run something like conda activate, export PATH and module load ... This is also something I run into in benchmark applications where, for example, I have different git commits of the same software, in different conda environments, to compare against each other for performance. I would like to select them dynamically from configuration files so I don't have to touch my workflow script. Dockerizing them would not work on a cluster ...
There are several flavors of this:
- A multi-kernel online notebook (myjournal.com?) goes beyond specifying "kernels" for each cell and allows "environments", which are seemingly docker images. It is a nice extension to SoS Notebook's multi-kernel approach (which I do not know how to achieve under the current SoS Notebook framework).
- However, even if we cannot do the above under SoS Notebook, we could possibly do it under SoS, with something like
[step: env=refer_to_host_template]
- A "wrapper" script for actions, conceptually like
[step]
sh:
  module load ... R3.3
R:
  script
that lets R be executed inside the environment created by sh, is not possible, but we could allow
R: env=conf_template
where conf_template has the same syntax as job_template etc.
- For a similar issue we have options such as
task: prepend_path
which I really do not like but sometimes need. It could be unified into task: env=conf.
The bottom line is that what you are proposing is something I have had in mind for a while without figuring out the best approach, and all of these could be unified under the template approach that is being implemented for -r host.
R: env=conf_template
task: env=conf
I think this is good enough. And a different template should trigger reruns. But for a really short environment specification, can we simply use e.g. env="conda activate ..."?
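One way to picture both requests (inline environments and reruns on template change) is the Python sketch below. It is an assumption-laden illustration, not SoS's signature mechanism: TEMPLATES, resolve_env, step_signature, the {script} placeholder and the environment name bench are hypothetical.

```python
import hashlib

# Hypothetical named templates; {script} marks where the step's script goes.
TEMPLATES = {
    "conf_template": "module load R/3.3\n{script}",
}


def resolve_env(env):
    """Return template text for env.

    A known template name is looked up. Anything else is treated as a short
    inline setup command that is run right before the script.
    """
    if env in TEMPLATES:
        return TEMPLATES[env]
    return env + "\n{script}"


def step_signature(script, env):
    """Hash the script together with the resolved template text.

    Hashing the template content (not just its name) means that editing a
    template changes the signature, which would trigger a rerun.
    """
    digest = hashlib.sha256()
    digest.update(resolve_env(env).encode("utf-8"))
    digest.update(b"\0")  # separator so the two parts cannot run together
    digest.update(script.encode("utf-8"))
    return digest.hexdigest()


if __name__ == "__main__":
    script = "print('hello')"
    same = step_signature(script, "conf_template") == step_signature(script, "conf_template")
    changed = step_signature(script, "conf_template") != step_signature(script, "conda activate bench")
    print(same, changed)
```

The design choice here is that the signature depends on what the environment expands to, so swapping conda environments from a configuration file invalidates old results without touching the workflow script.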
Also I think env is already a task/action option. We need something else.
> Also I think env is already a task/action option. We need something else.
The current env will be deprecated once we have the new feature. We do not need multiple features for (almost) the same purpose, even if one is a simplified version of another. conf is possible, but we are using a template, not a configuration. template is a bit long but acceptable; the problem is that it does not say what a template does here. So overall env seems like a good name for such an option.