Garrafao/WUGs

Input Options WUG Pipeline

Opened this issue · 5 comments

Which parameters can be passed to the pipeline to filter / specify the process? In which format should the parameters be presented? For which sub-processes of the pipeline are the parameters needed?

Almost all parameters are specified in the files parameters_*.sh, e.g.:

https://github.com/Garrafao/WUGs/blob/main/scripts/parameters_system2.sh

The pipeline loads the specified parameter file at the beginning. For the system pipeline, three parameters can currently be overridden with input parameters to the shell script: directory, clustering algorithm, and node positioning:

https://github.com/Garrafao/WUGs/blob/main/scripts/run_system2.sh
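
As a rough illustration of the load-then-override behaviour described above, a minimal sketch could look like the following. The helper name load_and_override, the variable names, the argument order, and the sample values are all assumptions made for illustration; the actual run_system2.sh may be organised differently.

```shell
#!/usr/bin/env bash
# Sketch only: mimics "load a parameter file, then let input parameters win".
# Variable names and values below are placeholders, not the real WUGs settings.

# Throwaway parameter file standing in for a parameters_*.sh file.
paramfile=$(mktemp)
cat > "$paramfile" <<'EOF'
dir=default_dir
algorithm=default_algorithm
position=default_position
EOF

load_and_override() {
    # $1: parameter file; $2..$4: optional overrides for dir, algorithm, position
    . "$1"
    # A non-empty argument replaces the value loaded from the file.
    dir=${2:-$dir}
    algorithm=${3:-$algorithm}
    position=${4:-$position}
}

# Override dir and position, keep the algorithm from the file.
load_and_override "$paramfile" my_data "" other_position
echo "dir=$dir algorithm=$algorithm position=$position"
rm -f "$paramfile"
```

With this pattern, anything not passed on the command line silently keeps the value from the parameter file.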

Could you specify whether all these parameters should also be changeable in DURel? If not, please specify which parameters should stay as is.

Could you change https://github.com/Garrafao/WUGs/blob/main/scripts/run_system2.sh in the following way:
There should be an additional optional parameter to the script, namely the file with the parameters. This way, I can pass a file akin to parameters_system2.sh from DURel. If no such file is passed, the default option should be the existing parameters_system2.sh.
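
The requested fallback could be sketched roughly as follows. The helper resolve_parameter_file and the example path /tmp/durel_params.sh are hypothetical; only the default path scripts/parameters_system2.sh comes from the repository links above.

```shell
#!/usr/bin/env bash
# Sketch only: resolve_parameter_file is a hypothetical helper, not part of the WUGs scripts.
default_parameterfile=scripts/parameters_system2.sh

resolve_parameter_file() {
    # $1: optional path to a parameter file; empty or missing means "use the default"
    echo "${1:-$default_parameterfile}"
}

resolve_parameter_file                       # no argument: falls back to the default
resolve_parameter_file /tmp/durel_params.sh  # explicit file passed, e.g. from DURel
```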

Could you change https://github.com/Garrafao/WUGs/blob/main/scripts/run_system2.sh in the following way: There should be an additional optional parameter to the script, namely the file with the parameters. This way, I can pass a file akin to parameters_system2.sh from DURel. If no such file is passed, the default option should be the existing parameters_system2.sh.

This was changed with the last commit. run_system2.sh now takes a mandatory input parameter for the parameter file (see the example in test.sh). Variables from this parameter file can be overridden by the other input parameters.
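
For a mandatory parameter file, a caller-side check might look roughly like this. The helper require_parameter_file and its messages are hypothetical and are not taken from run_system2.sh or test.sh.

```shell
#!/usr/bin/env bash
# Sketch only: require_parameter_file is a hypothetical helper.
require_parameter_file() {
    # $1: path to the parameter file; returns non-zero with a message if it is unusable
    if [ -z "${1:-}" ]; then
        echo "error: no parameter file given" >&2
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "error: parameter file not readable: $1" >&2
        return 1
    fi
}
```

Calling require_parameter_file "$parameterfile" before launching the pipeline would fail early instead of partway through a run.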

Could you specify whether all these parameters should also be changeable in DURel? If not, please specify which parameters should stay as is.

I will list the parameters from scripts/parameters_system2.sh that we don't need, as there are fewer of them:

annotators
modus
graphtype
isanonymize
templatepath

Please note that the remaining parameters are only a preliminary list, which will have to change iteratively over the coming weeks.
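
One way to derive the exposed parameters from a parameter file is to filter out the five names listed above. This is only a sketch: it assumes the file holds one name=value assignment per line, and the helper exposed_parameters and the sample entry some_other_parameter are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes one name=value assignment per line in the parameter file.
hidden='annotators|modus|graphtype|isanonymize|templatepath'

exposed_parameters() {
    # $1: parameter file; prints every line that does not assign a hidden parameter
    grep -Ev "^[[:space:]]*(${hidden})=" "$1"
}

# Throwaway sample file with placeholder values.
sample=$(mktemp)
cat > "$sample" <<'EOF'
annotators=placeholder
modus=placeholder
some_other_parameter=placeholder
graphtype=placeholder
EOF

exposed_parameters "$sample"
rm -f "$sample"
```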

Please note this closed issue:

#28

The data aggregation step is now separate and has a number of parameters, see here:

https://github.com/Garrafao/WUGs/blob/main/scripts/parameters_system2.sh

Currently, only these parameters can be provided to run_system2.sh:

bash -e scripts/run_system2.sh $dir $algorithm $position $parameterfile

The rest is loaded from the parameter files. One open question: since we plan to run steps such as data aggregation separately, does it make sense to add all of their parameters to the full pipeline script scripts/run_system2.sh? I think not, because if this step is run separately, we can just run the corresponding Python script scripts/data2graph.sh, which takes all the relevant parameters.