Input Options WUG Pipeline

Question

Opened this issue a year ago · 5 comments

Which parameters can be passed to the pipeline to filter or configure the process? In which format should the parameters be given? Which sub-processes of the pipeline need which parameters?

Answer 1 · 2023-07-12T07:56:11.000Z

Almost all parameters are specified in the files parameters_*.sh, e.g.:

https://github.com/Garrafao/WUGs/blob/main/scripts/parameters_system2.sh

The pipeline loads the specified parameter file at the beginning. For the system pipeline, three parameters can currently be overridden by arguments to the shell script: directory, clustering algorithm, and node positioning:

https://github.com/Garrafao/WUGs/blob/main/scripts/run_system2.sh
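
To illustrate the load-then-override pattern described above, here is a minimal sketch. It is not copied from run_system2.sh: the function name load_parameters, the variable names dir, algorithm and position, and all values are assumptions made for the example.

```shell
# Sketch only: names and values are assumptions, not taken from run_system2.sh.

load_parameters() {
    # 1. Load every default from the parameter file ($1).
    . "$1"
    # 2. Let non-empty arguments override the loaded values.
    if [ -n "${2:-}" ]; then dir=$2; fi
    if [ -n "${3:-}" ]; then algorithm=$3; fi
    if [ -n "${4:-}" ]; then position=$4; fi
}

# Demo with a throwaway parameter file (all values are made up).
demo_file=$(mktemp)
printf '%s\n' 'dir=data/default' \
              'algorithm=default_algorithm' \
              'position=default_position' > "$demo_file"

# Override the directory and the node positioning, keep the default algorithm.
load_parameters "$demo_file" data/my_project "" my_position
echo "dir=$dir algorithm=$algorithm position=$position"
rm -f "$demo_file"
```

Because the file is loaded first and the arguments are applied afterwards, an argument always wins over the value in the file, and an empty argument leaves the file's value untouched.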

Answer 2 · 2023-07-25T09:12:30.000Z

Could you specify whether all these parameters should also be changeable in DURel? If not, please specify which parameters should stay as they are.

Could you change https://github.com/Garrafao/WUGs/blob/main/scripts/run_system2.sh in the following way:
There should be an additional optional parameter to the script, namely the file with the parameters. This way, I can pass a file akin to parameters_system2.sh from DURel. If no such file is passed, the default option should be the existing parameters_system2.sh.
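
One possible way to get this fallback behaviour is the shell's default-value expansion. The sketch below is only an illustration of the request: the helper name resolve_parameterfile is hypothetical, and the default path assumes the script is started from the repository root.

```shell
# Hypothetical helper: prints the parameter file that should be loaded.
resolve_parameterfile() {
    # ${1:-default} expands to the default when $1 is unset or empty.
    printf '%s\n' "${1:-scripts/parameters_system2.sh}"
}

# No file passed: fall back to the existing parameter file.
resolve_parameterfile
# File passed (made-up name): use it instead.
resolve_parameterfile my_durel_parameters.sh
```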

Answer 3 · 2023-08-09T08:50:21.000Z

Could you change https://github.com/Garrafao/WUGs/blob/main/scripts/run_system2.sh in the following way: There should be an additional optional parameter to the script, namely the file with the parameters. This way, I can pass a file akin to parameters_system2.sh from DURel. If no such file is passed, the default option should be the existing parameters_system2.sh.

This was changed with the last commit. run_system2.sh now takes an obligatory input parameter for the parameter file (see example in test.sh). Variables from this parameter file can be overridden by the other input parameters.
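
Since the parameter file is now required, a caller such as DURel may want to check it before starting the pipeline. The following is a hedged sketch of such a check, not the logic used in run_system2.sh; the function name require_parameterfile and the messages are assumptions.

```shell
# Hypothetical pre-flight check for the required parameter file.
require_parameterfile() {
    if [ -z "${1:-}" ]; then
        echo "error: no parameter file given" >&2
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "error: cannot read parameter file: $1" >&2
        return 1
    fi
}

# Demo: an empty argument is rejected before anything else runs.
if ! require_parameterfile "" 2>/dev/null; then
    echo "missing parameter file rejected"
fi
```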

Answer 4 · 2023-08-09T18:42:12.000Z

Could you specify whether all these parameters should also be changeable in DURel? If not, please specify which parameters should stay as they are.

I will list the parameters from scripts/parameters_system2.sh that we don't need, as there are fewer of them:

annotators
modus
graphtype
isanonymize
templatepath

Please note that the remaining parameters are only a preliminary list, which will have to change iteratively over the coming weeks.
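
If DURel generates a parameter file from user input, one way to keep the five parameters listed above fixed would be to drop them from that file before it is passed on. The sketch below assumes that each parameter sits on its own line in name=value form; the function name filter_user_parameters and the demo values are made up.

```shell
# Assumption: one name=value assignment per line in the parameter file.
filter_user_parameters() {
    # Remove assignments to the parameters that should stay as they are.
    grep -v -E '^[[:space:]]*(annotators|modus|graphtype|isanonymize|templatepath)=' "$1"
}

# Demo with a throwaway file (all values are made up).
demo_file=$(mktemp)
printf '%s\n' 'modus=made_up_value' \
              'algorithm=made_up_algorithm' \
              'isanonymize=made_up_flag' > "$demo_file"

filtered=$(filter_user_parameters "$demo_file")
echo "$filtered"
rm -f "$demo_file"
```

Only the algorithm line survives in the demo; the two excluded assignments are removed. As the remaining list is preliminary, the pattern would need to be updated whenever that list changes.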

Answer 5 · 2023-08-09T18:48:16.000Z

Please note this closed issue:

#28

The data aggregation step is now separate and has a number of parameters, see here:

https://github.com/Garrafao/WUGs/blob/main/scripts/parameters_system2.sh

Currently, only these parameters can be provided to run_system2.sh:

bash -e scripts/run_system2.sh $dir $algorithm $position $parameterfile

The rest is loaded from the parameter files. One open question: since we plan to run e.g. the data aggregation step separately, does it make sense to add all its parameters to the full pipeline script scripts/run_system2.sh? I think not, because if this step is run separately, we can just run the corresponding script scripts/data2graph.sh, which takes all the relevant parameters.