oicr-gsi/robust-paper

Command-line parameters vs configuration files

Closed this issue · 1 comments

I'm not so sure the recommendation of command-line parameters over configuration files is relevant as it is currently stated. IMHO, it depends on the software itself and cannot be generalized.

For instance, in a research field I know quite well (molecular simulation software), there are so many parameters to specify that relying on the command line would be really cumbersome. Moreover, simulations can run for days, and it would be easy to forget the exact set of parameters a run was started with.

For this latter point, your recommendation of dumping parameters together with the output of the program makes sense. However, in the early (unstable) phase of software development, this dump might not be so reliable, because features are likely to be added or removed quite often, and the corresponding parameters hardcoded or externalized. This unstable phase is also the most critical with regard to the underlying scientific reality being investigated, so its importance goes beyond software engineering, especially when tracking scientific performance for even very tiny yet significant improvements.

During this unstable phase, I would personally recommend keeping track of all input parameters separately from those that are echoed, and comparing the two for the sake of consistency. In such a situation, configuration files are much easier to handle than command-line parameters: their comparison with outputs can be automated, they are much more explicit, and they can be included in a version control system.
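To make the consistency check concrete, here is a minimal sketch of what such an automated comparison could look like. It assumes that both the configuration file and the parameters echoed by the program are available as JSON; the parameter names (`timestep_fs`, `thermostat`, `cutoff_nm`) and the `compare_parameters` helper are hypothetical and only serve as an illustration.

```python
import json


def flatten(params, prefix=""):
    """Flatten nested dicts into dotted keys so values can be compared one by one."""
    flat = {}
    for key, value in params.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def compare_parameters(requested, echoed):
    """Report differences between requested parameters and those echoed by a run."""
    requested, echoed = flatten(requested), flatten(echoed)
    shared = requested.keys() & echoed.keys()
    return {
        # Requested, but the program never reported using it.
        "missing_from_echo": sorted(requested.keys() - echoed.keys()),
        # Reported by the program, but absent from the configuration file
        # (for example a hardcoded default).
        "unexpected_in_echo": sorted(echoed.keys() - requested.keys()),
        # Present on both sides with different values: (requested, echoed).
        "changed": {
            key: (requested[key], echoed[key])
            for key in sorted(shared)
            if requested[key] != echoed[key]
        },
    }


# Hypothetical configuration file content and hypothetical echoed parameters.
requested = json.loads(
    '{"timestep_fs": 2.0, "thermostat": {"kind": "langevin", "temperature_K": 300}}'
)
echoed = json.loads(
    '{"timestep_fs": 2.0, "thermostat": {"kind": "langevin", "temperature_K": 310},'
    ' "cutoff_nm": 1.2}'
)

report = compare_parameters(requested, echoed)
print(json.dumps(report, indent=2))
```

Because the configuration file is plain text, it can be committed next to the code, and a check of this kind could run after every simulation to flag parameters that were silently hardcoded or changed.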

Consequently, I think it would be relevant to add to the current article the idea that the trade-off between passing parameters on the command line and using configuration files should be carefully considered throughout the software development process, which has distinct phases, and should also depend on how the software is used. The primary goal is to stick to the scientific reality (hence reproducibility, version tracking, and explicit configuration files), then to open up to further convenience for users as the user base grows (hence convenient parameter definition).

Having distinct yet concurrent software development phases (early investigation -> prototyping -> usable software) seems to be ignored by most of my colleagues. Since transforming prototypes into usable software is one of the key selling points of the article, I wonder if it might be worthwhile, beyond the 10 relevant and valuable rules you mention, to add a specific paragraph about these phases. If needed, I can propose one with a pull request.


Thanks for your feedback! It's been incorporated.