miriamkw/GluPredKit

Suggestion: File path arguments for `generate_config` to accept only file names

Closed this issue · 1 comments

While following the usage guide at the generate-model-training-configuration step, I ran into a minor issue with the input path requirements of the `glupredkit generate_config` CLI command.

The preferred usage of the CLI is to navigate to the working directory containing the data folder structure, so that all file paths are resolved automatically. However, some users may provide the full path, which leads to an error message that is not very clear:

Example:

glupredkit generate_config --file-name data/configurations/synthetic_configs --data data/raw/synthetic_data_with_datetime.csv --prediction-horizon 5 --num-lagged-features 12 --num-features CGM,insulin,carbs --cat-features is_test

Error:

ValueError: Data file 'data/raw/synthetic_data_with_datetime.csv' not found in 'data/raw/' folder.

The error message may confuse users because it implies that the tool cannot find synthetic_data_with_datetime.csv in data/raw, even though the file is there.

One suggestion is to validate the `--file-name` and `--data` arguments so that they contain only file names and no directory paths, as that would directly inform the user of the correct usage. I'd also suggest including default or example values for each argument in the output of `glupredkit generate_config --help`.
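To make the validation idea concrete, here is a minimal sketch using only the Python standard library. The helper name `require_bare_file_name` and the wording of the error message are hypothetical and not taken from GluPredKit's code; the sketch only illustrates rejecting any value that carries a directory component.

```python
from pathlib import PurePath


def require_bare_file_name(value: str, option: str) -> str:
    """Return `value` unchanged if it is a bare file name, else raise ValueError.

    Hypothetical helper for illustration; not part of GluPredKit.
    """
    # PurePath(value).name is the final path component. If it differs from
    # the input, the input contained a directory part (or was empty).
    if not value or PurePath(value).name != value:
        raise ValueError(
            f"{option} expects a file name only, without directories "
            f"(got '{value}')."
        )
    return value


# A bare file name passes through unchanged.
require_bare_file_name("synthetic_data_with_datetime.csv", "--data")

# A value with directories is rejected with a message that names the problem.
try:
    require_bare_file_name("data/raw/synthetic_data_with_datetime.csv", "--data")
except ValueError as err:
    print(err)
```

With a check like this, the user would be told that the argument format is wrong, rather than that a file that exists could not be found.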

JOSS review: openjournals/joss-reviews#6904

Good point, thank you!

I solved this in the following way:

  • Added examples to the "help" statements in the CLI
  • Made the commands tolerant of included paths (the pro of this solution is that people don't have to repeat their command if they do it "wrong"; the con is that users might add the full path unnecessarily)
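
For readers curious what "flexible towards included paths" could look like, the following is a rough sketch and not the actual GluPredKit implementation. The function name `strip_directories` is hypothetical; it assumes the tool keeps only the final path component and then resolves the file against its own folder structure (`data/raw/` for data files, as in the error message above).

```python
from pathlib import PurePath


def strip_directories(value: str) -> str:
    """Keep only the final path component of a user-supplied argument.

    Hypothetical helper for illustration; not part of GluPredKit.
    """
    return PurePath(value).name


# Both spellings of the argument end up as the same file name.
data_name = strip_directories("data/raw/synthetic_data_with_datetime.csv")
same_name = strip_directories("synthetic_data_with_datetime.csv")
print(data_name == same_name)  # True

# The tool can then build the expected location itself (assumed layout).
resolved = PurePath("data") / "raw" / data_name
print(resolved.name)  # synthetic_data_with_datetime.csv
```

The trade-off matches the one noted in the list: any directory part the user types is silently ignored, so a path pointing outside the expected folder would still be looked up inside it.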