AGC run fails when empty strings in inputs.json file
biofilos opened this issue · 4 comments
Describe the Bug
If the inputs.json of a workflow contains empty strings, for example:
{
"wf.something": 1,
"wf.thisWillCrash": ""
}
then agc workflow run fails, complaining that the location of the inputs.json file is a directory. For some reason, the command strips the file name from the inputs.json path and continues processing the containing directory as if it were the inputs.json.
This bug should be fixed, or the behavior should at least be explained in the documentation. In the meantime, a workaround is to avoid empty strings in the inputs.json: either use empty strings as default values in the WDL file itself, or use non-empty values in the inputs.json and handle them with logic in the WDL file.
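Until a fix is available, a small pre-flight check can flag the values that trigger this failure before calling agc. The following is only a sketch in Python: the function name `find_empty_strings` is made up for illustration and is not part of AGC, and the sample document is the one from this report.

```python
import json


def find_empty_strings(value, path=()):
    """Return the key paths of all empty-string values in parsed JSON data."""
    if value == "":
        return [" -> ".join(path)]
    if isinstance(value, dict):
        return [hit
                for key, child in value.items()
                for hit in find_empty_strings(child, path + (str(key),))]
    if isinstance(value, list):
        return [hit
                for index, child in enumerate(value)
                for hit in find_empty_strings(child, path + (f"[{index}]",))]
    # Numbers, booleans, null and non-empty strings are fine.
    return []


# The example inputs from this report.
sample = json.loads('{"wf.something": 1, "wf.thisWillCrash": ""}')
print(find_empty_strings(sample))  # ['wf.thisWillCrash']
```

To check a real file, load it with `json.load(open(path))` and refuse to submit the workflow if the returned list is not empty.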
Steps to Reproduce
- Include an empty string as a value in input.json
- Run
agc workflow run wf_name --inputsFile input.json
Relevant Logs
When running the workflow with empty strings in input.json
2022-10-27T17:06:29+08:00 𝒊 Running workflow. Workflow name: 'variants', InputsFile: 'inputs/variants.inputs.json', OptionFile: '', Context: 'ctx1'
2022-10-27T17:06:30+08:00 ✘ error="unable to run workflow: unable to sync s3://agc-xxxxxx-ap-xxx-1/project/testProject/userid/userSDFHG/data: upload multipart failed, upload id: uPenv1AiRi0I55mR1X48ppYXYmHNy.t_0uWPDkIC62xkPlHzHGchJvBcA9Vh7isnJqpLZvw6.N8XC2OHQ2_zXtfQc1cuxc_0_BE2zgNQ_0f2u1dkkUMm_czBd86pZPDQ9gF_USm7D69KGdNgO6GUtg--, cause: operation error S3: UploadPart, failed to compute payload hash: failed to compute payload hash, read /home/user/Documents/projects/scalable-workflows/aws/inputs/: is a directory"
Error: an error occurred invoking 'workflow run'
with variables: {WorkflowName:variants Arguments:inputs/variants.inputs.json OptionFile: ContextName:ctx1}
caused by: unable to run workflow: unable to sync s3://agc-662002918436-ap-southeast-1/project/testProject/userid/jfortiz4vr6sV/data: upload multipart failed, upload id: uPenv1AiRi0I55mR1X48ppYXYmHNy.t_0uWPDkIC62xkPlHzHGchJvBcA9Vh7isnJqpLZvw6.N8XC2OHQ2_zXtfQc1cuxc_0_BE2zgNQ_0f2u1dkkUMm_czBd86pZPDQ9gF_USm7D69KGdNgO6GUtg--, cause: operation error S3: UploadPart, failed to compute payload hash: failed to compute payload hash, read /home/user/Documents/projects/scalable-workflows/aws/inputs/: is a directory
Expected Behavior
Workflow runs
Actual Behavior
AGC fails, claiming that it cannot compute the payload hash of the directory that contains the inputs.json file
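One possible explanation, which is a guess and not confirmed anywhere in this report, is that string values in the inputs file are treated as candidate paths relative to the directory of the inputs file. Joining a directory with an empty string yields the directory itself with a trailing separator, which would match the `inputs/: is a directory` message in the logs. The Python sketch below only illustrates that path behavior; `resolve_candidate` and `sample.bam` are hypothetical and do not come from the AGC source.

```python
import posixpath

# Path taken from the log output above.
inputs_file = "inputs/variants.inputs.json"
inputs_dir = posixpath.dirname(inputs_file)


def resolve_candidate(value):
    """Hypothetical: resolve a string input relative to the inputs file directory."""
    return posixpath.join(inputs_dir, value)


print(resolve_candidate("sample.bam"))  # inputs/sample.bam
print(resolve_candidate(""))            # inputs/
```

If something like this is happening, skipping empty strings before path resolution would avoid the upload attempt on the directory.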
Screenshots
Additional Context
Operating System: Ubuntu 22.10
AGC Version: 1.5.1
Was AGC setup with a custom bucket: No
Was AGC setup with a custom VPC: No
Confirmed the bug by adding a "wf.thisWillCrash": "" line to this inputs.json.
We will add fixing this to our backlog.
We could certainly provide a more informative error message.
Is there a useful reason to provide an empty input? Wouldn't it be better to make the input optional in the WDL?
Yes.
We rely on annotation from our users, and we validate all input JSON files, so all fields for a specific workflow should be present. Adding that level of validation inside the WDL itself just increases the complexity of the workflow unnecessarily.
If this bug cannot be fixed, it would be good to document that empty strings are not allowed in the inputs.json file, so we can act accordingly.
It should be fixable. Was mainly wondering if we should allow empty values or just emit a better error. Seems like there's a case for empty values so we should allow them.