Deprecating cwlgen and migrating to CWL-Utils
illusional opened this issue · 4 comments
There have been advancements in schema-salad for generating Python classes that more directly mirror the functionality this project provides. As a consequence, it's probably worth investigating migrating some of the generator tests to the new repo and deprecating this project.
I've used the grammar from common-workflow-language/cwl-utils#5 for these initial tests. From migrating some of the cwlgen tests, I've noticed some differences:
- Identifiers, and the fields derived from them, are now based on the file path and so differ (for the test `test_unit_import_workflow`):
  - Current cwlgen identifier = `1stWorkflow`
  - New cwl-utils identifier = `file:///path/to/import_workflow.cwl#1stWorkflow`
  - This affects identifier-derived fields, including `format`, `source`, and `run`.
- Some method similarities (cwlgen → cwl-utils):
  - `get_dict()` → `save()`
  - `parse_cwl(cwlfile)` → `load_document(cwlfile)`
  - `parse_dict` → No super clear analogue, but loaded through `_RecordLoader(CommandLineTool)` || `_UnionLoader((CommandLineToolLoader, ...workflow + other loaders))`
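For migrating tests that compare identifiers, one option is to reduce the new file-based identifiers back to their short form before asserting. The following is only a sketch using the standard library; `short_id` is a hypothetical helper name, not part of cwlgen or cwl-utils.

```python
from urllib.parse import urldefrag


def short_id(identifier: str) -> str:
    """Return the part after '#', or the identifier unchanged if it has no fragment."""
    _, fragment = urldefrag(identifier)
    return fragment or identifier


# The cwl-utils style identifier collapses to the old cwlgen style one.
print(short_id("file:///path/to/import_workflow.cwl#1stWorkflow"))
# An identifier that is already short is left alone.
print(short_id("1stWorkflow"))
```

The same normalisation could be applied to the identifier-derived fields (`format`, `source`, `run`) if a test only cares about the short name.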
Additionally, to avoid conflicting with Python keywords we made some specific keyword changes when creating cwlgen. Here's how at least some of them might be mapped back on the initialiser:
- `tool_id` | `workflow_id` | `input_id` | etc → `id`
- `StepInput`: `inputs` → `in_`
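If you have a lot of call sites, the renames above could be applied mechanically. This is a rough sketch: `COMMON_RENAMES`, `PER_CLASS_RENAMES` and `rename_kwargs` are hypothetical names introduced for illustration, and the tables only contain the renames listed in this issue, so they would need extending for a real migration.

```python
from typing import Any, Dict

# Renames that apply regardless of class (taken from the list above).
COMMON_RENAMES: Dict[str, str] = {
    "tool_id": "id",
    "workflow_id": "id",
    "input_id": "id",
}

# Renames that only apply to one class (taken from the list above).
PER_CLASS_RENAMES: Dict[str, Dict[str, str]] = {
    "StepInput": {"inputs": "in_"},
}


def rename_kwargs(class_name: str, kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs with cwlgen-style names replaced by the new names."""
    renames = {**COMMON_RENAMES, **PER_CLASS_RENAMES.get(class_name, {})}
    renamed: Dict[str, Any] = {}
    for key, value in kwargs.items():
        new_key = renames.get(key, key)
        if new_key in renamed:
            raise ValueError(f"two arguments both map to {new_key!r}")
        renamed[new_key] = value
    return renamed


print(rename_kwargs("CommandLineTool", {"tool_id": "echo", "inputs": []}))
print(rename_kwargs("StepInput", {"inputs": ["a"]}))
```

Keeping the per-class table separate avoids renaming `inputs` on classes where that name is unchanged.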
Other little differences (more investigation required):
- Optional fields don't default to `None` on `__init__`, which means you'd always have to include them. Refer to common-workflow-language/schema_salad#262 for more information.
  - Update 2020-06-01 - This has been resolved and will be released in cwl-utils 0.4
- After `save()`, requirements is a `List[CommentedMap] == ordereddict[]`, which I don't know how to unit test.
  - Update 2020-06-01 - Most dict comparisons just work.
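On the unit-testing point, a small sketch of why dict comparisons tend to just work, plus a fallback if you want plain types anyway. It assumes `CommentedMap` behaves like an ordered mapping; `collections.OrderedDict` stands in for it here so the example needs no third-party imports, and `to_plain` is a hypothetical helper.

```python
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any


def to_plain(value: Any) -> Any:
    """Recursively convert mappings to dict and lists/tuples to list."""
    if isinstance(value, Mapping):
        return {key: to_plain(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(item) for item in value]
    return value


# OrderedDict is a stand-in for CommentedMap (assumption, see above).
saved = {"requirements": [OrderedDict([("class", "InlineJavascriptRequirement")])]}
expected = {"requirements": [{"class": "InlineJavascriptRequirement"}]}

# Comparing an ordered mapping with a plain dict ignores the concrete type.
assert saved == expected
# Converting first gives plain dicts, which also makes failure diffs easier to read.
assert to_plain(saved) == expected
```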
The big difference is that any issues can be solved by regenerating the file based on the schema-salad definition of CWL, so it should always be right. When I get a chance, and once optional init fields are in schema-salad, I'd be interested in migrating a real-world project to using it.
FYI @mr-c
So we're still waiting for schema_salad to support optional args before `cwl-utils` is particularly usable?
@TMiguelT It is usable now, just annoying to have to specify `None` everywhere. See common-workflow-language/schema_salad#262 for more details.
Dear all,
I've fixed the related issues, and now `cwl-utils` is the recommended approach for generating (and parsing) CWL. It's generated right from the spec, so you can have confidence it's right and has support for every version of CWL.
My notes above will guide you on the migration path, but a few notes:
- The parameters are named with `camelCase` and not `snake_case` like we've generally used.
- Take care if you're migrating to a newer spec, as some classes might have changed names (notably: `InputParameter` -> `WorkflowInputParameter`).
- Don't forget to catch all references of cwlgen, as missing one will cause:

      raise RepresenterError('cannot represent an object: %s' % (data,))
      ruamel.yaml.representer.RepresenterError: cannot represent an object: <cwlgen.common.CommandInputArraySchema object at 0x1100a5780>

- None of my tests failed because of the `List[CommentedMap] == ordereddict`; I do use `self.assertDictEqual`.
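To track down a missed cwlgen reference before it surfaces as a `RepresenterError`, you could walk the structure you are about to dump and report anything whose class still lives in the `cwlgen` package. This is a sketch only: `find_leftovers` is a hypothetical helper, and the `CommandInputArraySchema` class below is a stand-in that pretends to come from cwlgen so the example runs without it installed.

```python
from typing import Any, Iterator


def find_leftovers(value: Any, package: str = "cwlgen", path: str = "$") -> Iterator[str]:
    """Yield the location of every object whose class is defined in `package`."""
    module = type(value).__module__
    if module == package or module.startswith(package + "."):
        yield f"{path} ({type(value).__name__})"
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from find_leftovers(item, package, f"{path}.{key}")
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            yield from find_leftovers(item, package, f"{path}[{index}]")


class CommandInputArraySchema:
    """Stand-in that pretends to be defined in cwlgen, for demonstration only."""

    __module__ = "cwlgen.common"


doc = {"inputs": [{"id": "reads", "type": CommandInputArraySchema()}]}
print(list(find_leftovers(doc)))
```

Running a check like this in a test over the output of `save()` gives the path to the offending object, which the `RepresenterError` message alone does not.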
I migrated my project Janis, which heavily relied on this library, and it only took a few hours: https://github.com/PMCC-BioinformaticsCore/janis-core/pull/21/files. If you have questions, feel free to tag me in this issue or send me a message. Gitter is the best place for this.
Also, come along this Tuesday to the CWL call where I'll talk about the changes I've made and we'll discuss cwlgen and what will happen to it.
I marked this issue for closing, but reopening as a placeholder.