Duplicate Key error when using override feature from YAML
Neko-Box-Coder opened this issue · 10 comments
Consider the following sample YAML file (Extracted from one of my projects)
format_version: 10
pipelines:
TestPipeline:
group: Default
label_template: ${COUNT}
lock_behavior: none
display_order: -1
materials:
TestMaterial:
git: https://github.com/USER/REPO.git
branch: master
shallow_clone: true
auto_update: false
destination: ""
stages:
- TestStage:
fetch_materials: false
keep_artifacts: false
clean_workspace: true
approval:
type: success
allow_only_on_success: true
jobs:
TestJob:
tasks:
- exec: &test
run_if: passed
command: echo
working_directory: ""
arguments:
- Test Echo
- exec:
<<: *test
run_if: failed
This will return Duplicate key found 'run_if'
on the line where the task is trying to override run_if
According to the YAML standard (See here), it should not duplicate the run_if
key but instead override value of run_if
.
This might not be the correct place to file the issue but I did see there's a test for this use case in the yamlbeans repo so I am not sure.
YAML Plugin version: 0.14.3-321
GoCD Version: 23.4.0
This plugin appears to have always disabled duplicates on the parser so it seems deliberate but I don't have the history of why.
Yeah, I see what you mean.
But whether it is user-intended or not, this is technically an allowed behavior in YAML, right?
The override feature comes in very handy for jobs or tasks that are mostly the same with slight differences
Maybe having a function for retrieving all the duplicated keys and allowing duplications in some areas?
It's allowed by the YAML spec, yeah. Just an opinionated behaviour of this plugin to avoid accidental misconfiguration since most of the time duplicate keys are mistakes. Overrides via merges are probably the exception here, but I don't think YamlBeans allows distinguishing the two specifically.
Theoretically speaking, a global configuration option could probably be added to the plugin, or a "per-config repo" value to allow opt-in.
I don't think I'd be comfortable changing the default behaviour of the plugin unless Yamlbeans were changed to allow finer-grained semantics (which seems unlikely, yamlbeans needed to be forked to properly release security fixes recently as the original maintainers seem to have lost ability to publish to Maven Central. Sigh.)
The alternate approach I used previously was essentially:
format_version: 10
snippets:
exec-echo-without-run: &echo-without-run
command: echo
working_directory: ""
arguments:
- Test Echo
pipelines:
TestPipeline:
group: Default
label_template: ${COUNT}
lock_behavior: none
display_order: -1
materials:
TestMaterial:
git: https://github.com/USER/REPO.git
branch: master
shallow_clone: true
auto_update: false
destination: ""
stages:
- TestStage:
fetch_materials: false
keep_artifacts: false
clean_workspace: true
approval:
type: success
allow_only_on_success: true
jobs:
TestJob:
tasks:
- exec:
<<: *echo-without-run
run_if: passed
- exec:
<<: *echo-without-run
run_if: failed
Maybe having a function for retrieving all the duplicated keys and allowing duplications in some areas?
The plugin delegates to the library to parse YAML and convert to Java objects and it uses hashmaps internally so the keys get overridden on insert. Seems highly unlikely that we want to go down the rabbithole of reworking the YAML parser for this use case.
The example you gave is probably the closest thing that I can have I guess.
Maybe having a function for retrieving all the duplicated keys and allowing duplications in some areas?
I was thinking of raising this as an issue in yamlbean for them to see if they want to add a functionality of returning duplicated keys. Just like what had happened before.
I am not sure if that is out of the realm of possibility or not, since I don't know how much is it for them and you to implement it.
But either way, it will be nice to add this behavior to the readme because it took a bit of time for me to realize this is not supported and I was searching the wrong terms (YAML override).
Personally I have no interest in changing this behaviour, and changing the entire internal logic of a barely maintained library such as yamlbeans seems unrealistic to me. Even the earlier change to yaml beans was submitted by GoCD contributors.
The error message is pretty clear to me - “duplicate key found”. It shouldn’t take more than a minute to determine it’s not supported. YAML is a very complex spec with many features - there’s no reason to expect that any given usage will support all of YAML 1.1 or 1.2 just because it uses YAML. In fact, a number of YAML features are insecure by default and typically disabled despite being defined by spec.
Personally I have no interest in changing this behavior, and changing the entire internal logic of a barely maintained library such as yamlbeans seems unrealistic to me. Even the earlier change to yaml beans was submitted by GoCD contributors.
I can understand and that's fair enough.
The error message is pretty clear to me - “duplicate key found”. It shouldn’t take more than a minute to determine it’s not supported.
Yes, I agree that "duplicate key found" is pretty clear on where or what is wrong, but not why is it wrong.
It wasn't clear to me that YAML merge key was not supported since YAML anchor+merge key is quite common in other applications that uses YAML (such as docker), it was out of my assumption that it is not supported in this plugin.
I had to go through the yaml file as well as official documentation to see if I misstyped or put things in the wrong place.
Obviously, it might look clear with the example I gave, but not so much when it is a large pipeline with a lot of nested indentations.
That's why I was suggesting adding this behavior in readme.
Or maybe catch and reword the error a bit to something like "duplicate / merge key is not supported" ?
Anyway, this is up to you. It's just my opinion when using GoCD with YAML when coming from other CI/CD systems.
It seems like custom key cannot be added to the root, so it needs to be something like this for anyone wants to do similar thing.
format_version: 10
pipelines:
TestPipeline:
snippets:
exec-echo-without-run: &echo-without-run
command: echo
working_directory: ""
arguments:
- Test Echo
group: Default
label_template: ${COUNT}
lock_behavior: none
display_order: -1
materials:
TestMaterial:
git: https://github.com/USER/REPO.git
branch: master
shallow_clone: true
auto_update: false
destination: ""
stages:
- TestStage:
fetch_materials: false
keep_artifacts: false
clean_workspace: true
approval:
type: success
allow_only_on_success: true
jobs:
TestJob:
tasks:
- exec:
<<: *echo-without-run
run_if: passed
- exec:
<<: *echo-without-run
run_if: failed