tomzo/gocd-yaml-config-plugin

Duplicate Key error when using override feature from YAML

Neko-Box-Coder opened this issue · 10 comments

Consider the following sample YAML file (Extracted from one of my projects)

format_version: 10
pipelines:
  TestPipeline:
    group: Default
    label_template: ${COUNT}
    lock_behavior: none
    display_order: -1
    materials:
      TestMaterial:
        git: https://github.com/USER/REPO.git
        branch: master
        shallow_clone: true
        auto_update: false
        destination: ""
    stages:
    - TestStage:
        fetch_materials: false
        keep_artifacts: false
        clean_workspace: true
        approval:
          type: success
          allow_only_on_success: true
        jobs:
          TestJob:
            tasks:
            - exec: &test
                run_if: passed
                command: echo
                working_directory: ""
                arguments:
                - Test Echo
            - exec: 
                <<: *test
                run_if: failed

This will return Duplicate key found 'run_if' on the line where the task is trying to override run_if

According to the YAML standard (See here), it should not duplicate the run_if key but instead override value of run_if.

This might not be the correct place to file the issue but I did see there's a test for this use case in the yamlbeans repo so I am not sure.

YAML Plugin version: 0.14.3-321

GoCD Version: 23.4.0

This plugin appears to have always disabled duplicates on the parser so it seems deliberate but I don't have the history of why.

Hmm, looks like it was done to address #3 in #17 so for now this is intentional.

Yeah, I see what you mean.
But whether it is user-intended or not, this is technically an allowed behavior in YAML, right?

The override feature comes in very handy for jobs or tasks that are mostly the same with slight differences

Maybe having a function for retrieving all the duplicated keys and allowing duplications in some areas?

It's allowed by the YAML spec, yeah. Just an opinionated behaviour of this plugin to avoid accidental misconfiguration since most of the time duplicate keys are mistakes. Overrides via merges are probably the exception here, but I don't think YamlBeans allows distinguishing the two specifically.

Theoretically speaking, a global configuration option could probably be added to the plugin, or a "per-config repo" value to allow opt-in.

I don't think I'd be comfortable changing the default behaviour of the plugin unless Yamlbeans were changed to allow finer-grained semantics (which seems unlikely, yamlbeans needed to be forked to properly release security fixes recently as the original maintainers seem to have lost ability to publish to Maven Central. Sigh.)

The alternate approach I used previously was essentially:

format_version: 10

snippets:
  exec-echo-without-run: &echo-without-run
    command: echo
    working_directory: ""
    arguments:
    - Test Echo

pipelines:
  TestPipeline:
    group: Default
    label_template: ${COUNT}
    lock_behavior: none
    display_order: -1
    materials:
      TestMaterial:
        git: https://github.com/USER/REPO.git
        branch: master
        shallow_clone: true
        auto_update: false
        destination: ""
    stages:
    - TestStage:
        fetch_materials: false
        keep_artifacts: false
        clean_workspace: true
        approval:
          type: success
          allow_only_on_success: true
        jobs:
          TestJob:
            tasks:
            - exec:
                <<: *echo-without-run
                run_if: passed
            - exec: 
                <<: *echo-without-run
                run_if: failed

Maybe having a function for retrieving all the duplicated keys and allowing duplications in some areas?

The plugin delegates to the library to parse YAML and convert to Java objects and it uses hashmaps internally so the keys get overridden on insert. Seems highly unlikely that we want to go down the rabbithole of reworking the YAML parser for this use case.

The example you gave is probably the closest thing that I can have I guess.

Maybe having a function for retrieving all the duplicated keys and allowing duplications in some areas?

I was thinking of raising this as an issue in yamlbean for them to see if they want to add a functionality of returning duplicated keys. Just like what had happened before.

I am not sure if that is out of the realm of possibility or not, since I don't know how much is it for them and you to implement it.

But either way, it will be nice to add this behavior to the readme because it took a bit of time for me to realize this is not supported and I was searching the wrong terms (YAML override).

Personally I have no interest in changing this behaviour, and changing the entire internal logic of a barely maintained library such as yamlbeans seems unrealistic to me. Even the earlier change to yaml beans was submitted by GoCD contributors.

The error message is pretty clear to me - “duplicate key found”. It shouldn’t take more than a minute to determine it’s not supported. YAML is a very complex spec with many features - there’s no reason to expect that any given usage will support all of YAML 1.1 or 1.2 just because it uses YAML. In fact, a number of YAML features are insecure by default and typically disabled despite being defined by spec.

Personally I have no interest in changing this behavior, and changing the entire internal logic of a barely maintained library such as yamlbeans seems unrealistic to me. Even the earlier change to yaml beans was submitted by GoCD contributors.

I can understand and that's fair enough.

The error message is pretty clear to me - “duplicate key found”. It shouldn’t take more than a minute to determine it’s not supported.

Yes, I agree that "duplicate key found" is pretty clear on where or what is wrong, but not why is it wrong.

It wasn't clear to me that YAML merge key was not supported since YAML anchor+merge key is quite common in other applications that uses YAML (such as docker), it was out of my assumption that it is not supported in this plugin.

I had to go through the yaml file as well as official documentation to see if I misstyped or put things in the wrong place.

Obviously, it might look clear with the example I gave, but not so much when it is a large pipeline with a lot of nested indentations.

That's why I was suggesting adding this behavior in readme.

Or maybe catch and reword the error a bit to something like "duplicate / merge key is not supported" ?

Anyway, this is up to you. It's just my opinion when using GoCD with YAML when coming from other CI/CD systems.

It seems like custom key cannot be added to the root, so it needs to be something like this for anyone wants to do similar thing.

format_version: 10

pipelines:
  TestPipeline:
    snippets:
      exec-echo-without-run: &echo-without-run
        command: echo
        working_directory: ""
        arguments:
        - Test Echo
    
    group: Default
    label_template: ${COUNT}
    lock_behavior: none
    display_order: -1
    materials:
      TestMaterial:
        git: https://github.com/USER/REPO.git
        branch: master
        shallow_clone: true
        auto_update: false
        destination: ""
    stages:
    - TestStage:
        fetch_materials: false
        keep_artifacts: false
        clean_workspace: true
        approval:
          type: success
          allow_only_on_success: true
        jobs:
          TestJob:
            tasks:
            - exec:
                <<: *echo-without-run
                run_if: passed
            - exec: 
                <<: *echo-without-run
                run_if: failed