yaml-set: Allow array.append using index [+] or [*]
azrdev opened this issue · 4 comments
Is your feature request related to a problem? Please describe.
Having read #56 I see the complexity of modifying a nested array/hash data structure, and why it's left to yaml-merge.
However, I humbly think some simple use cases could be brought to yaml-set instead of requiring a full document to be built for a merge, e.g. appending a new element to an existing array (without knowing how many elements it already has), and possibly creating the array to append to as well.
Describe the solution you'd like
Indexing [-1] yields the last element, as expected, so it could be used for creating a new array with the given element, but not for appending to an existing one.
AIUI, indexing [*] is not yet used and could serve this purpose; alternatively, [+].
Describe alternatives you've considered
The current way is to build a full yaml document, and then yaml-merge the two. According to the docs, this has the drawback of stripping all comments and empty lines.
Additional context
yamlpath version 3.6.1
This may be a failing in the documentation. yaml-merge does not require a "full yaml document". The following is the intended experience for this specific use-case (append arbitrary elements to an Array):
$ yaml-merge --version
yaml-merge 3.6.1
$ cat lhs-array.yaml
---
an_array:
- alpha
- beta
- charlie
$ echo delta | yaml-merge --mergeat=/an_array lhs-array.yaml -
---
an_array:
- alpha
- beta
- charlie
- delta
As you can see, the new element was appended to the end of the target Array. The trick is to set --mergeat (-m) to whatever Array you wish to append to.
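For readers driving this from Python, the effect of --mergeat in the example above can be pictured with plain dicts and lists. This is only a sketch of the semantics, not yamlpath's implementation: the helper name `merge_at` and its simplistic slash-separated path handling are assumptions made for illustration.

```python
def merge_at(document, path, fragment):
    """Append `fragment` to the list found at a slash-separated `path`.

    Hypothetical helper: it only understands plain keys such as /an_array.
    """
    node = document
    for key in path.strip("/").split("/"):
        node = node[key]
    if not isinstance(node, list):
        raise TypeError(f"{path} does not point at a list")
    # A scalar fragment becomes one new element; a list contributes all of its elements.
    node.extend(fragment if isinstance(fragment, list) else [fragment])
    return document


lhs = {"an_array": ["alpha", "beta", "charlie"]}
result = merge_at(lhs, "/an_array", "delta")
print(result)
```

The caller never needs to know how many elements the target already holds, which is the property the original request was after.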
Using yaml-merge to append data or otherwise merge two documents together -- whole or fragments -- grants other benefits well beyond the capabilities of yaml-set. Consider the case of wishing for only unique data to be appended to an Array. The yaml-merge tool handles cases like this as well as vastly more complex cases:
$ yaml-merge --version
yaml-merge 3.6.1
$ cat lhs-dupes.yaml
---
another_array:
- uno
- dos
- tres
$ echo -e "- quatro\n- dos" | yaml-merge --arrays=unique --mergeat=/another_array lhs-dupes.yaml -
---
another_array:
- uno
- dos
- tres
- quatro
Note that the presence of --arrays=unique prevented the duplicate dos entry from being appended.
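The outcome shown above can be mirrored in a few lines of plain Python. This is a sketch that reproduces only this example's result; `append_unique` is a hypothetical helper, and the real --arrays=unique mode may treat other cases (such as duplicates already present in the left-hand document) differently.

```python
def append_unique(target, additions):
    """Return a new list: `target` followed by any additions not already present."""
    merged = list(target)
    for item in additions:
        if item not in merged:
            merged.append(item)
    return merged


another_array = ["uno", "dos", "tres"]
merged = append_unique(another_array, ["quatro", "dos"])
print(merged)  # ['uno', 'dos', 'tres', 'quatro']
```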
There is another crucial reason to consider all document append operations as a "merge" rather than a "set": YAML supports Anchors and Aliases, which must be handled very carefully so as to avoid document corruption. This is very complex and is one of the core reasons why yaml-merge exists apart from the trivial yaml-set tool. In fact, yaml-merge can handle Anchors and Aliases as inputs directly from the command-line. Consider the following:
$ yaml-merge --version
yaml-merge 3.6.1
$ cat lhs-complex.yaml
---
aliases:
- &a_value alpha
complex_array_1:
- *a_value
complex_array_2:
- ichi
- *a_value
- ni
$ echo "&b_value bravo" | yaml-merge --mergeat='/*' lhs-complex.yaml -
---
aliases:
- &a_value alpha
- &b_value bravo
complex_array_1:
- *a_value
- *b_value
complex_array_2:
- ichi
- *a_value
- ni
- *b_value
In this contrived example, the user practices "One Version of the Truth" and needed to add the same scalar value to multiple Arrays at the same time. In this case, an Anchor is defined and applied to multiple target Arrays via its Aliases (with a single command). Beyond this, any change to /aliases[&b_value] is automatically applied everywhere "*b_value" exists.
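The "One Version of the Truth" idea can be illustrated without any YAML tooling by letting several lists hold a reference to one shared object. The `Anchor` class and `resolve` function below are hypothetical stand-ins chosen for this sketch; they are not how ruamel.yaml or yamlpath represent Anchors and Aliases.

```python
class Anchor:
    """Hypothetical stand-in for a YAML Anchor: one shared, mutable value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


def resolve(array):
    """Replace every Anchor reference (an 'Alias') with its current value."""
    return [item.value if isinstance(item, Anchor) else item for item in array]


a_value = Anchor("a_value", "alpha")
b_value = Anchor("b_value", "bravo")
complex_array_1 = [a_value, b_value]
complex_array_2 = ["ichi", a_value, "ni", b_value]

# Changing the anchored value once is reflected wherever it is referenced.
b_value.value = "BRAVO"
resolved_1 = resolve(complex_array_1)
resolved_2 = resolve(complex_array_2)
print(resolved_1)  # ['alpha', 'BRAVO']
print(resolved_2)  # ['ichi', 'alpha', 'ni', 'BRAVO']
```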
The yaml-set tool exists to trivially change the value of pre-existing data elements. While it is capable of generating just enough document structure to create a novel value at a previously non-existent YAML Path, it is not intended to handle the complexities of careful document merging.
In another light, adding a new YAML Path segment like [+] or [*] must be considered in the greater context of what such a segment would mean to other use-cases for YAML Paths. What should the yaml-get command do when it receives a YAML Path like "some.path.to[+]"?
I hope this information helps solve your use-case to your satisfaction. If, however, you still strongly feel that yaml-set should allow the creation of arbitrary elements at the end of Arrays, I'd like to hear your additional thoughts.
Thanks for your extensive and quick reply!
The --mergeat trick is nice, and makes this workaround more feasible.
I'm mostly worried about a merge stripping all comments and (whitespace) formatting, since that round-trip capability is one of the selling points of yamlpath.
My use case is a Python application which modifies an Ansible inventory (host_vars/$host.yml) and adds any number of variables specified on the command line. By utilizing yamlpath as a library I just pass the varname/path to set_value and have it do all the heavy lifting -- except for appending to arrays.
> In another light, adding a new YAML Path segment like [+] or [*] must be considered in the greater context of what such a segment would mean to other use-cases for YAML Paths. What should the yaml-get command do when it receives a YAML Path like "some.path.to[+]"?
Indeed, this would open a class of YAML Paths which are only valid for modification, not for querying, so they would need to raise an exception if used in a query. I'd understand if you were reluctant to add that possibility to yamlpath.
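What a modification-only segment would imply can be sketched with a toy path walker over plain Python data. Everything here is hypothetical: `[+]` is not a YAML Path segment, and `get_at`, `set_at`, and `WriteOnlySegmentError` are names invented for this illustration rather than yamlpath API.

```python
APPEND = "[+]"  # hypothetical write-only segment; not part of YAML Path


class WriteOnlySegmentError(ValueError):
    """Raised when the append segment is used where it has no meaning."""


def get_at(data, segments):
    """Query: the append segment has nothing to return, so reject it."""
    node = data
    for seg in segments:
        if seg == APPEND:
            raise WriteOnlySegmentError("[+] cannot be used in a query")
        node = node[seg]
    return node


def set_at(data, segments, value):
    """Modify: a trailing append segment adds `value` to the end of the list."""
    *parents, last = segments
    node = get_at(data, parents)  # [+] is only meaningful as the final segment
    if last == APPEND:
        node.append(value)
    else:
        node[last] = value


data = {"some": {"path": {"to": ["a"]}}}
set_at(data, ["some", "path", "to", APPEND], "b")
print(data["some"]["path"]["to"])  # ['a', 'b']
```

The asymmetry is visible in the code itself: the same path is accepted by `set_at` but rejected by `get_at`, which is the extra class of paths discussed above.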
> Thanks for your extensive and quick reply!
> The --mergeat trick is nice, and makes this workaround more feasible.
I wouldn't think of this as a "workaround"; it is by deliberate design, necessitated by the inherent complexities of YAML's Anchor/Alias and Merge Key features. Whereas yaml-set is designed for atomic, trivial, scalar operations, yaml-merge is vastly more robust. To wit, yaml-merge can be used instead of yaml-set in most use-cases, though its relatively greater capabilities come with a burden of more granular configuration.
> I'm mostly worried about a merge stripping all comments and (whitespace) formatting, since that round-trip capability is one of the selling points of yamlpath.
You may have missed the --preserve-lhs-comments (-l) option to yaml-merge, or the Boolean preserve_lhs_comments property on the args object you can pass to MergerConfig. It is briefly discussed in the documentation and preserves all original comments in the left-most document. I still discard all right-hand-side comments due to comment-handling limitations of ruamel.yaml, which yamlpath is based upon. Let me know if you need some sample code to set this up.
> My use case is a Python application which modifies an Ansible inventory (host_vars/$host.yml) and adds any number of variables specified on the command line. By utilizing yamlpath as a library I just pass the varname/path to set_value and have it do all the heavy lifting -- except for appending to arrays.
I have received user stories from people using this project in everything from Ansible to CloudFormation to Cloudify, Puppet, and others. I'm very happy this project is useful to so many people, including you. To this end, I enjoy discussions such as this, ever expanding and refining the usefulness of this project.
> > In another light, adding a new YAML Path segment like [+] or [*] must be considered in the greater context of what such a segment would mean to other use-cases for YAML Paths. What should the yaml-get command do when it receives a YAML Path like "some.path.to[+]"?
>
> Indeed, this would open a class of YAML Paths which are only valid for modification, not for querying, so they would need to raise an exception if used in a query. I'd understand if you were reluctant to add that possibility to yamlpath.
I'm no fan of adding a formal YAML Path segment which is only useful to one particular use-case. Everywhere possible, I try hard to only add segments which are applicable to all get/set/merge/delete operations.
I'm closing this issue as resolved by way of illustrating the by-design solution to this need.