sbcgua/ajson

Set_string does not keep_item_order

albertmink opened this issue · 12 comments

We found that set_string calls set the order of the affected node always to 0. We expected that the set_string does not touch the order of the nodes.

Before the set_string call in line 34 the order is 2
image

and after it is set to 0?
image

Code snippet to reproduce:

TYPES:
  BEGIN OF ty_header_60_src,
    description           TYPE string,
    original_language     TYPE string,
    abap_language_version TYPE string,
  END OF ty_header_60_src,

  BEGIN OF ty_main,
    header TYPE ty_header_60_src,
  END OF ty_main.

DATA:
  ajson TYPE REF TO zcl_ajson,
  data  TYPE ty_main.

data = VALUE #( header = VALUE #( description = 'description'
                                  abap_language_version = 'standard'
                                  original_language = 'E' ) ).

ajson = zcl_ajson=>create_empty( zcl_ajson_mapping=>create_camel_case( iv_first_json_upper = abap_false ) ).
ajson->keep_item_order( ).
ajson->set(
  iv_path = '/'
  iv_val  = data ).


ajson->set_string( iv_path = `/header/originalLanguage`  iv_val = 'en' ).

Yes, looks like a bug. Though not very trivial. And, imho, rather edge.
But before starting a potentially long ordeal to fix it ... why do you need keep_item_order ? Is it really necessary for your case ? The output without this flag should be predictable (because node list is path/name sorted). The keep_item_order feature works just for structures, so it is not 100% consistent for all cases (e.g. I want to add string by string - does it makes sense to keep the order ? maybe, but it won't work for the moment).
As I remember keep_item_order feature was request by @larshp to serialize db tables, where it makes a bit more sense (as the user may have got used to what they see in SE16). But for interral structures .... mmm .... very arguable. For AG serialization, imho, repeatable predicable order would matter but not structure order. It just increases the complexity unnecessarily (and potentially impacts the performance a bit)

P.S. tech notes:

  • zif_ajson~set deletes subtree if the node exists
  • so in theory it should either analyse the structure and remember the order of topmost (only topmost) item (in the case it will be a value node) and then apply it to a newly converted topmost (and only topmost) node. What if the structure is deep?
  • or, it should analyse that the target node exists and it is a value node and the new value is value and not a structure/table and then copy value to the existing record. But. The value may change the type also. So ... kind of complicated ...

Thanks for your analysis. To answer your question: We need it predominantly for code-reviews, I guess.

Though need is maybe not the right word. It would be natural if serialization does conserve the order in the ABAP source code, see AFF type/structure definition https://github.com/SAP/abap-file-formats/blob/f5d075a279e675cfc834b7b132a88a829fce3cac/file-formats/zif_aff_types_v1.intf.abap#L84-L93
What if different ABAP file format serialiser, e.g. AG or others, come up with different ordering of JSON data? AG might not be the only serialisier for ABAP file formats (AFF) out there. So eventually, there will be bad diffs when interacting.

ABAP file format does not specify any order, maybe it should?

Current impact:
The described scenario where 'reordering' occurs, does not happen that often (only for the two lines above, the rest of the data is fine). However, it applies to all object types, so it is relevant but probably not very urgent at that state.

well, I think it just strengthen the idea of "not keeping the order" imho :)
Other serializers may have even less controllable. E.g. JS does not guarantee the order of the object attributes (unless you use Map). Ajson does sort by name, so it's predictable at least. Frankly, I think that sorting issues are unavoidable in this context.
Even in AG code there are always code changes because of "SE80 reorderings". And, well, it is annoying, I agree. But I'm not sure there is "the best" solution. And I'm not sure think keeping the structure order is the best solution for AG serialization. It is very fragile.

in abapGit, the current xml serialization serializes uses item order, I expect the same for json. I think this is also the case for most(all?) javascript implementations of JSON.stringify()

suggestion: If the node exists, dont remove it, just update the VALUE? throw an exception if its not a basic value(object/array)

image

suggestion: add a concept for custom value mapping, like ZIF_AJSON_MAPPING, but for mapping basic values instead of field names

stringify is non deterministic from what I know. https://stackoverflow.com/questions/42491226/is-json-stringify-deterministic-in-v8
Here is a quick test by the way (numbers added late but serialized first)

Welcome to Node.js v16.14.2.
> a = {}
{}
> a.a = 1
1
> a.b = 2
2
> JSON.stringify(a)
'{"a":1,"b":2}'
> b = {}
{}
> b.b = 2
2
> b.a = 1
1
> JSON.stringify(b)
'{"b":2,"a":1}'
> b[2]=3
3
> b[1]=4
4
> JSON.stringify(b)
'{"1":4,"2":3,"b":2,"a":1}'
>

suggestion: If the node exists, dont remove it, just update the VALUE? throw an exception if its not a basic value(object/array)
well ... it

  • A) decreases performance (because of extra type check) (the impact is unknown without measuring though)
  • B) actually contradicts the concept of JSON, why throw if type is changed ?

But why item order is so critical ? Why name-sorted is bad ? What is the valuable benefit ? E.g. in case of internal abap structures - what percentage of coder knows (and got used to !) the real order ?

suggestion: add a concept for custom value mapping, like ZIF_AJSON_MAPPING, but for mapping basic values instead of field names

a bit heavy, but could be an option. Kind of convexit ?

hmm, yea, perhaps the use-case just doesnt fit well with ajson. In abapGit we dont need the full generic javascript functionality, everything is statically typed, and will probably be read or written from/to ABAP data right away. So another option is to write another JSON serializer/deserializer

JSON itself does not enforce any order. However, there's a given order in a JSON schema. That's why most package.json start with name, version, description, for example.

For me, going back and forth between ABAP and JSON is basically the same as having the schema defined in an ABAP data structure. It only makes sense to preserve the field order for this use case. Anything else - although syntactically correct JSON - is counterintuitive.

PS: It's "(A)BAP JSON" so how JSON works in JS should not matter.

Should work after #115 , @albertmink please try that branch.

though ... I'm still puzzled why structure order is so important ... :)

For me, the issue is finished ✅