ibireme/yyjson

Support for shared values between `yyjson_mut_doc`/`yyjson_mut_val`

Closed this issue · 8 comments

Is there a way to share a yyjson_mut_val between multiple yyjson_mut_doc/yyjson_mut_val instances? I couldn't find anything that fits this approach in the API docs and examples, but I may have missed something in the function listing, and this functionality may already exist.

I thought yyjson_mut_obj_add_val would bind val lazily without taking ownership of it, but this doesn't seem to be the case.
Is there perhaps another function that does what I described, or some undocumented way to do this?
If not, could this possibly be added, if it's not too much to ask?

I would expect the code below to work along the lines of what I described, but it produces unexpected results instead.

#include <stdio.h>
#include <yyjson.h>

static yyjson_mut_doc *shareddoc = NULL;

int main(void) {
    // Shared doc
    shareddoc = yyjson_mut_doc_new(NULL);
    yyjson_mut_val *arr = yyjson_mut_arr(shareddoc);
    yyjson_mut_doc_set_root(shareddoc, arr);
    yyjson_mut_val *commonobj = yyjson_mut_obj(shareddoc);
    yyjson_mut_arr_append(arr, commonobj);
    yyjson_mut_obj_add_val(shareddoc,commonobj,"common_a",yyjson_mut_int(shareddoc,5));
    yyjson_mut_obj_add_val(shareddoc,commonobj,"common_b",yyjson_mut_int(shareddoc,0xFF));

    // Looks fine
    printf("shareddoc: %s\n", yyjson_mut_write(shareddoc, 0, NULL)); // [{"common_a":5,"common_b":255}]

    // Customized doc
    yyjson_mut_doc *selectivedoc = yyjson_mut_doc_new(NULL);
    yyjson_mut_val *selarr = yyjson_mut_arr(selectivedoc);
    yyjson_mut_doc_set_root(selectivedoc, selarr);
    yyjson_mut_val *selobj = yyjson_mut_obj(selectivedoc);
    yyjson_mut_arr_append(selarr, selobj);

    // Below causes shareddoc to be overwritten, no difference using yyjson_mut_obj_add()
    yyjson_mut_obj_add_val(selectivedoc,selobj,"address",yyjson_mut_doc_ptr_get(shareddoc, "/0/common_b"));
    printf("shareddoc: %s\n", yyjson_mut_write(shareddoc, 0, NULL)); // [{"address":255,"address":255}]

}

Each yyjson_mut_val is an entry in a circular linked list. It can only be associated with one document at a time.

How would you handle the case of 2 different yyjson_mut_doc objects using two different allocators? Which one would free the element?

I would expect each yyjson_mut_val to be owned by the instance passed to whichever variant of yyjson_mut_obj() created it, and to be destroyed together with that instance (unless there were an API for changing ownership explicitly), so in the case above, shareddoc.

From what I've seen, yyjson_mut_obj_add_val() doesn't make a copy of val and takes the existing object instead. Since the doc is already specified in yyjson_mut_obj() and its variants for each object being created, there shouldn't be multiple allocators involved to begin with.

If you meant a call like the yyjson_mut_obj_add_val() above: the object allocated for the key "address" and selobj would be owned by selectivedoc, while the object at "/0/common_b" would be owned by shareddoc.

If this cannot be done with the current *add_val() calls, I would expect something along the lines of a yyjson_mut_obj_bind_val() that skips moving the object into a different doc's pool.

I haven't looked at how yyjson handles its object pool, or whether it does any kind of reference counting that could make this harder to implement. Is there any specific reason this wouldn't work?

Also from what I gather this means yyjson doesn't support this feature currently, right?

yyjson doesn't use reference counting, and each val is directly owned by the doc that created it.

For arrays and objects, yyjson_mut_val acts as a node in a linked list. So you can't put the same yyjson_mut_val into two different arrays or objects. This is mentioned in the documentation as well: https://github.com/ibireme/yyjson/blob/master/doc/API.md#creating-json-document
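To illustrate why a node with a single link cannot sit in two containers, here is a minimal sketch. The `node` and `ring` types and all function names are hypothetical stand-ins, not yyjson's actual structs or API. It assumes a container that remembers only its tail node and a value that carries one `next` pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for a mutable value: one intrusive `next` link. */
typedef struct node {
    int value;
    struct node *next;
} node;

/* Hypothetical container: stores only the tail; tail->next is the head. */
typedef struct {
    node *last;
    size_t len;
} ring;

/* Append n to the ring. This overwrites n->next unconditionally. */
static void ring_append(ring *r, node *n) {
    if (r->last == NULL) {
        n->next = n;              /* single node points at itself */
    } else {
        n->next = r->last->next;  /* new tail points at the head */
        r->last->next = n;        /* old tail points at the new tail */
    }
    r->last = n;
    r->len++;
}

/* Sum the values by walking len nodes starting from the head. */
static int ring_sum(const ring *r) {
    int sum = 0;
    if (r->last == NULL) return 0;
    const node *cur = r->last->next;
    for (size_t i = 0; i < r->len; i++) {
        sum += cur->value;
        cur = cur->next;
    }
    return sum;
}

/* Builds ring a = 1,2,3, then also appends the middle node to ring b.
   Returns the sum seen through ring a afterwards. */
static int shared_node_demo(void) {
    node x = {1, NULL}, y = {2, NULL}, z = {3, NULL}, w = {10, NULL};
    ring a = {NULL, 0}, b = {NULL, 0};
    ring_append(&a, &x);
    ring_append(&a, &y);
    ring_append(&a, &z);
    ring_append(&b, &w);
    ring_append(&b, &y);  /* rewrites y.next, which ring a still relies on */
    return ring_sum(&a);
}
```

In this sketch `shared_node_demo()` returns 13 rather than 6: walking ring a follows `y.next` into ring b and picks up 10 instead of 3. The corrupted output in the original report is the same kind of effect.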

If you need to add the same val to different arrays or objects, the recommended approach is to use yyjson_mut_val_mut_copy(doc, val) to create a duplicate.

I know about yyjson_mut_val_mut_copy(), but it doesn't really fit my use case.
Is there nothing that could imitate this behaviour, perhaps some of the unsafe_*/internal functions, if you don't intend to add support for this?

For my use case, thousands of JSON responses are created, but most of the key:value pairs are reused from a single pool, just with different document structures. The approach above would allow a single instance to modify a shared field, say common_a.uni.i64 in the example above, and have the JSON value updated for every instance bound to it.

It looks like the current data structures can't do this.
But if you're dealing with a single val, you can share it by temporarily saving and restoring its val->next pointer.

That doesn't really work for my use case, since selectivedoc's keys are then set to the one used by "/0/common_b" instead. I guess it works if you reuse the same keys every time.

If this can't be worked around from the library end user's side, could you point me to the parts of the library that handle this so I could dirty-patch it myself, if that isn't too much to ask?
The code is amalgamated, and it's a bit hard to identify all the parts that handle ownership of yyjson_mut_val from an outsider's perspective.

Sorry to bother you this much about something that isn't even supported.

Here's a simple diagram describing the data structures for mut_doc and mut_val: https://github.com/ibireme/yyjson/blob/master/doc/DataStructure.md#mutable-document

As you can see, both object and array only hold a single element (the last node in the linked list) and don't hold all the elements directly. To achieve your goal, you would need to refactor the entire logic related to mut_doc/mut_val to change object/array into dynamic arrays that directly hold all elements. It seems like a lot of work.
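For comparison, a container that holds its elements directly could share values. The following is only a sketch of that idea under simplifying assumptions, not yyjson code and not a proposed patch: `shared_val`, `ptr_obj`, and the `ptr_obj_*` functions are hypothetical, values are reduced to a single integer, and the container borrows both keys and values without owning them.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical value: just an integer payload. */
typedef struct { int64_t i64; } shared_val;

/* Hypothetical entry: borrowed key plus a pointer to a borrowed value. */
typedef struct {
    const char *key;
    shared_val *val;
} ptr_entry;

/* Hypothetical object: a growable array of entries. Values carry no
   `next` link, so the same value can be bound into many objects. */
typedef struct {
    ptr_entry *items;
    size_t len, cap;
} ptr_obj;

/* Bind `val` under `key` without copying it. Returns 1 on success. */
static int ptr_obj_bind(ptr_obj *o, const char *key, shared_val *val) {
    if (o->len == o->cap) {
        size_t cap = o->cap ? o->cap * 2 : 4;
        ptr_entry *items = realloc(o->items, cap * sizeof *items);
        if (items == NULL) return 0;
        o->items = items;
        o->cap = cap;
    }
    o->items[o->len].key = key;
    o->items[o->len].val = val;
    o->len++;
    return 1;
}

/* Linear lookup by key; returns NULL if the key is absent. */
static shared_val *ptr_obj_get(const ptr_obj *o, const char *key) {
    for (size_t i = 0; i < o->len; i++) {
        if (strcmp(o->items[i].key, key) == 0) return o->items[i].val;
    }
    return NULL;
}

/* Frees only the entry array; the shared values belong to their owner. */
static void ptr_obj_free(ptr_obj *o) {
    free(o->items);
    o->items = NULL;
    o->len = o->cap = 0;
}
```

Binding one `shared_val` into two `ptr_obj` instances and then changing its `i64` field makes the new number visible through both, which is the behaviour requested earlier in the thread. The sketch leaves open who frees a shared value when the documents use different allocators.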

I see. I wrongly presumed that each key stored its value as a child, rather than keys and values being pairs in the chain. For example, for "a":55 in the diagram, I assumed the pointer to 55 was stored as a child inside "a", with the key being a separate field, and that all yyjson_mut_val objects allocated from a given yyjson_mut_doc formed a single circular list via the .next field, or were simply a contiguous memory block (e.g. .."b"->[4]->44->11->22->33->"a"..).

Sorry, I skimmed over that part of the documentation before. Thanks for taking the time to explain this limitation.
Feel free to close this issue.