danielaparker/jsoncons

Control amount of memory reserved in ctor of json_decoder

Closed this issue · 6 comments

betp commented

Describe the proposed feature
The proposed feature is to have a mechanism to control how much memory the constructor of json_decoder reserves. The mechanism would be made available to the callers of the json::parse() function.

Reason: In my application, I need to parse many small json strings. When profiling the application, I see a lot of CPU time is spent in the constructor of json_decoder on these lines:

item_stack_.reserve(1000);
structure_stack_.reserve(100);

I assume this much memory is reserved because the focus of jsoncons is on parsing large json strings. As the json strings I need to parse are very simple, it should be sufficient to reserve much less memory, which would lead to less CPU time spent on the allocations and less memory fragmentation.

What other libraries (C++ or other) have this feature?
I don't know

Include a code fragment with sample data that illustrates the use of this feature

  • The parse() function could have two additional parameters (that would be defaulted to the original values), that would be passed into the constructor of json_decoder and control the amount of memory reserved. The caller could then do:
const size_t item_stack_reserve_size = 10;
const size_t structure_stack_reserve_size = 5;
auto json = jsoncons::json::parse(src, item_stack_reserve_size, structure_stack_reserve_size);
  • The parse() could instead take a hint of how big the json string to parse is:
auto json = jsoncons::json::parse(src, jsoncons::json::JsonStringSize::SMALL);
  • etc :)

Can you provide an example of how you're currently using jsoncons, along with a representative sample JSON file? Thanks.

betp commented

Sure:

// representative sample JSON message, e.g. received over network
std::string jsonStr = "{\"id\":1,\"version\":\"2.0\",\"result\":{\"value\":\"42\"}}";

const auto json = jsoncons::json::parse(jsonStr);

const auto& result = json.at_or_null("result");
if (!result.is_null())
{
    const auto& value = result.at_or_null("value");
    if (!value.is_null() && value.is_string())
    {
        doSomethingWithValue(value.as_string());
    }
}

You could reduce allocations a lot by reusing a json_decoder and a json_parser, and, assuming C++17, accessing "value" as a std::string_view, e.g.

#include <jsoncons/json.hpp>
#include <iostream>
#include <string>
#include <string_view>

void doSomethingWithValue(std::string_view val)
{
    std::cout << val << "\n";
}

int main()
{
    jsoncons::json_decoder<jsoncons::json> decoder;
    jsoncons::json_parser parser;

    for (std::size_t i = 0; i < 5; ++i)
    {
        std::string jsonStr = "{\"id\":1,\"version\":\"2.0\",\"value\":\"" + std::to_string(i) + "\"}";
        parser.update(jsonStr.data(), jsonStr.size());
        parser.parse_some(decoder);
        parser.finish_parse(decoder);
        parser.check_done();
        if (decoder.is_valid())
        {
            jsoncons::json json = decoder.get_result();
            const auto& value = json.at_or_null("value");
            if (value.is_string())
            {
                doSomethingWithValue(value.as<std::string_view>());
            }
        }
        decoder.reset();
        parser.reset();
    }
}

betp commented

Thank you for the suggestion. It's good to know that the json_decoder and json_parser can be reused; I will consider this.

However, implementing this approach would require considerable changes to my code base. JSON strings are parsed in many places and functions. I would probably wrap the json_decoder, the json_parser, and the code above in a class, but I would still need to manage the lifetime of the objects of that class and pass them into every function that needs to parse a JSON string. That would complicate the logic of the code.

Do you think it is realistic to extend the interface of the parse() function, as suggested above?

I'm reluctant to add more overloads to the json::parse function; it already has 18. But to help with the issue, I've reduced the constants for the initial buffer capacity and the initial stack depth to 256 and 66, respectively. I've also added a limit so that the initial stack depth won't exceed max_nesting_depth (set in options) + 2. So with small JSON objects, you can control the initial stack depth by setting a suitably small max_nesting_depth, e.g.

std::string str = R"(
{
    "foo" : [1,2,3],
    "bar" : [4,5,{"f":6}]
})";

auto options = jsoncons::json_options{}
    .max_nesting_depth(3);
auto j = jsoncons::json::parse(str, options);

betp commented

Thank you for the changes!