CertainLach/jrsonnet

Tailstrict causes stack overflow

Jezza opened this issue · 4 comments

Jezza commented

I can provide an example that reproduces the segfault, but for the life of me, I have no idea how to start looking into it.

Let me know if it works for you, or if I can provide any other info.

Crashing Jsonnet code
local reduce(func, array, default) =
//  assert std.isFunction(func) : error 'reduce(func=...) should be a `function`';
//  assert std.isArray(array) : error 'reduce(array=...) should be an `array`';
  //  assert std.length(array) > 0 : error 'reduce(array=...) should be a non-0 `array`';
  local aux(func, array, current, index) =
    if index < 0 then
      current
    else
      aux(func, array, func(array[index], current), index - 1) tailstrict;
  local length = std.length(array);
  if length == 0 then
    default
  else
    aux(func, array, array[length - 1], length - 2);

local visit(
  value,
  visitor,
  ctx,
  ctx_reducer,
) =
  local new_item = if std.isObject(value) then
    local merged = {
      [prop]: visit(value[prop], visitor, ctx, ctx_reducer)
      for prop in std.objectFields(value)
    };

    local new_item = {
      [prop]: merged[prop].item
      for prop in std.objectFields(merged)
    };
    local new_ctx = reduce(ctx_reducer, std.map(
      function(item) item.ctx,
      std.objectValues(merged)
    ), ctx);

    {
      item: new_item,
      ctx: new_ctx,
    }
  else if std.isArray(value) then
    local merged = std.map(
      function(item) visit(item, visitor, ctx, ctx_reducer),
      value,
    );

    // If we had a way to tee collections, that would be great... :(
    local new_item = std.map(function(item) item.item, merged);
    local all_ctx = std.map(function(item) item.ctx, merged);

    // Reduce all of the contexts
    local new_ctx = reduce(ctx_reducer, all_ctx, ctx);

    {
      item: new_item,
      ctx: new_ctx,
    }
  else
    {
      item: value,
      ctx: ctx,
    };
  local replacement = visitor(new_item.ctx, new_item.item);
  if replacement != null then
    local new_ctx = ctx_reducer(replacement.ctx, new_item.ctx);

    {
      item: replacement.item,
      ctx: new_ctx,
    }
  else
    new_item;


local routes = {
  "/inc/users/whoami": {
    "get": {
      "responses": {
        "200": {
          "content": {
            "application/json": {
              "schema": {
                "properties": {
                  "user": {
                    "properties": {
                      "#id": "the id this entity",
                      "id": {
                        "format": "uuid",
                        "type": "string",
                        "x-name": "users.User"
                      }
                    },
                    "type": "object",
                  }
                },
              }
            }
          },
        }
      }
    }
  }
};

visit(routes, function(ctx, item) { ctx: ctx, item: item }, {}, function(left, right) left + right)

Thank you for the report!

Do you build jrsonnet in the release mode? Which version of jrsonnet do you use? Can you run this code with --os-stack=50 argument?

For me, this code fails with SIGABRT in debug mode (Due to stack overflow, and this is ok, as most stack size optimizations are targeted for the release mode) and works correctly in release on x86_64-linux-gnu, jrsonnet 0.5.0-pre9, rustc 1.71.0-nightly 2023-05-06.
However, I'm aware there is something wrong with GC/interner implementation, and the problem may be on the jrsonnet side. I will be grateful for any further information.

Jezza commented

It was release mode, but it still broke.
I used the os-stack option, but it feels a bit hacky.

After a LOT more looking around, I've managed to reduce it to:

local new_visit_thing(
  value
) =
  local recurse(value) = 
    recurse(value) tailstrict;
  recurse(value);

new_visit_thing("")

It has something to do with tailstrict.
That's even in release mode.

So it was an SIGABRT, not a SEGFAULT?

tailstrict evaluates function without counting against stack depth limit so that it may cause a stack overflow in any jsonnet implementation (except C++ one, which implements tail-call optimization correctly) by design.

E.g go-jsonnet:
image

The problem with Rust is an inability to format os stack overflow messages correctly: rust-lang/rust#51405

I'll think about providing some handling for this error, but for now, it is better to resort to using os-stack in cases where optimising code for stack size usage is impossible. Maybe I should adjust some things to make this code fail as with --max-stack overflow, instead of failing in os stack overflow.
I think the default OS stack is very small in your case, as the original code requires only 4MB to evaluate in release-mode compiled jrsonnet

You may also solve this issue by switching from the manually-written reduce function to std.foldl/std.foldr, which are written in native, and will not overflow the stack. It is always better to use native functions where possible in jsonnet.

While this isn't looking good, this is just how tailstrict works, this is the reason it isn't even documented in jrsonnet docs.

While it is possible to implement proper TCO, both go-jsonnet and sjsonnet doesn't implement it, and even with partial TCO (As I implemented in #121) it would be still possible to overflow the stack.