jsonata-js/jsonata

Unexpected array access behaviour

Jaspooky opened this issue · 1 comment

Array indexing appears to return inconsistent results when the index expression is written inline rather than stored in a variable first, even though the expression returns a number in both cases.

MRE (Playground Link - version 2.0.5):

(
    $arr := ['Hello', 'Hi', 'Hey'];
    $idx := $floor($random() * $count($arr));
    {
        /* Consistently assigns a single string value */
        "working": $arr[$idx],
        /* Inconsistently returns no match, a single string value, or an array of string values? */
        "not_working": $arr[$floor($random() * $count($arr))]
    };
)

I can see a note in the array navigation docs saying:

If the square brackets contains a number, or an expression that evaluates to a number, then the number represents the index of the value to select. Indexes are zero offset, i.e. the first value in an array arr is arr[0]. If the number is not an integer, then it is rounded down to an integer. If the expression in square brackets is non-numeric, or is an expression that doesn't evaluate to a number, then it is treated as a predicate.

Given that the expression always evaluates to a number (as $idx shows), it seems reasonable to expect the inlined expression to be evaluated and used as a normal array index, returning a single value.

The documentation needs to be improved/clarified. The filter expression is evaluated once for each item in the input sequence (the context). It has to be, because the expression might be relative to that context (in your case, it isn't). So what's happening is that $random() gets invoked three times, once per item, and each item is selected only if the number drawn for it happens to match that item's index. That is why you see none, one, or several of the items selected.
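To make the per-item evaluation concrete, here is a rough Python model of the behaviour described above. It is only a sketch, not JSONata's actual implementation: the helper name filter_by_numeric_predicate is made up for illustration, indexes are assumed to be zero-based and non-negative, and the predicate is assumed to always return a number.

```python
import math
import random


def filter_by_numeric_predicate(items, predicate):
    """Hypothetical model: call the predicate once per item and keep the
    item when the floored result equals that item's zero-based index."""
    selected = []
    for position, item in enumerate(items):
        result = predicate(item)  # re-evaluated for every item
        if math.floor(result) == position:
            selected.append(item)
    return selected


arr = ["Hello", "Hi", "Hey"]
rng = random.Random(0)  # seeded so the run is repeatable

# Inline case: a fresh random index is drawn for each of the three items,
# so anywhere from zero to three items can match.
inline_counts = {0: 0, 1: 0, 2: 0, 3: 0}
for _ in range(1000):
    picked = filter_by_numeric_predicate(
        arr, lambda _item: math.floor(rng.random() * len(arr))
    )
    inline_counts[len(picked)] += 1

# Bound case: the index is drawn once and reused for every item,
# so exactly one item matches.
idx = math.floor(rng.random() * len(arr))
bound = filter_by_numeric_predicate(arr, lambda _item: idx)

print(inline_counts)
print(bound)
```

In this model, binding the random value to a variable first (as with $idx in the original expression) is what guarantees a single match, because every item is then compared against the same number.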

The docs are a bit misleading in this scenario. They were written as a user guide rather than a formal specification, to describe the intent of numeric expressions. Unfortunately, $random() breaks the usual rules of pure functions (i.e. the result is not purely a function of its parameters).

The XPath spec (which JSONata takes its inspiration from) is more precise about this:

A PredicateExpr is evaluated by evaluating the Expr and converting the result to a boolean. If the result is a number, the result will be converted to true if the number is equal to the context position and will be converted to false otherwise; if the result is not a number, then the result will be converted as if by a call to the boolean function. Thus a location path para[3] is equivalent to para[position()=3].
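The conversion rule in that passage can be sketched in a few lines of Python. This is an illustrative model only: predicate_holds and apply_predicate are hypothetical names, positions are 1-based as in XPath, and Python's bool() stands in loosely for XPath's boolean function.

```python
def predicate_holds(result, context_position):
    """Convert a predicate result to a boolean in the way the quoted text describes."""
    # A numeric result is true only when it equals the context position.
    if isinstance(result, (int, float)) and not isinstance(result, bool):
        return result == context_position
    # Anything else is converted as if by a call to the boolean function.
    return bool(result)


def apply_predicate(nodes, predicate):
    """Evaluate the predicate for every node, passing its 1-based position."""
    return [
        node
        for position, node in enumerate(nodes, start=1)
        if predicate_holds(predicate(node, position), position)
    ]


paras = ["p1", "p2", "p3", "p4"]

# para[3]: the predicate yields the number 3 for every node.
third = apply_predicate(paras, lambda node, position: 3)

# para[position()=3]: the predicate yields a boolean for every node.
same = apply_predicate(paras, lambda node, position: position == 3)

print(third, same)
```

Both calls select the same single node, which is the equivalence the spec states. A predicate that returns a different number for each node, as $random() does, gives no such guarantee.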

I'll come up with some more precise wording for the docs.