cburgmer/json-path-comparison

Should `$[?(@.key)]` distinguish between undefined and null in Proposal A?

cburgmer opened this issue · 8 comments

It seems there is no consensus whatsoever on "filter with value". I've tried catching a variety of types in
https://cburgmer.github.io/json-path-comparison/results/filter_expression_with_value.html, and you can see a mix of responses, with or without:

  • Empty array, object, string
  • false
  • null
  • undefined key
  • 0

My reasoning for rejecting only the undefined key case in Proposal A was that there is no other way to express that check with the current feature set of JSONPath. So this would give me the most flexibility.

However, this makes the query $[?(@)] completely pointless, because all elements in an array are defined.
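A minimal sketch of the "reject only the undefined key" reading, assuming a parsed JSON array held in Python. The helper name `filter_key_exists` is hypothetical and not taken from any JSONPath implementation; it only illustrates which elements $[?(@.key)] would keep under that reading.

```python
import json


def filter_key_exists(elements, key):
    """Keep elements that are objects containing `key`, whatever its value.

    Hypothetical helper: models $[?(@.key)] as a pure existence test.
    """
    return [e for e in elements if isinstance(e, dict) and key in e]


doc = json.loads(
    '[{"key": null}, {"key": 0}, {"key": false}, {"key": []}, {"other": 1}]'
)

# Only the element without "key" is rejected; null, 0, false and [] all pass.
kept = filter_key_exists(doc, "key")
print(kept)
```

Under the same reading, a filter on the bare current node has nothing to reject, since every array element it visits exists by construction.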

Also, it's unclear whether most languages even let you distinguish between a key being absent and a key being present with the value null.

Also, it's unclear whether most languages even let you distinguish between a key being absent and a key being present with the value null.

I think that should not be language-dependent. JSON is a string; the implementation is responsible for correct parsing, and any language is able to implement stream parsing of JSON.

But I totally agree with rejecting only non-existing keys. And in fact, that's exactly what intuitively follows from Goessner's proposal:

$..book[?(@.isbn)] | filter all books with isbn number

glyn commented

However this leads to query $[?(@)] becoming completely pointless, because all elements in an array are defined.

I'm not concerned that some features allow pointless possibilities, unless they are likely to mislead users, which I don't think is the case here.

I'm interested in whether there really is a use case for distinguishing null and undefined. Examples:

$ node
> a = JSON.parse('{"key": null}')
{ key: null }
> a['key']
null
> a['key1']
undefined

$ python
>>> import json
>>> a = json.loads('{"key": null}')
>>> a["key"]
>>> a["key1"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'key1'

However

$ irb
irb(main):001:0> require 'multi_json'
=> true
irb(main):002:0> a = MultiJson.decode('{"key": null}')
=> {"key"=>nil}
irb(main):003:0> a["key"]
=> nil
irb(main):004:0> a["key1"]
=> nil

I believe we will find more examples for either bucket.

$..book[?(@.isbn)] | filter all books with isbn number

I don't read anything specific from Goessner's proposal. He does not give an example involving null. His implementation falls back to JavaScript's notion of truthiness, so that's also not much help here.

glyn commented

Distinguishing between a nil value and absence from a hash can be done in Ruby:

irb(main):001:0> require 'multi_json'
=> true
irb(main):004:0> a = MultiJson.decode('{"key": null}')
=> {"key"=>nil}
irb(main):005:0> a["key"]
=> nil
irb(main):006:0> a["key1"]
=> nil
irb(main):007:0> a.key?("key")
=> true
irb(main):008:0> a.key?("key1")
=> false
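The same distinction can be drawn in Python. This is a small sketch rather than part of any JSONPath library: the `_MISSING` sentinel and the `lookup` helper are illustrative names introduced here.

```python
import json

a = json.loads('{"key": null}')

# Sentinel object that can never come out of json.loads, so it is
# safe to use as "the key was not there at all".
_MISSING = object()


def lookup(obj, key):
    """Classify a key as 'undefined', 'null', or 'value' (hypothetical helper)."""
    value = obj.get(key, _MISSING)
    if value is _MISSING:
        return "undefined"
    if value is None:
        return "null"
    return "value"


print("key" in a, "key1" in a)              # membership test, like Ruby's key?
print(lookup(a, "key"), lookup(a, "key1"))
```

Plain `a.get("key1")` returns None just like Ruby's `a["key1"]` returns nil, so the membership test or a sentinel default is what actually separates the two cases.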

Distinguishing between a nil value and absence from a hash can be done in Ruby:

I've seen at least one implementation of JSONPath implement a specific function that checks for the existence of the key. I just don't remember where I saw this.

I took this approach in warpath: @.key means the key exists at the current node, and a couple of supported functions can be used to check the value of the key, for example is_nil, is_boolean, etc.

bhmj commented

  • Empty array, object, string
  • false
  • null
  • undefined key
  • 0

I think the behaviour has to be as intuitive as possible, and with that in mind I'd treat the expression $[?(@.key)] as "@.key exists and @.key isn't nothing". Now, looking at Christoph's list, I would say:

  • Empty array, object, string -- key exists and definitely isn't nothing because it does have a type => true
  • false -- that's the hard one. Key exists and has a type but the value is false => uncertain, see below
  • null -- key exists, but is nothing (does not even have a type, let alone a value) => false
  • undefined key -- key does not exist => definitely false
  • 0 -- key exists and isn't nothing (just a number) => true

So for me the only uncertain situation here is a false value. The uncertainty comes from the fact that the result of a ?(@.key) expression is intuitively perceived as a simple boolean value, even though it may in fact follow more complex logic (like "exists and not nothing"). We could follow the majority of implementations in this particular case, or maybe even try to remove the uncertainty by requiring an explicit comparison operator for booleans (like @.key==true), though I doubt that would be accepted.
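The "exists and isn't nothing" rule can be sketched as a small predicate. This is only an illustration of the list above: the function name `key_is_something` and the `false_matches` switch are hypothetical, with the switch standing in for the undecided false case.

```python
# Sentinel for "key not present", distinct from JSON null (None in Python).
_MISSING = object()


def key_is_something(obj, key, false_matches=False):
    """Model $[?(@.key)] as "@.key exists and @.key isn't nothing".

    `false_matches` is an assumption made here to leave the open
    question about a literal false value configurable.
    """
    value = obj.get(key, _MISSING) if isinstance(obj, dict) else _MISSING
    if value is _MISSING:   # undefined key
        return False
    if value is None:       # null: exists, but is nothing
        return False
    if value is False:      # the uncertain case; 0 is not caught by `is False`
        return false_matches
    return True             # empty array/object/string, 0, anything else


for candidate in ({"key": []}, {"key": ""}, {"key": 0}, {"key": None}, {}, {"key": False}):
    print(candidate, key_is_something(candidate, "key"))
```

The identity check `value is False` is used instead of a truthiness test so that 0 and empty containers are not swept up together with false.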