JSONPath backend comparison
gavv opened this issue · 8 comments
This is a somewhat unusual issue: it's not about coding, but about doing research.
We're currently using yalp/jsonpath as the JSONPath backend. It works well, but the author states in the README that it's experimental and that users should switch to another engine.
Here are some other engines that I've found:
- https://github.com/PaesslerAG/jsonpath - not actively maintained
- https://github.com/oliveagle/jsonpath - not actively maintained
- https://github.com/ohler55/ojg - looks promising
ojg looks promising, but to proceed, we need to compare it with yalp/jsonpath.
Ideally, we need two things:
- a table of JSONPath syntax features that summarizes what is and isn't supported by yalp/jsonpath and ojg
- two test sets: one is a list of queries that work exactly the same for both parsers; the other is a list of queries that work differently (e.g. one of the parsers fails, or they return different results)
Once we have this, we can decide whether it's possible to switch without breaking compatibility, and if not, we'll be able to provide a migration guide for users.
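The two test sets could be produced by a small classification harness. The sketch below is only an illustration: `evalFunc`, `classify`, `stubForward`, and `stubBackward` are hypothetical names, and the stubs do not call yalp/jsonpath or ojg. A real harness would wrap each library in an `evalFunc`.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// evalFunc is a hypothetical adapter: it evaluates a JSONPath query against
// decoded JSON data. A real harness would wrap each library in one of these.
type evalFunc func(query string, data any) (any, error)

// classify runs one query through both backends and names the bucket it
// belongs to.
func classify(query string, data any, first, second evalFunc) string {
	resFirst, errFirst := first(query, data)
	resSecond, errSecond := second(query, data)
	switch {
	case errFirst != nil && errSecond != nil:
		return "both fail"
	case errFirst != nil:
		return "only first fails"
	case errSecond != nil:
		return "only second fails"
	case reflect.DeepEqual(resFirst, resSecond):
		return "same"
	default:
		return "different"
	}
}

// stubForward and stubBackward are stand-ins for real backends; they ignore
// the data and differ only in result order.
func stubForward(query string, _ any) (any, error) {
	if query == "" {
		return nil, errors.New("empty query")
	}
	return []any{"a", "b"}, nil
}

func stubBackward(_ string, _ any) (any, error) {
	return []any{"b", "a"}, nil
}

func main() {
	fmt.Println(classify("$..x", nil, stubForward, stubForward))  // same
	fmt.Println(classify("$..x", nil, stubForward, stubBackward)) // different
	fmt.Println(classify("", nil, stubForward, stubBackward))     // only first fails
}
```

Queries classified as "same" would go into the first test set; everything else would go into the second, together with both results.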
We already have a test that covers many queries, though likely not all of them: https://github.com/gavv/httpexpect/blob/master/value_test.go#L441
Previous discussion: #49
Useful materials:
Hi,
I'd really love to see ojg JSON paths in httpexpect, because the support for subexpressions and filters is quite handy.
So I ran the test suite of yalp/jsonpath against ojg; here's my quick take :)
Returned data types:
- yalp only returns slices when the path expression includes recursive searches or wildcards. With ojg you have to choose between Get(...) (always a slice) or First(...)
- however, if nothing is found, ojg returns nil instead of an empty slice
- If you reference an element that does not exist, yalp returns an error and ojg an empty slice
Supported syntax (as mentioned ojg supports a lot more here, here's just what's missing):
- ojg doesn't support bracket syntax without quotes ($[A], but that's not the consensus anyway)
Behaviour:
- ojg interprets negative steps with open intervals differently from yalp and most other libraries (e.g. $.A[::-1] always produces an empty result in ojg)
- yalp does not go with the consensus about the order of recursive descents (so the results are most of the time roughly backwards)
- ojg can make sense of those queries while yalp (and me) can't: $..1, $.1, $.., .A
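To make the negative-step difference concrete, here is a minimal sketch of the Python-style slice semantics under which $.A[::-1] reverses the array. It is not code from yalp or ojg; `sliceItems`, `normalize`, `clamp`, and `intPtr` are hypothetical helpers, and a nil bound stands for an omitted one.

```go
package main

import "fmt"

// normalize turns a negative index into an offset from the start.
func normalize(i, n int) int {
	if i < 0 {
		return n + i
	}
	return i
}

// clamp limits v to the range [lo, hi].
func clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// intPtr returns a pointer to v, used for explicitly given bounds.
func intPtr(v int) *int { return &v }

// sliceItems applies [start:end:step] to items. A nil start or end means the
// bound was omitted, as in [::-1].
func sliceItems(items []any, start, end *int, step int) []any {
	n := len(items)
	out := []any{}
	if step == 0 {
		return out
	}
	if step > 0 {
		lo, hi := 0, n
		if start != nil {
			lo = clamp(normalize(*start, n), 0, n)
		}
		if end != nil {
			hi = clamp(normalize(*end, n), 0, n)
		}
		for i := lo; i < hi; i += step {
			out = append(out, items[i])
		}
		return out
	}
	// Negative step: omitted bounds default to "last element" and
	// "one before the first", so the walk covers the whole array.
	hi, lo := n-1, -1
	if start != nil {
		hi = clamp(normalize(*start, n), -1, n-1)
	}
	if end != nil {
		lo = clamp(normalize(*end, n), -1, n-1)
	}
	for i := hi; i > lo; i += step {
		out = append(out, items[i])
	}
	return out
}

func main() {
	a := []any{"string", 23.3, 3, true, false, nil}
	fmt.Println(sliceItems(a, nil, nil, -1))            // $.A[::-1]
	fmt.Println(sliceItems(a, intPtr(1), intPtr(4), 1)) // $.A[1:4]
	fmt.Println(sliceItems(a, intPtr(-2), nil, 1))      // $.A[-2:]
}
```

The key design point is that omitted bounds depend on the sign of the step; treating an omitted start as 0 and an omitted end as the length for a negative step yields an empty result, which matches what ojg produced for $.A[::-1] in this comparison.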
Hope this helps
By the way... here's the hacky code I used: https://github.com/stubents/yalp-vs-ojg/blob/main/yalp_vs_ojg_test.go
And that's the table-like output it produces:
Running tests on:
{
"expensive": 10,
"store": {
"bicycle": {
"color": "red",
"price": 19.95
},
"book": [
{
"author": "Nigel Rees",
"category": "reference",
"price": 8.95,
"title": "Sayings of the Century"
},
{
"author": "Evelyn Waugh",
"category": "fiction",
"price": 12.99,
"title": "Sword of Honour"
},
{
"author": "Herman Melville",
"category": "fiction",
"isbn": "0-553-21311-3",
"price": 8.99,
"title": "Moby Dick"
},
{
"author": "J. R. R. Tolkien",
"category": "fiction",
"isbn": "0-395-19395-8",
"price": 22.99,
"title": "The Lord of the Rings"
}
]
}
}
Works the same way: $.store.book[*].author ojg: [Nigel Rees Evelyn Waugh Herman Melville J. R. R. Tolkien] in yalp: [Nigel Rees Evelyn Waugh Herman Melville J. R. R. Tolkien]
Works the same way: $..author ojg: [Nigel Rees Evelyn Waugh Herman Melville J. R. R. Tolkien] in yalp: [Nigel Rees Evelyn Waugh Herman Melville J. R. R. Tolkien]
Running tests on:
{
"A": [
"string",
23.3,
3,
true,
false,
null
],
"B": "value",
"C": 3.14,
"D": {
"C": 3.1415,
"V": [
"string2a",
"string2b",
{
"C": 3.141592
}
]
},
"E": {
"A": [
"string3"
],
"D": {
"V": {
"C": 3.14159265
}
}
},
"F": {
"V": [
"string4a",
"string4b",
{
"CC": 3.1415926535
},
{
"CC": "hello"
},
[
"string5a",
"string5b"
],
[
"string6a",
"string6b"
]
]
}
}
Works the same way: $.A ojg: [string 23.3 3 true false <nil>] in yalp: [string 23.3 3 true false <nil>]
Works the same way: $.A[*] ojg: [string 23.3 3 true false <nil>] in yalp: [string 23.3 3 true false <nil>]
Works the same way: $.A.* ojg: [string 23.3 3 true false <nil>] in yalp: [string 23.3 3 true false <nil>]
Works the same way: $.A.*.a ojg: [] in yalp: []
Works the same way: $ ojg: map[A:[string 23.3 3 true false <nil>] B:value C:3.14 D:map[C:3.1415 V:[string2a string2b map[C:3.141592]]] E:map[A:[string3] D:map[V:map[C:3.14159265]]] F:map[V:[string4a string4b map[CC:3.1415926535] map[CC:hello] [string5a string5b] [string6a string6b]]]] in yalp: map[A:[string 23.3 3 true false <nil>] B:value C:3.14 D:map[C:3.1415 V:[string2a string2b map[C:3.141592]]] E:map[A:[string3] D:map[V:map[C:3.14159265]]] F:map[V:[string4a string4b map[CC:3.1415926535] map[CC:hello] [string5a string5b] [string6a string6b]]]]
Works the same way: $.A[0] ojg: string in yalp: string
Works the same way: $["A"][0] ojg: string in yalp: string
Works the same way: $.A[1:4] ojg: [23.3 3 true] in yalp: [23.3 3 true]
Works the same way: $.A[:-1] ojg: [string 23.3 3 true false] in yalp: [string 23.3 3 true false]
Works the same way: $.F.V[4:5][0,1] ojg: [string5a string5b] in yalp: [string5a string5b]
Works the same way: $.A[-2:] ojg: [false <nil>] in yalp: [false <nil>]
Works the same way: $.F.V[4:6] ojg: [[string5a string5b] [string6a string6b]] in yalp: [[string5a string5b] [string6a string6b]]
Works the same way: $.A[1,4,2] ojg: [23.3 false 3] in yalp: [23.3 false 3]
Works the same way: $["B","C"] ojg: [value 3.14] in yalp: [value 3.14]
Works the same way: $["C","B"] ojg: [3.14 value] in yalp: [3.14 value]
Works the same way: $.F.V[4,5][0:2] ojg: [string5a string5b string6a string6b] in yalp: [string5a string5b string6a string6b]
Works the same way: $.A[::2] ojg: [string 3 false] in yalp: [string 3 false]
Works differntly: $.A[::-1] ojg: [] in yalp: [<nil> false true 3 23.3 string]
Works the same way: $.F.V[4:6][1] ojg: [string5b string6b] in yalp: [string5b string6b]
Works the same way: $.F.V[4:6][0,1] ojg: [string5a string5b string6a string6b] in yalp: [string5a string5b string6a string6b]
Not supported in ojg: $[A][0] (parse error at 3 in $[A][0])
Works the same way: $["A"][0] ojg: string in yalp: string
Not supported in ojg: $[B,C] (parse error at 3 in $[B,C])
Works the same way: $["B","C"] ojg: [value 3.14] in yalp: [value 3.14]
Works the same way: $..V[2,3].CC ojg: [3.1415926535 hello] in yalp: [3.1415926535 hello]
Works differntly: $..["C"] ojg: [3.14159265 3.141592 3.1415 3.14] in yalp: [3.14 3.1415 3.141592 3.14159265]
Works the same way: $.A..* ojg: [string 23.3 3 true false <nil>] in yalp: [string 23.3 3 true false <nil>]
Works the same way: $.A.* ojg: [string 23.3 3 true false <nil>] in yalp: [string 23.3 3 true false <nil>]
Works differntly: $..A[0] ojg: [string3 string] in yalp: [string string3]
Works differntly: $.*.V[0,1] ojg: [string4a string4b string2a string2b] in yalp: [string2a string2b string4a string4b]
Works the same way: $.*.V[2].C ojg: [3.141592] in yalp: [3.141592]
Works the same way: $.*.V[2:3].* ojg: [3.141592 3.1415926535] in yalp: [3.141592 3.1415926535]
Works differntly: $..A..* ojg: [string3 string 23.3 3 true false <nil>] in yalp: [string 23.3 3 true false <nil> string3]
Works differntly: $..V[*].* ojg: [3.1415926535 hello string5a string5b string6a string6b 3.141592] in yalp: [3.141592 3.1415926535 hello string5a string5b string6a string6b]
Works the same way: $.D.*..C ojg: [3.141592] in yalp: [3.141592]
Works differntly: $.D.V..* ojg: [3.141592 string2a string2b map[C:3.141592]] in yalp: [string2a string2b map[C:3.141592] 3.141592]
Works the same way: $.*.V[2:4].* ojg: [3.141592 3.1415926535 hello] in yalp: [3.141592 3.1415926535 hello]
Works the same way: $..V[2:4].CC ojg: [3.1415926535 hello] in yalp: [3.1415926535 hello]
Works differntly: $..[0] ojg: [string5a string6a string4a string3 string2a string] in yalp: [string string2a string3 string4a string5a string6a]
Works the same way: $.D.V.*.C ojg: [3.141592] in yalp: [3.141592]
Works the same way: $.*.V..C ojg: [3.141592] in yalp: [3.141592]
Works differntly: $..D..V..C ojg: [3.14159265 3.141592] in yalp: [3.141592 3.14159265]
Works differntly: $..A ojg: [[string3] [string 23.3 3 true false <nil>]] in yalp: [[string 23.3 3 true false <nil>] [string3]]
Works the same way: $..V[*].C ojg: [3.141592] in yalp: [3.141592]
Works the same way: $.D.V..C ojg: [3.141592] in yalp: [3.141592]
Works the same way: $.*.D.V.C ojg: [3.14159265] in yalp: [3.14159265]
Works the same way: $.*.D.V..* ojg: [3.14159265] in yalp: [3.14159265]
Works differntly: $.*.*.*.C ojg: [3.14159265 3.141592] in yalp: [3.141592 3.14159265]
Works differntly: $.*.V[0:2] ojg: [string4a string4b string2a string2b] in yalp: [string2a string2b string4a string4b]
Works the same way: $..V[2].C ojg: [3.141592] in yalp: [3.141592]
Works the same way: $.*.V[2].* ojg: [3.141592 3.1415926535] in yalp: [3.141592 3.1415926535]
Works differntly: $..C ojg: [3.14159265 3.141592 3.1415 3.14] in yalp: [3.14 3.1415 3.141592 3.14159265]
Works differntly: $..V..C ojg: [3.14159265 3.141592] in yalp: [3.141592 3.14159265]
Works differntly: $..A[0,1] ojg: [string3 string 23.3] in yalp: [string 23.3]
Works the same way: $.*.V[0] ojg: [string2a string4a] in yalp: [string2a string4a]
Works differntly: $.*.V[1] ojg: [string4b string2b] in yalp: [string2b string4b]
Works the same way: $..ZZ ojg: [] in yalp: []
Works the same way: $.D.V..*.C ojg: [3.141592] in yalp: [3.141592]
Works the same way: $.*.D..C ojg: [3.14159265] in yalp: [3.14159265]
Both produce error: $.A*]
Both produce error: $[C:B]
Both produce error: $.A[1:4:0:0]
No Error with ojg: $.1
Both produce error: $[A][0
Both produce error: $["]
No Error with ojg: $..
Both produce error: $[B,C
Both produce error: $.A[1,4.2]
Both produce error: $.
Both produce error: $.A[]
Both produce error: $.*V
Both produce error: $.A[:,]
No Error with ojg: $..1
No Error with ojg: .A
No Error with ojg: $.ZZZ
Thanks a lot for looking into this. This is very helpful.
I think ojg would be a very good addition. It seems the differences are big enough that we should add it as a separate method instead of replacing the existing one. Currently we have Path(). We can mark it deprecated, add a new method, say, Query(), and provide a migration guide. Since migration can be quite painful, I think that despite the deprecation, Path() will never actually be deleted.
In addition, I think we should make migration as smooth as we can.
yalp only returns slices when the path expression includes recursive searches or wildcards. With ojg you have to choose between Get(...) (always a slice) or First(...)
Ideally, we need to handle it somehow. I think if Query() will unconditionally return Array/slice value, it would be both inconvenient and complicate migration.
Do you think we can reliably implement a trick like you're doing here, but by inspecting the parsed jp.Expr? It seems that Expr is a slice of Frag interfaces, and the implementations of Frag are exported types, so we can probably iterate over the fragments and check their underlying types?
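A rough sketch of the fragment-inspection idea follows. The types are local placeholders and not ojg's real types; `frag`, `root`, `child`, `nth`, `wildcard`, `descent`, `slice`, `union`, `filter`, and `isMultiValue` are all hypothetical names. Treating filters as multi-value is also an assumption, and the real ojg fragment types would need to be checked against its documentation.

```go
package main

import "fmt"

// frag stands in for a parsed path fragment. The concrete types below are
// placeholders for illustration, not ojg's exported types.
type frag interface{}

type (
	root     struct{} // $
	child    string   // .name or ["name"]
	nth      int      // [3]
	wildcard struct{} // .* or [*]
	descent  struct{} // ..
	slice    struct{} // [a:b:c]
	union    []any    // [1,2] or ["a","b"]
	filter   string   // [?(...)], assumed to be multi-value
)

// isMultiValue reports whether an expression can match more than one node,
// judging only by the types of its fragments.
func isMultiValue(expr []frag) bool {
	for _, f := range expr {
		switch f.(type) {
		case wildcard, descent, slice, union, filter:
			return true
		}
	}
	return false
}

func main() {
	single := []frag{root{}, child("A"), nth(0)}      // $.A[0]
	multi := []frag{root{}, child("A"), wildcard{}}   // $.A[*]
	quoted := []frag{root{}, child("A"), child("*")}  // $.A["*"], a key named "*"
	fmt.Println(isMultiValue(single), isMultiValue(multi), isMultiValue(quoted))
}
```

Because the check works on parsed fragments rather than on the query string, a key that is literally named "*" stays a single-value lookup.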
however, if nothing is found ojg returns nil instead of an empty slice
I guess we can detect nil and make the behavior similar to current behavior of Path().
If you reference an element that does not exist, yalp returns an error and ojg an empty slice
Not a big deal: if httpexpect gets an error from yalp, it fails the test, so users are unlikely to have such queries in their tests, and migration won't break anything here.
Also, the new behavior makes more sense, because it becomes possible to assert lack of certain element.
ojg doesn't support bracket syntax without quotes ($[A], but that's not the consensus anyway)
ojg interprets negative steps with open intervals differently from yalp and most other libraries (e.g. $.A[::-1] always produces an empty result in ojg)
So these two things are the major breaking points, especially the first one, and it seems we can't do anything about them except document them in the migration guide.
yalp does not go with the consensus about the order of recursive descents (so the results are most of the time roughly backwards)
This is another breaking point, but a less important one, because hopefully not many tests are tied to a specific order, especially given that the existing order is strange.
ojg can make sense of those queries while yalp (and me) can't:
$..1, $.1, $.., .A
Hopefully other users can't too :)
So here are the steps I see for this task:
- see if we can automatically detect single-value vs multi-value query
- see if we can automatically detect case when nothing found
- implement new method Query() that will do two detections above to behave more similar to Path()
- upgrade test suite based on code you provided:
- if needed, add more tests to cover all important queries (e.g. I guess error cases are missing?)
- extend tests so that each query tests both Path() and Query(), and provide alternative results for them when needed
- write migration guide (can be a section in README or a comment on Query): list what is now unsupported, what changed, and what new features became available; and add a link to our test suite
- deprecate Path() and recommend using Query() instead (I think we'll start with "soft" deprecation, i.e. we'll say it in a comment, but won't add the special "Deprecated:" comment that triggers warnings; we'll probably add it in the next major release)
- update examples and README
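The step about testing each query against both methods could use a table with an optional alternative expectation. This is only a sketch of the table shape: `queryCase`, `expected`, and `cases` are hypothetical names, no httpexpect code is called, and the two sample rows are taken from the comparison output earlier in this thread.

```go
package main

import "fmt"

// queryCase describes one query and what each method is expected to return.
type queryCase struct {
	query     string
	wantPath  any  // expected result from the existing Path()
	wantQuery any  // expected result from the proposed Query()
	differs   bool // true when wantQuery intentionally deviates from wantPath
}

// expected picks the expectation for the given method name.
func expected(c queryCase, method string) any {
	if method == "Query" && c.differs {
		return c.wantQuery
	}
	return c.wantPath
}

// cases holds two sample rows: one that behaves the same in both backends
// and one that does not.
var cases = []queryCase{
	{query: "$.A[0]", wantPath: "string"},
	{
		query:     "$.A[::-1]",
		wantPath:  []any{nil, false, true, 3, 23.3, "string"},
		wantQuery: []any{},
		differs:   true,
	},
}

func main() {
	for _, c := range cases {
		for _, method := range []string{"Path", "Query"} {
			fmt.Printf("%s via %s: want %v\n", c.query, method, expected(c, method))
		}
	}
}
```

Using an explicit `differs` flag rather than a nil `wantQuery` keeps nil available as a legitimate expected value.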
Ideally, we need to handle it somehow. I think if Query() will unconditionally return Array/slice value, it would be both inconvenient and complicate migration.
Do you think we can reliably implement a trick like you're doing here, but by inspecting parsed jp.Expr? It seems that Expr is slice of Frag interfaces, and implementations of Frag are exported types, so probably we can iterate over fragments and check their underlying types?
That hack from the test won't work for many edge cases like $.A["*"]; inspecting the parsed Expr sounds more promising. However, what do you think about creating two methods like Query() and QueryAll()?
This would give the user more control and might protect httpexpect from having to handle unexpected edge cases.
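A minimal sketch of what the two-method split could look like, assuming a backend that returns every match as a slice. `matcher`, `query`, `queryAll`, and `stubMatcher` are hypothetical names with canned results; none of this is httpexpect or ojg code.

```go
package main

import "fmt"

// matcher is a hypothetical stand-in for a JSONPath backend call that
// returns every match for a path (nil when nothing matches).
type matcher func(path string) []any

// queryAll always returns the full list of matches, unchanged.
func queryAll(m matcher, path string) []any {
	return m(path)
}

// query returns the first match; ok is false when nothing matched.
func query(m matcher, path string) (value any, ok bool) {
	matches := m(path)
	if len(matches) == 0 {
		return nil, false
	}
	return matches[0], true
}

// stubMatcher is a stand-in backend with canned results.
func stubMatcher(path string) []any {
	switch path {
	case "$.A[0]":
		return []any{"string"}
	case "$.A[1:4]":
		return []any{23.3, 3, true}
	}
	return nil
}

func main() {
	v, ok := query(stubMatcher, "$.A[0]")
	fmt.Println(v, ok)
	fmt.Println(queryAll(stubMatcher, "$.A[1:4]"))
	_, ok = query(stubMatcher, "$.ZZZ")
	fmt.Println(ok)
}
```

With this split the caller states up front whether one value or a list is wanted, so the library does not have to infer it from the expression.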
I guess we can detect nil and make the behavior similar to current behavior of Path().
Maybe it's okay to keep the nil. If you replaced it with an empty slice, you could not distinguish between finding an empty array and not finding anything. I think this might be a downside of the current yalp implementation (e.g. $..ZZ: I think ojg would actually return nil there. It's just listed as [] above because of one of those hacks that make the results comparable. If there were something in the JSON like {"ZZ": []}, ojg would return something different, but yalp would still return the same).
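The information loss can be shown in plain Go without either library. In this sketch `describe` and `normalizeNil` are hypothetical helpers, and the values passed in merely imitate what a single-value lookup might return.

```go
package main

import "fmt"

// describe classifies the result of a single-value lookup. Note that a
// matched JSON null also decodes to nil in Go, so this sketch cannot tell
// it apart from "no match".
func describe(v any) string {
	if v == nil {
		return "no match"
	}
	if arr, ok := v.([]any); ok && len(arr) == 0 {
		return "matched an empty array"
	}
	return "matched a value"
}

// normalizeNil replaces nil with an empty slice, as discussed above.
func normalizeNil(v any) any {
	if v == nil {
		return []any{}
	}
	return v
}

func main() {
	fmt.Println(describe(nil))               // nothing found
	fmt.Println(describe([]any{}))           // found {"ZZ": []}
	fmt.Println(describe(normalizeNil(nil))) // nothing found, after normalizing
}
```

After normalization the first and third cases can no longer be told apart, which is the argument for keeping nil.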
@stubents Thanks for your help and let me know if you wish to work on any of these.
Sure, I can give it a try, I just can't promise when I'll find time for it. Anyway, since this issue isn't very recent, I guess it won't be time critical :)
Ah.. no, yalp would return [[]], I guess :)
Not sure what I like better
Hi @gavv
I gave it a try: #446
I changed my mind about two methods and tried to implement it just as you described in #236 (comment)
If you like the PR so far, I'd add the migration guide and the rest