FasterXML/jackson-core

JSON spec noncompliance: parsing multiple root objects

mpdn opened this issue · 3 comments

mpdn commented

Version: 2.17.2

Hi. I recently discovered that Jackson parsers allow multiple root objects.

Example:

val parser = new JsonFactory().createParser("[][]")
assert(parser.nextToken() eq JsonToken.START_ARRAY)
assert(parser.nextToken() eq JsonToken.END_ARRAY)
assert(parser.nextToken() eq JsonToken.START_ARRAY)
assert(parser.nextToken() eq JsonToken.END_ARRAY)
assert(parser.nextToken() eq null)

(Scala, but the equivalent Java should be clear)

This behavior bubbles up into the databind APIs where multiple root values are accepted as well, but anyone but the first is just ignored:

assert(new ObjectMapper().readValue("1 2", classOf[Int]) == 1)

This behavior is surprising to me, since the JSON spec disallows it. I looked to see if this was tied to a Jackson feature, but it does not seem like it either.

This kind of leniency can be dangerous - for example, a service accepting JSON could be passed NDJSON by accident, only read the first element, and respond with 200 OK even though the majority of the request was ignored.

So is this expected? I can see how this might be useful behavior in some situations, but I would have expected there to be a feature disabling it to follow the JSON spec accurately.

Yes, this is expected at level of Streaming parser.

But at databind (jackson-databind) level there is setting DeserializationFeature.FAIL_ON_TRAILING_TOKENS, enabling of which will fail reads where there is content beyond a single root-level JSON value.
I think this is the feature to enable if there is concern about NDJSON (and similar) being problematic.

mpdn commented

Alright, good to know this behavior can be toggled at that level. I am only using the parsing level however, so that puts the onus on me to check for this.

Thanks!

@mpdn yes, true. It is tricky to force validation at that level since JsonParser is to be iterated manually. But now that you know of this, you can just call nextToken() at the end and verify null is returned (for end-of-content) (or exception for unparseable trailing garbage)