zserge/jsmn

How to efficiently evaluate the number of json tokens

mm-longcheng opened this issue · 3 comments

I know that passing NULL instead of a token array makes the parser return the number of tokens needed.
But that is equivalent to parsing twice.

pt300 commented

There isn't really a good way to get this number without parsing. I would suggest parsing with a token array of some fixed initial length and expanding it if JSMN runs out of tokens.
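The grow-on-demand approach suggested here could be sketched roughly as below. This is only an illustration: `tok_t` is a stand-in that mirrors the default fields of jsmn's token type so the snippet compiles on its own, and `grow_tokens` is a hypothetical helper, not part of jsmn. Because `realloc` preserves the existing elements (moving the block only if it has to), the caller never copies tokens by hand.

```c
#include <stdlib.h>

/* Stand-in for jsmn's token type (assumption: default build,
 * without JSMN_PARENT_LINKS). Replace with jsmntok_t in real code. */
typedef struct {
    int type;
    int start;
    int end;
    int size;
} tok_t;

/* Hypothetical helper: doubles the capacity of a contiguous token array.
 * Returns 0 on success. On failure returns -1 and leaves the old array
 * and capacity untouched, so already-parsed tokens are not lost. */
int grow_tokens(tok_t **toks, unsigned int *cap)
{
    unsigned int new_cap = (*cap != 0) ? *cap * 2u : 16u;
    tok_t *p;

    if (new_cap < *cap) {
        return -1; /* capacity overflowed unsigned int */
    }
    p = realloc(*toks, (size_t)new_cap * sizeof *p);
    if (p == NULL) {
        return -1; /* out of memory; *toks is still valid */
    }
    *toks = p;     /* existing tokens are preserved by realloc */
    *cap = new_cap;
    return 0;
}
```

In use, one would call `jsmn_parse`, and when it returns `JSMN_ERROR_NOMEM`, call the helper and then call `jsmn_parse` again with the same parser and the larger array.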

I noticed that the tokens must form a contiguous array during parsing, so once the array is expanded, does the token data have to be copied?

Perhaps there could be an interface with GetToken and AddToken, where the token storage is managed externally by the caller. That might make the interface more flexible.

I have an idea that may be wrong:
With JSMN_PARENT_LINKS defined, sizeof(jsmntok) = 20, which does not divide evenly into 1 KB blocks. This seems a little unfriendly. Would it be possible to pack type and size together into four bytes so that the structure aligns? The type has only 5 values, which fit in 3 bits, so size could still hold 2^21 = 2097152. Is that enough?
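To make the packing idea concrete, here is a minimal sketch of a 16-byte token. Everything in it is hypothetical and not jsmn's actual layout: the name `packed_tok_t`, the helper functions, and the assumption that the token types are re-encoded as small integers 0..4. The split used here gives type 3 bits and leaves the remaining 29 bits of the 32-bit word to size; a narrower size field would work the same way.

```c
#include <stdint.h>

#define TOK_TYPE_BITS 3u
#define TOK_TYPE_MASK ((1u << TOK_TYPE_BITS) - 1u)     /* 0x7 */
#define TOK_SIZE_MAX  (UINT32_MAX >> TOK_TYPE_BITS)    /* 2^29 - 1 */

/* Hypothetical packed token: type and size share one 32-bit word. */
typedef struct {
    uint32_t type_size; /* low 3 bits: type, upper 29 bits: size */
    int32_t start;
    int32_t end;
    int32_t parent;
} packed_tok_t;

/* Four 32-bit members, so 16 bytes; 64 tokens fill 1 KB exactly. */
_Static_assert(sizeof(packed_tok_t) == 16, "packed token should be 16 bytes");

/* Combine a type (0..7) and a size (0..TOK_SIZE_MAX) into one word.
 * Out-of-range bits are masked off rather than reported. */
static uint32_t tok_pack(uint32_t type, uint32_t size)
{
    return (type & TOK_TYPE_MASK) | ((size & TOK_SIZE_MAX) << TOK_TYPE_BITS);
}

static uint32_t tok_type(const packed_tok_t *t)
{
    return t->type_size & TOK_TYPE_MASK;
}

static uint32_t tok_size(const packed_tok_t *t)
{
    return t->type_size >> TOK_TYPE_BITS;
}
```

The trade-off is an extra mask or shift on every access to type or size, in exchange for a structure whose size is a power of two.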

I know this is old, but you could in theory add a “walker” that does the lexical analysis and counts tokens without labeling them or allocating memory. It would be faster and more efficient, since all it needs to hold is an unsigned integer for the count (or a signed one, so it can signal an error if the JSON is invalid).
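A rough sketch of such a walker follows. `count_json_tokens` is a hypothetical function, not part of jsmn. It counts one token per object, array, string, and primitive, which is intended to match how jsmn counts on well-formed input, but it only checks bracket depth and string termination, so it is not a validator and agreement with jsmn on malformed input is not guaranteed.

```c
#include <stddef.h>

/* Characters that end a primitive (number, true, false, null). */
static int is_delim(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r' ||
           c == ',' || c == ':' || c == ']' || c == '}';
}

/* Hypothetical walker: returns the number of tokens in js[0..len),
 * or -1 if a string is unterminated or brackets are unbalanced.
 * It keeps only a counter and a nesting depth; no tokens are stored. */
long count_json_tokens(const char *js, size_t len)
{
    long count = 0;
    long depth = 0;
    size_t i = 0;

    while (i < len) {
        char c = js[i];
        if (c == '{' || c == '[') {
            count++;            /* container token */
            depth++;
            i++;
        } else if (c == '}' || c == ']') {
            if (--depth < 0) {
                return -1;      /* closing bracket without an opener */
            }
            i++;
        } else if (c == '"') {
            count++;            /* string token (keys included) */
            i++;
            while (i < len && js[i] != '"') {
                if (js[i] == '\\') {
                    i++;        /* skip the escaped character */
                }
                i++;
            }
            if (i >= len) {
                return -1;      /* unterminated string */
            }
            i++;                /* consume the closing quote */
        } else if (is_delim(c)) {
            i++;                /* whitespace or separator */
        } else {
            count++;            /* primitive token */
            while (i < len && !is_delim(js[i])) {
                i++;
            }
        }
    }
    return (depth == 0) ? count : -1;
}
```

The returned count could then be used to allocate the token array once before a single real parse. Note that this is still a full pass over the input, just a cheaper one than tokenizing.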