JSON parsing example fails with string escape chars and numbers
The JSON parser example has two issues that I noticed while writing my own Gleam library that uses pears for JSON parsing. The first is in the JSON string parser, which currently starts with:
let str =
  none_of(["\""])
  |> alt(escape)
This never runs the escape parser, because none_of(["\""]) also matches backslashes, and escape only matches input that starts with a backslash. The correct code should be:
let str =
  none_of(["\"", "\\"])
  |> alt(escape)
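To make the shadowing concrete, here is a minimal sketch that uses only the Gleam standard library. The function accepts is a hypothetical stand-in for the membership check that a none_of-style parser performs; it is not part of pears and only models the behaviour described above.

```gleam
import gleam/list

/// Hypothetical stand-in for a `none_of` check (not the pears API):
/// returns True when `grapheme` is not one of the excluded graphemes.
pub fn accepts(excluded: List(String), grapheme: String) -> Bool {
  !list.contains(excluded, grapheme)
}

pub fn main() {
  // Original exclusion list: a backslash is accepted as an ordinary
  // character, so an `escape` alternative tried afterwards never runs.
  let assert True = accepts(["\""], "\\")

  // Fixed exclusion list: the backslash is rejected here, which lets
  // the next alternative (`escape`) handle it.
  let assert False = accepts(["\"", "\\"], "\\")

  // Ordinary characters are accepted either way.
  let assert True = accepts(["\"", "\\"], "a")
  Nil
}
```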
The second issue is with the num parser. It doesn't parse all valid JSON numbers, and it will crash under some circumstances because of its use of let assert. The incorrect assumption was that int.parse will work on anything that float.parse rejects, which isn't the case. JSON numbers have a whole-number component, an optional decimal component, and an optional exponent. The exponent can be present even when the decimal component is not, so 7e4 is a valid JSON number, but neither float.parse nor int.parse will parse it: float.parse requires a decimal component (optionally followed by an exponent), while int.parse accepts neither a decimal component nor an exponent.

Another issue is that JSON numbers can be arbitrarily large (or arbitrarily negative), but float.parse fails for values beyond the maximum or minimum double, so that case needs handling too (in my project I clamp the result to the max/min double when it is exceeded). A fixed version of this parser that I used in my project is the following:
let num =
  maybe(just("-"))
  |> pair(
    alt(
      to(just("0"), ["0"]),
      recognize(pair(
        one_of(["1", "2", "3", "4", "5", "6", "7", "8", "9"]),
        many0(digit()),
      )),
    )
    |> map(string.concat),
  )
  |> pair(maybe(
    just(".")
    |> right(many1(digit()))
    |> map(string.concat),
  ))
  |> pair(
    recognize(maybe(
      alt(just("e"), just("E"))
      |> pair(maybe(one_of(["+", "-"])))
      |> pair(many1(digit())),
    ))
    |> map(string.concat),
  )
  |> map(fn(p) {
    case p {
      #(#(#(neg, ns), ds), ex) -> {
        {
          option.unwrap(neg, "") <> ns <> "." <> option.unwrap(ds, "0") <> ex
        }
        |> float.parse
        |> result.unwrap(case neg {
          Some(_) -> -1.7976931348623158e308
          None -> 1.7976931348623158e308
        })
        |> Num
      }
    }
  })
This version works by inserting .0 as the decimal component whenever the number has none, so that float.parse always succeeds (except for numbers that are too large or too small, which get the fallback values).
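The same idea can be tried without pears. The sketch below uses only the Gleam standard library: it first checks that 7e4 is rejected by both float.parse and int.parse, then rebuilds the number text with a default fraction before parsing. The helper to_float and its parameter names are hypothetical, and it assumes the four pieces have already been split apart by a parser like the one above.

```gleam
import gleam/float
import gleam/int
import gleam/option.{type Option, None, Some}
import gleam/result

/// Hypothetical helper (not part of pears or the original example):
/// joins the already-split pieces of a JSON number, inserting ".0"
/// when there is no fraction, and falls back to the largest or
/// smallest double when `float.parse` still fails.
pub fn to_float(
  sign: Option(String),
  whole: String,
  fraction: Option(String),
  exponent: String,
) -> Float {
  let text =
    option.unwrap(sign, "")
    <> whole
    <> "."
    <> option.unwrap(fraction, "0")
    <> exponent
  // Same fallback constants as in the parser above.
  let fallback = case sign {
    Some(_) -> -1.7976931348623158e308
    None -> 1.7976931348623158e308
  }
  text
  |> float.parse
  |> result.unwrap(fallback)
}

pub fn main() {
  // Neither standard parser accepts an exponent without a fraction.
  let assert Error(Nil) = float.parse("7e4")
  let assert Error(Nil) = int.parse("7e4")

  // With the default fraction the text becomes "7.0e4", which parses.
  let assert True = to_float(None, "7", None, "e4") == 70_000.0
  let assert True = to_float(Some("-"), "0", Some("5"), "") == -0.5
  Nil
}
```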
For a full fixed version, feel free to check out the one I used in my Gleam library.
Thanks for opening the issue!
I kind of forgot that my example JSON parser was not complete. I'll definitely fix it and have a look at your code.