The only function implemented in this module is parse. parse is similar to json.loads, but it also corrects some of the mistakes that LLMs can make.
It can be installed with pip install sick-json and used with import sick_json; sick_json.parse(something).
This module interprets
some description here..
[{
"name": "Kim", // required
'age': 13, /* optional */
},
{
name: "Lee",
age: "\14",
]
as
[{"name": "Kim", "age": 13}, {"name": "Lee", "age": "14"}]
The result can also be validated and type-corrected with a pydantic model.
import sick_json
from pydantic import BaseModel
class Student(BaseModel):
    name: str
    age: int

sick_json.parse(r"""
some description here..
[{
"name": "Kim", // required
'age': 13, /* optional */
},
{
name: "Lee",
age: "\14",
]
""", pydantic_model=list[Student])
[{'name': 'Kim', 'age': 13}, {'name': 'Lee', 'age': 14}]
By default, it uses a JSON5 parser, which solves the following problems.
- Identifiers without quotes
{
name: "Kim"
}
- Trailing comma
{
"name": "Kim",
}
- JS-style comment
{
"name": "Kim" // something
}
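For comparison, the standard library's json.loads rejects all three of the inputs above. The following is a minimal sketch that uses only the json module; the helper name strict_json_accepts is made up for illustration and is not part of sick_json.

```python
import json

# Inputs taken from the list above; none of them is valid strict JSON.
samples = {
    "unquoted identifier": '{ name: "Kim" }',
    "trailing comma": '{ "name": "Kim", }',
    "JS-style comment": '{ "name": "Kim" // something\n}',
}

def strict_json_accepts(text: str) -> bool:
    """Return True if the standard-library parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

results = {label: strict_json_accepts(text) for label, text in samples.items()}
for label, accepted in results.items():
    print(f"{label}: {'accepted' if accepted else 'rejected'} by json.loads")
```

Each sample is reported as rejected, which is the gap a more lenient parser is meant to fill.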
The JSON5x parser, a slight variation of the JSON5 parser, also parses the following syntax.
- Allow "True" and "False"
{
"name": "Kim",
"is_good_guy": True
}
- Unclosed brackets
{"name": "Kim",
[[{"some":{"name": "Kim"], "thing"]
- Ignore invalid escape sequence
{"name": "Kim", "age": "\18"}
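To illustrate the last item, one possible way to ignore an invalid escape sequence is to drop its backslash before handing the text to a strict parser. This is only a sketch of the idea under that assumption, not sick_json's actual implementation; drop_invalid_escapes is a hypothetical helper.

```python
import json
import re

# Characters that may legally follow a backslash inside a JSON string.
VALID_ESCAPE_CHARS = '"\\/bfnrtu'

def drop_invalid_escapes(text: str) -> str:
    """Remove the backslash from escape sequences that JSON does not define.

    Simplification: the whole text is scanned, not only string literals.
    """
    def fix(match: re.Match) -> str:
        char = match.group(1)
        # Keep valid escapes untouched; otherwise keep only the character.
        return match.group(0) if char in VALID_ESCAPE_CHARS else char

    # Consuming "backslash + next character" as a pair leaves an escaped
    # backslash intact instead of misreading its second half.
    return re.sub(r"\\(.)", fix, text)

broken = r'{"name": "Kim", "age": "\18"}'
repaired = drop_invalid_escapes(broken)
print(repaired)              # {"name": "Kim", "age": "18"}
print(json.loads(repaired))  # {'name': 'Kim', 'age': '18'}
```

The result matches the behaviour shown at the top of this document, where "\14" is read as "14".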
Additionally, it heuristically solves a few problems.
- Verbose text before and after the JSON
blah blah blah
{
"name": "Kim"
}
blah blah blah
- If the input contains multiple JSON values, the longest one is returned by default; alternatively, you can pass a pydantic model to select the value that fits it.
import sick_json
from pydantic import BaseModel
maybe_json = 'blah{"name": "Kim"}blah{"names": ["Kim", "Lee"]}blah'
sick_json.parse(maybe_json)
# it returns {"names": ["Kim", "Lee"]}

class MyModel(BaseModel):
    name: str

sick_json.parse(maybe_json, pydantic_model=MyModel)
# it returns {"name": "Kim"}
- Type-correcting by pydantic model
import sick_json
from pydantic import BaseModel
maybe_json = '{"name": "Kim", "age": "13"}'
class MyModel(BaseModel):
    name: str
    age: int

sick_json.parse(maybe_json, pydantic_model=MyModel)
# it returns {"name": "Kim", "age": 13}
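The "longest JSON wins" heuristic can be sketched with the standard library alone. The code below is an assumption about how such a search might work, not the code sick_json runs: it only recognises strict JSON, and find_json_candidates and longest_json are hypothetical names.

```python
import json

def find_json_candidates(text: str) -> list[tuple[str, object]]:
    """Collect (source_text, value) for every object or array json can decode."""
    decoder = json.JSONDecoder()
    candidates = []
    pos = 0
    while pos < len(text):
        if text[pos] in "{[":
            try:
                # raw_decode parses one value starting at pos and ignores what follows.
                value, end = decoder.raw_decode(text, pos)
            except json.JSONDecodeError:
                pos += 1
                continue
            candidates.append((text[pos:end], value))
            pos = end  # continue after the decoded value
        else:
            pos += 1
    return candidates

def longest_json(text: str) -> object:
    """Return the decoded value whose source text is the longest."""
    candidates = find_json_candidates(text)
    if not candidates:
        raise ValueError("no JSON value found")
    return max(candidates, key=lambda item: len(item[0]))[1]

maybe_json = 'blah{"name": "Kim"}blah{"names": ["Kim", "Lee"]}blah'
print(longest_json(maybe_json))  # {'names': ['Kim', 'Lee']}
```

The same scan also covers the "verbose text before and after" case, since everything outside the decoded span is simply skipped.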