Example produces invalid JSON (single quotes)
mlliarm opened this issue · 1 comments
mlliarm commented
Hello,
I found today that, with Python 3.9.15 and 3.10.xx, your library produces output that is not recognized as valid JSON.
The code I wrote:
# bad_test.py
import html_to_json
html_string = """
<head>
<title>Floyd Hightower's Projects</title>
<meta charset="UTF-8">
<meta name="description" content="Floyd Hightower's Projects">
<meta name="keywords" content="projects,fhightower,Floyd,Hightower">
</head>
"""
output_json = html_to_json.convert(html_string)
print(output_json)
Result:
{'head': [{'title': [{'_value': "Floyd Hightower's Projects"}], 'meta': [{'_attributes': {'charset': 'UTF-8'}}, {'_attributes': {'name': 'description', 'content': "Floyd Hightower's Projects"}}, {'_attributes': {'name': 'keywords', 'content': 'projects,fhightower,Floyd,Hightower'}}]}]}
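As far as I can tell, the single quotes appear because print() on a dict shows Python's repr rather than JSON text. Here is a minimal sketch of that behaviour. It uses a hand-written dict as a stand-in for the convert() output, so the `sample` value and the variable names are my assumptions for illustration, not part of the library:

```python
import json

# Hand-written stand-in for the dict returned by html_to_json.convert
# (an assumption for illustration; the library is not imported here).
sample = {"head": [{"title": [{"_value": "Floyd Hightower's Projects"}]}]}

# print(some_dict) shows str()/repr() of the dict, which prefers single quotes.
as_repr = str(sample)

# JSON requires double-quoted strings, so the repr text is rejected by a parser.
try:
    json.loads(as_repr)
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False

# json.dumps serializes the same dict to text that parses back unchanged.
as_json = json.dumps(sample)
round_trip = json.loads(as_json)
```

So the data itself looks fine; only the way it is turned into text differs.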
The fix I found was to use json.dumps on the resulting dict:
# good_test.py
import json

import html_to_json
html_string = """
<head>
<title>Floyd Hightower's Projects</title>
<meta charset="UTF-8">
<meta name="description" content="Floyd Hightower's Projects">
<meta name="keywords" content="projects,fhightower,Floyd,Hightower">
</head>
"""
output_json = html_to_json.convert(html_string)
print(json.dumps(output_json))
Output:
{"head": [{"title": [{"_value": "Floyd Hightower's Projects"}], "meta": [{"_attributes": {"charset": "UTF-8"}}, {"_attributes": {"name": "description", "content": "Floyd Hightower's Projects"}}, {"_attributes": {"name": "keywords", "content": "projects,fhightower,Floyd,Hightower"}}]}]}
Result of "python good_test.py | prettyjson":
{
  "head": [
    {
      "title": [
        {
          "_value": "Floyd Hightower's Projects"
        }
      ],
      "meta": [
        {
          "_attributes": {
            "charset": "UTF-8"
          }
        },
        {
          "_attributes": {
            "name": "description",
            "content": "Floyd Hightower's Projects"
          }
        },
        {
          "_attributes": {
            "name": "keywords",
            "content": "projects,fhightower,Floyd,Hightower"
          }
        }
      ]
    }
  ]
}
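As a side note, the pretty-printing step does not need an external tool: json.dumps accepts an indent argument. This is a small sketch with a shortened, hand-written dict (the `data` value is an assumption standing in for the convert() result):

```python
import json

# Shortened, hand-written stand-in for the convert() result (assumption).
data = {"meta": [{"_attributes": {"charset": "UTF-8"}}]}

# indent=2 makes json.dumps emit one key or item per line, nested by two spaces.
pretty = json.dumps(data, indent=2)
print(pretty)
```

The indented text is still valid JSON, so it can be piped to other tools or parsed again with json.loads.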
If you agree, I can make a fix to the result of the html_to_json.convert function and send a PR.
fhightower commented
Thanks for reporting this! Makes sense - I'll happily accept this as a PR 😄 .