html-to-dash
Convert an HTML string into the equivalent Python code for Dash components.
Installation
pip install html-to-dash
Examples
Basic usage
from html_to_dash import parse_html
element_str = """
<div>
<div class='bg-gray-800' style='color:red;margin:10px'>
<svg aria-label="Ripples. Logo" role="img" xmlns="http://www.w3.org/2000/svg"></svg>
<a href="#" id="link1">A</a>
</div>
<div>text</div>
<div><a href="#" id="link1">a1</a>tail1<a href="#" id="link2">a2</a>tail2</div>
</div>
"""
parse_html(element_str)
Print:
# Tags : Unsupported [svg] removed.
Result:
html.Div(
    children=[
        html.Div(
            className="bg-gray-800",
            style={"color": "red", "margin": "10px"},
            children=[html.A(href="#", id="link1", children=["A"])],
        ),
        html.Div(children=["text"]),
        html.Div(
            children=[
                html.A(href="#", id="link1", children=["a1"]),
                html.Span(children=["tail1"]),
                html.A(href="#", id="link2", children=["a2"]),
                html.Span(children=["tail2"]),
            ]
        ),
    ]
)
- By default, only tags in the dash.html module are supported.
- Tags and attributes are checked, and unsupported ones are removed automatically.
- Tag and attribute names are case-insensitive.
- If the provided HTML string is unclosed, a div is automatically added as the root tag.
- The html, body, and head tags are removed without notification, because the lxml module may add them automatically and Dash does not support them.
- The tail (text after an element's end tag but before the next sibling element's start tag) is automatically converted into the text of a span tag.
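The "tail" in the last point is the element-tree notion of text that follows an element. The sketch below is only an illustration of that concept: it uses the standard-library xml.etree.ElementTree, which shares the text/tail model with lxml, and it is not html-to-dash's own code.

```python
import xml.etree.ElementTree as ET

# Same fragment as the last inner <div> in the basic example.
fragment = (
    '<div><a href="#" id="link1">a1</a>tail1'
    '<a href="#" id="link2">a2</a>tail2</div>'
)
root = ET.fromstring(fragment)

# For each child, .text is the content inside the element and
# .tail is the text between its end tag and the next sibling's start tag.
pairs = [(child.tag, child.text, child.tail) for child in root]
for tag, text, tail in pairs:
    print(f"<{tag}> text={text!r} tail={tail!r}")
```

Each tail found this way corresponds to one html.Span(children=[...]) entry in the result shown above.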
Enable dash_svg
Use the dash-svg module to render SVG tags.
from html_to_dash import parse_html
element_str = """
<svg xmlns="http://www.w3.org/2000/svg" version="1.1" width="300" height="300">
<rect x="100" y="100" width="100" height="100" fill="#e74c3c"></rect>
<polygon points="100,100 200,100 150,50" fill="#c0392b"></polygon>
<polygon points="200,100 200,200 250,150" fill="#f39c12"></polygon>
<polygon points="100,100 150,50 150,150 100,200" fill="#f1c40f"></polygon>
<polygon points="150,50 200,100 250,50 200,0" fill="#2ecc71"></polygon>
<polygon points="100,200 150,150 200,200 150,250" fill="#3498db"></polygon>
</svg>
"""
parse_html(element_str, enable_dash_svg=True)
Print:
Result:
dash_svg.Svg(
    xmlns="http://www.w3.org/2000/svg",
    version="1.1",
    width="300",
    height="300",
    children=[
        dash_svg.Rect(x="100", y="100", width="100", height="100", fill="#e74c3c"),
        dash_svg.Polygon(points="100,100 200,100 150,50", fill="#c0392b"),
        dash_svg.Polygon(points="200,100 200,200 250,150", fill="#f39c12"),
        dash_svg.Polygon(points="100,100 150,50 150,150 100,200", fill="#f1c40f"),
        dash_svg.Polygon(points="150,50 200,100 250,50 200,0", fill="#2ecc71"),
        dash_svg.Polygon(points="100,200 150,150 200,200 150,250", fill="#3498db"),
    ],
)
- In the Dash application, import the dash_svg module so that the generated components render normally.
- dash_svg has higher priority than dash.html, but lower priority than the extra modules.
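The priority rule can be pictured as a first-match lookup over an ordered list of modules. The following is a minimal sketch under stated assumptions: the tag tables are hypothetical and heavily trimmed, and resolve is an illustrative helper, not a function of html-to-dash.

```python
# Hypothetical, trimmed tag tables used only for illustration.
EXTRA_MODULES = {"dcc": {"Input"}}
DASH_SVG = {"dash_svg": {"Svg", "Rect", "Polygon"}}
DASH_HTML = {"html": {"Div", "A", "Img", "Span", "Input"}}


def resolve(tag, enable_dash_svg=False):
    """Return 'module.Tag' from the first module that knows the tag, else None."""
    # Search order: extra modules, then dash_svg (if enabled), then dash.html.
    search_order = [EXTRA_MODULES]
    if enable_dash_svg:
        search_order.append(DASH_SVG)
    search_order.append(DASH_HTML)

    name = tag.capitalize()  # tag names are treated case-insensitively
    for group in search_order:
        for module, tags in group.items():
            if name in tags:
                return f"{module}.{name}"
    return None  # unsupported tag


print(resolve("svg"))                        # no match without dash_svg
print(resolve("svg", enable_dash_svg=True))  # found in dash_svg
print(resolve("input"))                      # extra module wins over html
```

In this sketch an unsupported tag resolves to None, which mirrors the "Unsupported [svg] removed" message in the basic example.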
Expanded usage
from html_to_dash import parse_html
element_str = """
<html>
<body>
<div>
<input type="text" id="username" name="username" aria-label="Enter your username" aria-required="true">
<div class='bg-gray-800' style='color:red;margin:10px'>
<a href="#" id="link1">A</a>
</div>
<div>text</div>
<svg></svg>
<script></script>
<div><a href="#" id="link2">B</a></div>
</div>
</body>
</html>
"""
extra_mod = [{"dcc": {"Input": ["id", "type", "placeholder", "aria-*"]}}]
def tag_attr_func(tag, items):
    if tag == "Input":
        k, v = items
        if "-" in k:
            return f'**{{"{k}": "{v}"}}'
parsed_ret = parse_html(
    element_str,
    tag_map={"svg": "img"},
    skip_tags=['script'],
    extra_mod=extra_mod,
    tag_attr_func=tag_attr_func,
    if_return=True,
)
print(parsed_ret)
Print:
# Tags : Unsupported [script] removed.
# Attrs: Unsupported [name] in dcc.Input removed.
html.Div(
    children=[
        dcc.Input(
            type="text",
            id="username",
            **{"aria-label": "Enter your username"},
            **{"aria-required": "true"}
        ),
        html.Div(
            className="bg-gray-800",
            style={"color": "red", "margin": "10px"},
            children=[html.A(href="#", id="link1", children=["A"])],
        ),
        html.Div(children=["text"]),
        html.Img(),
        html.Div(children=[html.A(href="#", id="link2", children=["B"])]),
    ]
)
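The attribute list given for dcc.Input in extra_mod contains the pattern aria-*, and the printed messages show that name was dropped. One way to picture this filtering is with the standard-library fnmatch module. This is an illustrative sketch only; how html-to-dash matches patterns internally is not stated here, and is_allowed is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Allowed attribute patterns, copied from the extra_mod entry for dcc.Input.
allowed = ["id", "type", "placeholder", "aria-*"]

# Attributes present on the <input> element in the example HTML.
attrs = ["type", "id", "name", "aria-label", "aria-required"]


def is_allowed(name, patterns):
    """Return True if name matches any pattern; '*' matches any run of characters."""
    lowered = name.lower()  # attribute names are treated case-insensitively
    return any(fnmatchcase(lowered, pattern) for pattern in patterns)


kept = [a for a in attrs if is_allowed(a, allowed)]
removed = [a for a in attrs if not is_allowed(a, allowed)]
print("kept:", kept)
print("removed:", removed)
```

The removed list contains only name, which agrees with the "Unsupported [name] in dcc.Input removed" message above.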
- The * sign is supported as a wildcard, as in data-* and aria-*.
- Both class and className are handled correctly.
- Attributes containing the "-" symbol are in fact processed by default; tag_attr_func is used here only as an example. Similarly, the style attribute is handled correctly.
- If the tag_map param is provided, the corresponding tag names in the HTML are converted according to the dict before formal processing.
- A tag listed in skip_tags is removed together with its text. tag_map has higher priority than skip_tags.
- Any custom module is supported, not only html and dcc. Essentially, the conversion is string processing.
- Custom modules take priority in the order given, and all of them rank above the default dash.html module.
- The tag_attr_func param is a function that handles attribute formatting for a tag. When quotation marks are needed inside the returned string, use double quotation marks so that the black module can parse the result. For example, use f'**{{"{k}": "{v}"}}' instead of f"**{{'{k}': '{v}'}}", and f'{k}="{v}"' instead of f"{k}='{v}'".
- If the HTML structure is huge, set huge_tree to True.
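To make the quoting advice for tag_attr_func concrete, the sketch below calls such a function directly and checks that the pieces it returns form valid Python when joined into a component call. It does not call html-to-dash; the (tag, items) signature follows the example above, and returning None for other tags is an assumption that mirrors the implicit return there.

```python
import ast


def tag_attr_func(tag, items):
    """Format one (name, value) attribute pair for the Input tag."""
    if tag != "Input":
        return None  # assumption: other tags are left to the library
    k, v = items
    if "-" in k:
        # Hyphenated names are not valid keywords, so unpack a dict instead.
        return f'**{{"{k}": "{v}"}}'
    return f'{k}="{v}"'


attrs = [
    ("type", "text"),
    ("id", "username"),
    ("aria-label", "Enter your username"),
]
parts = [tag_attr_func("Input", item) for item in attrs]
call_src = f"dcc.Input({', '.join(parts)})"
print(call_src)

# ast.parse only checks syntax, so dcc does not need to be importable here.
ast.parse(call_src)
```

Using double quotes inside single-quoted f-strings keeps the generated text in the same style as the rest of the output.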