Stranger6667/css-inline

Css inlining creates invalid XML with single tags

matthiaskoenig opened this issue · 3 comments

The CSS inlining is removing the closing characters of single tags. E.g.

  • <hr /> is replaced with <hr>
  • <img ... /> is replaced with <img ...>

This makes the library unusable in contexts where valid XML is required.
There is no need to remove the closing /; this is most likely a bug in the transformation.
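To make the problem concrete, here is a minimal check using only the Python standard library. The helper name `is_well_formed_xml` and the two sample strings are made up for illustration; they stand in for markup before and after inlining, and are not actual css_inline output.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """Return True if `markup` parses as XML, False otherwise (hypothetical helper)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Illustrative input with self-closing tags, as written by hand.
before = '<div><hr /><img src="a.png" alt="" /></div>'
# The same markup with the closing "/" dropped, as described above.
after = '<div><hr><img src="a.png" alt=""></div>'

print(is_well_formed_xml(before))
print(is_well_formed_xml(after))
```

The first string parses because every element is closed; the second does not, because an XML parser treats `<hr>` and `<img ...>` as elements that are never closed.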

Thanks for the great library. It would solve my use cases if it kept the HTML valid as XML.

As a workaround, one can use lxml to repair the generated output:

        import css_inline
        from lxml import etree
        from lxml import html as lxml_html  # aliased so it does not shadow the `html` string

        # css inline (as a side effect removes closing parts of empty tags)
        html_inline = css_inline.inline(html, extra_css=css)

        # closing single tags again by re-serializing the parsed tree as XML
        doc = lxml_html.fromstring(html_inline)
        doc_bytes: bytes = etree.tostring(doc)
        html_inline = doc_bytes.decode(encoding="utf-8")
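If lxml is not available, a similar post-processing step can be sketched with the standard library's `html.parser`. This is only a rough sketch under stated assumptions, not part of css_inline: the names `VoidTagCloser` and `close_void_tags` are hypothetical, the void-element list is written out by hand, and the parser does not repair other kinds of non-XML markup (for example unclosed `<p>` tags).

```python
from html import escape
from html.parser import HTMLParser

# HTML void elements; these never have a closing tag in HTML.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class VoidTagCloser(HTMLParser):
    """Re-serialize HTML, writing void elements as self-closing tags (hypothetical helper)."""

    def __init__(self) -> None:
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts: list[str] = []

    def _open(self, tag, attrs, self_close):
        # Attribute values arrive unescaped, so escape them again on output.
        # Valueless attributes are written with an empty value to stay XML-friendly.
        rendered = "".join(
            f' {name}="{escape(value or "", quote=True)}"' for name, value in attrs
        )
        self.parts.append(f"<{tag}{rendered}{' /' if self_close else ''}>")

    def handle_starttag(self, tag, attrs):
        self._open(tag, attrs, tag in VOID_TAGS)

    def handle_startendtag(self, tag, attrs):
        self._open(tag, attrs, True)

    def handle_endtag(self, tag):
        # Void elements were already closed when they were opened.
        if tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        self.parts.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.parts.append(f"<!{decl}>")


def close_void_tags(markup: str) -> str:
    """Return `markup` with void elements such as <hr> written as <hr />."""
    closer = VoidTagCloser()
    closer.feed(markup)
    closer.close()
    return "".join(closer.parts)


print(close_void_tags('<div><hr><img src="a.png" style="color:red"></div>'))
# prints: <div><hr /><img src="a.png" style="color:red" /></div>
```

In the scenario above, this would be applied to the string returned by the inliner, in place of the lxml round trip.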

Hi, sorry for my late reply.

Indeed, the underlying machinery does not support XML at the moment, but there is a workaround we can use to emit proper XML, so maybe we can implement this properly. Not sure about the API though - maybe a separate argument to inline.

> There is no need to remove the closing /; this is most likely a bug in the transformation.

Indeed, html5ever doesn't preserve the closing /, which I assume is fine for an HTML5 serializer. Still, <hr /> is both valid HTML and valid XHTML, so it would be nice to at least have a config option for this in html5ever.

I added a note on this behavior to the README file and am going to close this issue, as I don't think there are any other action items.