tvst/htbuilder

Escaping

pelme opened this issue · 4 comments

pelme commented

Thanks for a very interesting and useful library!

Currently, no escaping of children or attribute values is done. This makes the library unsafe for general-purpose use, IMO.

>>> print(div('<script>alert("danger!")</script>'))
<div><script>alert("danger!")</script></div>
>>> print(div(id='">hello'))
<div id="">hello"></div>

I think all input strings should be escaped by default (Python has html.escape: https://docs.python.org/3/library/html.html).
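For illustration, here is roughly what html.escape does to the two inputs above. This uses only the standard library and builds the markup by hand, so it says nothing about how htbuilder itself would wire it in:

```python
import html

# html.escape replaces &, < and >; with quote=True (the default)
# it also replaces double and single quotes.
child = html.escape('<script>alert("danger!")</script>')
attr = html.escape('">hello')

# Hand-built markup, only to show where the escaped values end up.
print(f"<div>{child}</div>")
print(f'<div id="{attr}"></div>')
```

With escaping applied, the script tag is rendered as text and the attribute value can no longer close the quote early.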

Django has a concept of safe strings where all inputs to templates are escaped by default. When you want to inject trusted HTML, you mark the string as safe with mark_safe:
https://docs.djangoproject.com/en/4.2/_modules/django/utils/safestring/

The convention is basically that an object with an __html__() method will not be escaped.
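A minimal sketch of that convention, assuming a hypothetical TrustedHTML class and to_html helper (neither is Django's nor htbuilder's actual code):

```python
import html


class TrustedHTML(str):
    """Hypothetical marker type for strings that are already safe HTML."""

    def __html__(self):
        # Returning the string itself signals "do not escape me".
        return self


def to_html(value):
    """Escape value unless it opts out by providing __html__()."""
    if hasattr(value, "__html__"):
        return value.__html__()
    return html.escape(str(value))


print(to_html("<b>untrusted</b>"))
print(to_html(TrustedHTML("<b>trusted</b>")))
```

Plain strings take the escaping path, while anything that provides __html__() is passed through unchanged.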

Would you be open to adding escaping, or to accepting a PR that implements it?

tvst commented

Hey, sorry I totally missed this issue. I'd definitely be open to a PR!

The behavior you describe makes sense to me:

  1. div(my_str) should escape my_str

  2. div(obj_with_dunder_html) should show obj.__html__() without escaping

And I'd also add:

  3. div(htbuilder_obj) should not escape.
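The three rules could be sketched roughly like this. The Element class, render_child, and div below are hypothetical and heavily simplified, not htbuilder's real implementation; the point is that if elements provide __html__() themselves, rule 3 falls out of rule 2:

```python
import html


class Element:
    """Hypothetical, simplified element (not htbuilder's real class)."""

    def __init__(self, tag, *children, **attrs):
        self.tag = tag
        self.children = children
        self.attrs = attrs

    def __html__(self):
        # Elements render to trusted markup, so nesting never double-escapes.
        return str(self)

    def __str__(self):
        attrs = "".join(
            f' {name}="{html.escape(str(value))}"'
            for name, value in self.attrs.items()
        )
        body = "".join(render_child(child) for child in self.children)
        return f"<{self.tag}{attrs}>{body}</{self.tag}>"


def render_child(child):
    if hasattr(child, "__html__"):  # rules 2 and 3: trusted, no escaping
        return child.__html__()
    return html.escape(str(child))  # rule 1: plain values are escaped


def div(*children, **attrs):
    return Element("div", *children, **attrs)


print(div("<b>"))
print(div(div("a & b")))
print(div(id='">hello'))
```

Attribute values are always escaped in this sketch, regardless of __html__().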

One more thing: at first I thought we should implement a new escape = True | False | "auto" (default) argument, but I worry about the potential for collision with HTML attributes. LMK if you have a good idea how to support this while avoiding collisions.

pelme commented

Hi @tvst, I started playing with similar ideas in https://github.com/pelme/htpy. It escapes all strings by default but respects __html__(), so it integrates nicely with Django/Jinja2 templates.

If you or someone else is interested in implementing this in htbuilder, here are the tests for htpy that could be useful:
https://github.com/pelme/htpy/blob/746c50e77a04d4d62e8d850d211064a778ad0c05/tests/test_safestring.py
https://github.com/pelme/htpy/blob/746c50e77a04d4d62e8d850d211064a778ad0c05/tests/test_django.py
https://github.com/pelme/htpy/blob/746c50e77a04d4d62e8d850d211064a778ad0c05/tests/test_jinja2.py

We have been using htpy at work for the last couple of months. htpy also has conveniences for creating ids/classes via a CSS-selector-like string, and __getitem__/[] syntax for child elements, which makes children stand out from the attributes when reading the code. I looked at all the similar Python libs I could find and, as you may notice, it is heavily inspired by htbuilder. :)
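To give a feel for those two conveniences, here is a toy sketch. The Tag class below is hypothetical and is not htpy's implementation; it only shows how a selector-like string and __getitem__ could be combined:

```python
import html


class Tag:
    """Toy element type (hypothetical; not htpy's actual code)."""

    def __init__(self, name, attrs=None, children=()):
        self.name = name
        self.attrs = dict(attrs or {})
        self.children = tuple(children)

    def __call__(self, selector="", **attrs):
        # "#main.card.wide" becomes id="main" class="card wide".
        merged = dict(self.attrs)
        id_part, *classes = selector.split(".")
        if id_part.startswith("#"):
            merged["id"] = id_part[1:]
        if classes:
            merged["class"] = " ".join(classes)
        merged.update(attrs)
        return Tag(self.name, merged, self.children)

    def __getitem__(self, children):
        # tag[a, b] passes a tuple; tag[a] passes a single object.
        if not isinstance(children, tuple):
            children = (children,)
        return Tag(self.name, self.attrs, children)

    def __str__(self):
        attrs = "".join(
            f' {key}="{html.escape(str(value))}"'
            for key, value in self.attrs.items()
        )
        body = "".join(
            str(child) if isinstance(child, Tag) else html.escape(str(child))
            for child in self.children
        )
        return f"<{self.name}{attrs}>{body}</{self.name}>"


div = Tag("div")
p = Tag("p")

print(div("#main.card.wide")[p["1 < 2"], "done"])
```

Attributes go in the call and children go in the brackets, so the two are easy to tell apart at a glance.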

I think the idea of generating HTML directly from Python is amazing. We are all long-time Django template users and are very happy to use this instead. It works especially well in a typed code base.

tvst commented

Oooh, __getitem__ syntax is a great idea. And so is syntax sugar for id/classes.

Happy to either join forces or just deprecate htbuilder in favor of yours!

pelme commented

Would be great to join forces! I have not yet created proper docs or a website; that is the main thing missing, I think. If others are going to pick up the idea, I think the docs need a nice tutorial and concrete examples of where this is useful instead of Django/Jinja templates.