pallets/flask

Allow using |tojson in double-quoted HTML attributes again

Closed this issue ยท 23 comments

I'm in a project where double-quoted HTML attributes is part of the style guide. We're also using data- attributes to pass data to code and consider it a bad practice to render anything directly within scripts.

Being able to remove |safe within <script> seems convenient at first glance. But after discovering that it changes the behavior of |tojson everywhere else when using double-quoted attributes, it doesn't seem as nice. It's just less intuitive overall. Forcing you to use a particular HTML style or requiring |forceescape (as opposed to just |e) isn't really a great tradeoff for implying |safe within <script>.

So the underlying question. Can we allow |tojson to work within double-quotes attributes again? Or is there some technical reason why we can't. And if not, why not add another filter to use within <script> blocks instead so the principal of least surprise isn't violated?

Can you give an example? afaik nothing should have changed at all.

Example

What I was doing was rendering the following with something like ids=[u"abc", u"xyz"].

<div class="my-container" data-ids="{{ ids|tojson }}">
    ...
</div>

Result

<div class="my-container" data-ids="["abc", "xyz"]">
    ...
</div>

It's also documented to behave this way as of 0.10.

versionchanged:: 0.10
This function's return value is now always safe for HTML usage, even if outside of script tags or if used in XHTML. This rule does not hold true when using this function in HTML attributes that are double quoted. Always single quote attributes if you use the |tojson filter. Alternatively use |tojson|forceescape.

Note that prior to 0.10, autoescape took care of escaping both single and double quotes in attributes. Now that htmlsafe_dumps escapes more aggressively, it's safe to use in <script>, and so it's now wrapped with Markup. However, it's not aggressive enough because double quotes don't get escaped. So any code like above, while valid up until now, breaks in 0.10.

Always single quote attributes if you use the |tojson filter. Alternatively use |tojson|forceescape

Doesn't this paragraph help you?

It's more of a workaround.

The underlying problem is a design issue. It's quirky and inconsistent that you can only use |tojson in single-quoted attributes.

Even more so since you need to patch it up with |forceescape when you use a different, otherwise-equivalent quoting style. I mean, there's no difference between single and double quoted attributes anywhere else in Flask, Python, HTML, or even JavaScript.

Design issue

Are you expecting Jinja to

1.) Find out if you're generating HTML
2.) Parse the HTML and see which kind of quotes you used around that {{ }} block

Jinja doesn't make that assumption. The fact that it doesn't is not a "design issue".

I'd rather it just behave the way it used to. Or have a designated filter for rendering JSON into JavaScript code (i.e. within <script> blocks, as opposed to rendering into HTML).

Before that breaking change, you had to use |tojson|safe in <script> blocks. After that change, |safe is implied. But it's not actually safe everywhere if you can't use it within certain kinds of attributes. That's the design flaw -- it's no longer consistent.

It's safe, it just doesn't work. Before that change using |tojson without |safe was a potential security problem. This is a tradeoff but it was decided that we are better of with the current behaviour.

It's now vulnerable to XSS without |forceescape when using standard double-quoted attributes.

Try replacing the value of ids in my example above with ' onmouseover=alert(1) '.

Here's a more complete Gist and its rendered output.

I'd like to work on this as part of the PyCon sprints. Do we want to allow double quotes or not for the tojson filter?

I'm not really sure if we want to go back after two releases. It would be nice to figure out a way to make the XSS case @joeyespo posted safe again, but catering to all sorts of styleguides while still being mostly secure is an impossible balancing act.

I'm fine with adding some documentation "don't put json in an html attribute, but if you do, use |forceescape". It seems a lot more natural to use JSON in JavaScript than an HTML attribute, so that should be the easy path.

No, the requirement is to always use single quotes.

On Mon, Jun 06, 2016 at 02:57:33PM -0700, David Lord wrote:

I'm fine with adding some documentation "don't put json in an html attribute, but if you do, use |forceescape".


You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
#1002 (comment)

Sure, that works too. Still seems awkward to try and shove json in an html attribute in the first place. Are we agreeing that tojson shouldn't change?

It's common when using data-* attributes to preload structured data onto the page without requiring additional AJAX calls. Particularly when mixing Flask with a front-end JS framework.

The problem is more about breaking the principle of least astonishment. Right now tojson works differently depending on what style quotation you use. Like Python, it shouldn't make any difference if your HTML style guide uses one or the other.

If shoving json in HTML attributes seems wildly awkward to the Flask community, why not favor consistency and make tojson break or complain in both places? Require |forceescape in single quotes too. That way there's no surprises when going between projects that favor a different quotation style, or when refactoring <div data-field='{{ field }}'> to <div data-field="{{ field }}"> (or vice versa).

Right now tojson works differently depending on what style quotation you use.

That is not true. The filter outputs the same thing in either case. That is like saying {{ foo|safe }} "works differently" based on quotation style.

If shoving json in HTML attributes seems wildly awkward to the Flask community

No, not really.

The filter outputs the same thing in either case.

True.

It's different behavior though. One quotation style breaks your front-end, the other doesn't. It's a dangerous side effect from how this is currently implemented.

cc @mitsuhiko since he implemented this behavior

I should probably reply to this but the current filter works this way because it works in two situations: within <script> tags as well as within single quoted attributes. There is no way to make it work in double quoted attributes without breaking the usage in <script> tags and others.

I should clarify that i close this because changing this behavior would break lots of code in very subtle and potentially dangerous ways.

I know it's closed, but I just want to add my +1 here in case there's a possibility of this getting fixed in a major update down the road. I'm not gonna comment on proper ways to fix the issue, but I will say this: The name and function of 'tojson' don't imply to me that a different escape mechanism will be used, so it's pretty astonishing to find that {{ '"quoted"' }} and {{ 'quoted'|tojson }} are not equivalent.

I'm ambivalent wrt this behavior but it takes much better arguments to change it now. Both usecases make sense.

However, I don't understand how reverting to the old behavior would break code in dangerous ways as claimed by @mitsuhiko

I'm way late for this but I must say it also doesn't make sense to me that tojson_filter calls a function called htmlsafe_dumps and returns an object of a type called Markup which has a __html__ attribute and it's somehow not supposed to be escaped for html, but for javascript, which isn't even markup.

@odraencoded It's escaped for use in inline script elements - i.e. JavaScript within HTML, which is markup. So there is some logic to it.