thesephist/zone

Sanitize notes by removing <script>, <style>, and <link> before markdown conversion

Closed this issue · 2 comments

Since the Markdown renderer passes HTML <script> and <style> elements straight through, it's possible to make a note look like anything. The application that occurred to me is "host a phishing page on a domain that doesn't trace back to you", so I made https://linus.zone/dtn-ap as a proof of concept.

Hm, touché. I do think there's value in being able to embed certain literal HTML, like <iframe>. So would sanitizing <script>, <style>, and <link> suffice? Or should it whitelist allowed HTML tags?

Currently, the user input is basically piped directly into marked (the Markdown-to-HTML renderer). So the app would have to sanitize those tags out before sending the text to the renderer. Marked does have an extensible rendering algorithm, but since this is a pretty light layer of sanitization, I'd say doing it as a separate step makes more sense.
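As a rough illustration of that separate step, here is a minimal sketch of a pre-render filter. The helper name `stripBlockedTags` and the `BLOCKED_TAGS` list are hypothetical and not taken from the zone codebase. It is regex-based, so treat it as a light filter rather than a real HTML sanitizer: it does nothing about inline `style` attributes or event-handler attributes such as `onerror`.

```javascript
// Hypothetical helper, not part of the actual zone codebase.
// Tags to remove from the raw note text before it reaches the Markdown renderer.
const BLOCKED_TAGS = ["script", "style", "link"];

function stripBlockedTags(source) {
  let out = source;
  let prev;
  // Repeat until nothing changes, so that input like "<scr<script>ipt>"
  // cannot reassemble a blocked tag after a single pass.
  do {
    prev = out;
    for (const tag of BLOCKED_TAGS) {
      // Paired elements together with their contents, e.g. <script>...</script>.
      const paired = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?<\\/${tag}\\s*>`, "gi");
      // Leftover unpaired opening or closing tags, e.g. <link rel="stylesheet" ...>.
      const lone = new RegExp(`<\\/?${tag}\\b[^>]*>`, "gi");
      out = out.replace(paired, "").replace(lone, "");
    }
  } while (out !== prev);
  return out;
}
```

The filtered string would then be handed to the renderer in place of the raw input, for example `marked(stripBlockedTags(text))` or `marked.parse(stripBlockedTags(text))`, depending on which marked version the app uses. Other literal HTML, such as <iframe>, passes through untouched.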

Stripping <script>, <style>, and <link> would make it harder to create a lookalike by copy/pasting an existing site, but there's not really a good way around the trick of getting a full-viewport canvas with <div style="position:absolute;top:0;left:0;width:100vw;height:100vh;overflow:scroll;background:white;">.

I think big sites like Reddit and GitHub take the approach of escaping all angle brackets before Markdown rendering, effectively creating a whitelist of allowed formatting that consists of "formatting that Markdown has a non-HTML representation for". For your site, I feel like that's a little harsh when there hasn't yet been any evidence of people trying to abuse it in earnest. Stripping anything that looks like an unescaped script/style/link is probably enough protection for now.
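For comparison, the stricter escape-everything approach could look roughly like the sketch below. The function name `escapeAngleBrackets` is hypothetical, and this is only a guess at the general technique, not how any particular site implements it. It leaves leading `>` characters alone on the assumption that Markdown blockquotes should keep working. Markdown autolinks written as `<http://...>` would stop working under this scheme.

```javascript
// Hypothetical helper illustrating "escape all angle brackets before rendering".
function escapeAngleBrackets(source) {
  return source
    .split("\n")
    .map((line) => {
      // Split off any leading blockquote markers ("> ", ">> ") so they survive.
      const [, quote, rest] = line.match(/^((?:\s{0,3}>\s?)*)(.*)$/s);
      // Everything else is rendered inert: no raw HTML tag can be formed.
      return quote + rest.replace(/</g, "&lt;").replace(/>/g, "&gt;");
    })
    .join("\n");
}
```

With this in place, literal HTML such as <iframe> shows up as text instead of being embedded, which is the tradeoff described above.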