Sanitize notes by removing <script>, <style>, and <link> before markdown conversion
Closed this issue · 2 comments
Since the Markdown renderer passes HTML <script> and <style> elements straight through, it's possible to make a note look like anything. The application that occurred to me is "host a phishing page on a domain that doesn't trace back to you", so I made https://linus.zone/dtn-ap as a proof of concept.
Hm, touché. I do think there's value in being able to embed certain literal HTML, like <iframe>. So would sanitizing <script>, <style>, and <link> suffice? Or whitelist allowed HTML tags?
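For comparison, the whitelist option could be sketched roughly as below. This is only an illustration: the `ALLOWED_TAGS` set and the `escapeDisallowedTags` helper are hypothetical names, and the choice of `iframe` as the only allowed tag is an assumption taken from the example in this comment. Tags outside the set are escaped rather than removed, so they show up as literal text.

```javascript
// Hypothetical allowlist; `iframe` is the only tag named as worth keeping.
const ALLOWED_TAGS = new Set(['iframe']);

// Escape every HTML tag whose name is not in the allowlist.
// Allowed tags are passed through unchanged, attributes included,
// so this does not inspect attributes at all.
function escapeDisallowedTags(text) {
    return text.replace(/<\/?([a-zA-Z][a-zA-Z0-9-]*)\b[^>]*>/g, (tag, name) => {
        if (ALLOWED_TAGS.has(name.toLowerCase())) {
            return tag;
        }
        return tag.replace(/</g, '&lt;').replace(/>/g, '&gt;');
    });
}
```

With this sketch, `<iframe src="a"></iframe>` is left alone, while `<script>x</script>` becomes `&lt;script&gt;x&lt;/script&gt;`.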
Currently, the user input is basically piped directly into marked (the Markdown-to-HTML renderer). So the app would have to sanitize those tags out before sending the text to the renderer. Marked does have an extensible rendering algorithm, but since this is a pretty light layer of sanitization, I'd say doing it as a separate step makes more sense.
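A minimal sketch of what that separate step might look like, assuming a hypothetical `stripUnsafeTags` helper that runs on the raw note text. The renderer is passed in as a function so the sketch does not depend on any particular marked API. This is a light regex filter, not a complete HTML sanitizer.

```javascript
// Paired <script>/<style> elements, including everything between the tags.
const PAIRED_ELEMENT = /<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi;
// Any remaining opening, closing, or self-closing script/style/link tag.
const LONE_TAG = /<\/?(script|style|link)\b[^>]*>/gi;

// Hypothetical helper: remove script, style, and link markup from note text.
// The loop repeats until nothing changes, so input like "<scr<link>ipt>"
// cannot reassemble into a tag after a single pass.
function stripUnsafeTags(text) {
    let previous;
    let current = text;
    do {
        previous = current;
        current = current.replace(PAIRED_ELEMENT, '').replace(LONE_TAG, '');
    } while (current !== previous);
    return current;
}

// Sanitize first, then hand the result to whatever renders Markdown.
// `render` stands in for the app's call into the Markdown renderer.
function renderNote(text, render) {
    return render(stripUnsafeTags(text));
}
```

Other tags, such as `<iframe>`, pass through untouched, which matches the goal of still allowing some literal HTML.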
Stripping <script>, <style>, and <link> would make it harder to create a lookalike by copy/pasting an existing site, but there's not really a good way around the trick of getting a full-viewport canvas with <div style="position:absolute;top:0;left:0;width:100vw;height:100vh;overflow:scroll;background:white;">.
I think big sites like Reddit and GitHub take the approach of escaping all angle brackets before Markdown rendering, effectively creating a whitelist of allowed formatting that consists of "formatting that Markdown has a non-HTML representation for". For your site, I feel like that's a little harsh when there hasn't yet been any evidence of people trying to abuse it in earnest. Stripping anything that looks like an unescaped script/style/link is probably enough protection for now.
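For reference, the stricter escape-everything approach could be sketched as follows. The `escapeAngleBrackets` name is hypothetical, and this is not a description of how Reddit or GitHub actually implement it, only an illustration of the idea of neutralizing all raw HTML before the Markdown step.

```javascript
// Hypothetical helper: turn every angle bracket into its HTML entity so
// that no raw tag survives into the Markdown renderer.
function escapeAngleBrackets(text) {
    return text.replace(/</g, '&lt;').replace(/>/g, '&gt;');
}
```

One cost of this sketch is that it also escapes the `>` that starts a Markdown blockquote, so a real implementation would need to handle that case separately.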