apostrophecms/sanitize-html

sanitize-html not acknowledging allowedSchemes options

asrv4git opened this issue · 1 comment

To Reproduce

Step-by-step instructions to reproduce the behavior:
1. Use version 2.13.1 of sanitize-html.
2. Run the code below.

var sanitizeHtml = require("sanitize-html");

const ALLOWED_SCHEMES = ['http', 'https'];

const htmlStr = `\'"><meta http-equiv="refresh" content="0;url=file:///etc/passwd" />`;

const cleanedHTML = sanitizeHtml(htmlStr, {
    allowedAttributes: false,
    allowedTags: false,
    allowVulnerableTags: true,
    allowedSchemes: ALLOWED_SCHEMES,
    allowProtocolRelative: false,
    disallowedTagsMode: 'completelyDiscard',
    allowedSchemesByTag: {
        img: [...ALLOWED_SCHEMES, 'data']
    },
});

console.log(cleanedHTML);

Actual behavior

'"&gt;<meta http-equiv="refresh" content="0;url=file:///etc/passwd" />

Expected behavior

'"&gt;<meta http-equiv="refresh" content="0" />

Describe the bug

Even though I have configured sanitize-html to allow only the 'http' and 'https' schemes, the 'file' scheme is still allowed in the content="0;url=file:///etc/passwd" attribute.

Details

Version of Node.js: 18 LTS
PLEASE NOTE: Only stable LTS versions (10.x and 12.x) are fully supported but we will do our best with newer versions.

Server Operating System:
Linux, and yes, Docker is involved.

The "content" attribute of the meta tag, in the presence of an http-equiv="refresh" attribute, doesn't take just a URL; it takes a timeout, a semicolon, and a URL. sanitize-html has no special logic for validating this attribute, so the allowedSchemes check does not reach the URL embedded in it. We are unlikely to add such logic, because allowing this attribute at all would be quite unusual: it can redirect the user literally anywhere on the Internet, even if "file" is blocked, and in most cases that would not be desirable or safe behavior.

However, if you choose to allow these attributes, you can sanitize them your own way using the transformTags option. Check out that option in the documentation.
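As a rough illustration of that suggestion, here is a minimal sketch of a transformTags handler for meta tags. It assumes the documented handler shape, a function taking (tagName, attribs) and returning { tagName, attribs }, and that attribute names arrive lowercased. The names cleanRefreshContent, sanitizeMetaRefresh, and ALLOWED_REFRESH_SCHEMES are hypothetical helpers written for this example and are not part of sanitize-html.

```javascript
// Schemes permitted in a refresh target (mirrors ALLOWED_SCHEMES in the report).
const ALLOWED_REFRESH_SCHEMES = ['http', 'https'];

// Hypothetical helper: split a value such as "0;url=file:///etc/passwd" into
// a delay and an optional URL, keeping the URL only if its scheme is allowed.
// Returns null when the value is not a recognizable refresh value.
function cleanRefreshContent(content) {
  const match = /^\s*(\d+)\s*(?:[;,]\s*(?:url\s*=\s*)?(.*))?$/i.exec(content);
  if (!match) {
    return null;
  }
  const delay = match[1];
  // Strip one pair of surrounding quotes, if present.
  const rawUrl = (match[2] || '').trim().replace(/^['"]|['"]$/g, '');
  if (!rawUrl) {
    return delay;
  }
  let parsed;
  try {
    parsed = new URL(rawUrl); // throws on relative or malformed URLs
  } catch (err) {
    return delay; // assumption: drop anything that is not an absolute URL
  }
  const scheme = parsed.protocol.slice(0, -1); // "https:" -> "https"
  return ALLOWED_REFRESH_SCHEMES.includes(scheme)
    ? `${delay};url=${parsed.href}`
    : delay;
}

// Handler intended for transformTags: { meta: sanitizeMetaRefresh }.
function sanitizeMetaRefresh(tagName, attribs) {
  const httpEquiv = (attribs['http-equiv'] || '').trim().toLowerCase();
  if (httpEquiv !== 'refresh' || attribs.content === undefined) {
    return { tagName, attribs }; // not a refresh tag: leave untouched
  }
  const cleaned = cleanRefreshContent(attribs.content);
  const nextAttribs = { ...attribs };
  if (cleaned === null) {
    delete nextAttribs.content; // unparseable value: remove the attribute
  } else {
    nextAttribs.content = cleaned;
  }
  return { tagName, attribs: nextAttribs };
}
```

Adding transformTags: { meta: sanitizeMetaRefresh } to the options passed to sanitizeHtml should then reduce content="0;url=file:///etc/passwd" to content="0", which matches the expected behavior in the report. Even with this in place, a refresh to any http or https URL is still allowed, so the redirect concern above remains.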

That being said: I also don't see where you allowed the content and http-equiv attributes at all, so there may be more going wrong here, but your code was not escaped properly by GitHub, so it is hard to say. If you open a code block in a GitHub comment with three backticks on one line, paste your code on the following lines, and close it with another line of three backticks, the code will be escaped properly and I can read it in full.