OWASP/java-html-sanitizer

Issue while disallowing attributes matching pattern

subbudvk opened this issue · 0 comments

I am trying to disallow attributes matching a specific pattern.

    ```
    import java.util.regex.Pattern;
    import org.owasp.html.HtmlPolicyBuilder;
    import org.owasp.html.PolicyFactory;

    HtmlPolicyBuilder builder = new HtmlPolicyBuilder();
    PolicyFactory factory = builder
        .allowUrlProtocols("http", "https")
        .allowElements("img", "a", "div", "span")
        .allowAttributes("alt", "src").onElements("img")
        .allowAttributes("border", "height", "width").onElements("img")
        // Works: only href values containing "google" are kept.
        .allowAttributes("href").matching(Pattern.compile(".*google.*")).onElements("a")
        // Does not work as intended: the yahoo.com src below is dropped as well.
        .disallowAttributes("src").matching(Pattern.compile(".*google.*")).onElements("img")
        .toFactory();
    System.out.println("ALLOW ATTRIBUTES :: " + factory.sanitize("<a href='http://google.com'>"));
    System.out.println("DISALLOW ATTRIBUTES :: " + factory.sanitize("<img src='http://yahoo.com'>"));
    ```

Allowing attributes that match a particular pattern works as expected on its own.
Disallowing attributes that match the pattern `.*google.*` does not work as expected: the policy also discards `http://yahoo.com`, which does not match the pattern.

If I am not wrong, `disallowAttributes()` does an `allowAttributes()` matching a REJECT_ALL policy, so any further `matching()` call on the returned AttributeBuilder has no effect. Is my understanding correct? I understand that the library is whitelist based and that everything not explicitly allowed is rejected by default. But in our case we ship a minimal policy, and the consumer may still want to restrict a few more entities. If my understanding of why this doesn't work is right, is there a way to achieve it?
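
One workaround I am considering, as a rough sketch and not a confirmed answer: since the library is whitelist based, the "disallow" could be phrased as an allow rule with a negated pattern, using a negative lookahead so that only values that do not contain "google" match. The snippet below exercises just the regex with `java.util.regex`. The class name `NegatedPatternSketch` and the helper `isAllowed` are made up for illustration, and I am assuming that `matching(Pattern)` requires the whole attribute value to match, as the `.*google.*` pattern above suggests.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the sanitizer library.
class NegatedPatternSketch {
    // Matches only values that do NOT contain "google" anywhere.
    // (?!.*google) is a negative lookahead evaluated at the start of the value.
    static final Pattern NOT_GOOGLE =
        Pattern.compile("(?!.*google).*", Pattern.DOTALL);

    // Mirrors the assumed whole-value matching of matching(Pattern).
    static boolean isAllowed(String value) {
        return NOT_GOOGLE.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println("yahoo allowed?  " + isAllowed("http://yahoo.com"));
        System.out.println("google allowed? " + isAllowed("http://google.com"));
    }
}
```

If this holds up, the policy line would presumably become `.allowAttributes("src").matching(NegatedPatternSketch.NOT_GOOGLE).onElements("img")` in place of the `disallowAttributes` call. I have not verified how this combines with the earlier unconditional `allowAttributes("alt", "src")` rule on `img`.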