False positive for `foo="<span class="bar">baz</span>"`
Closed this issue · 4 comments
Input:
<span>
foo="<span class="bar">baz</span>"
</span>
Expected result (no changes):
<span>
foo="<span class="bar">baz</span>"
</span>
Actual result:
<span>
foo="<span class="bar">baz</span>"
</span>
The change comes from this part:
anti-xss/src/voku/helper/AntiXSS.php
Lines 1543 to 1559 in 19da849
The part above performs the change, but the root cause seems to be this regex:
anti-xss/src/voku/helper/AntiXSS.php
Line 539 in 19da849
It detects `foo="..."` as an attribute, but it is actually content of the outer span tag.
Hi, thanks for the bug report.
I already looked into it, and the problem is that we use `$regExForHtmlTags = '/<\p{L}+.*+/us';` to decode the HTML attributes. Because we also want to decode HTML attributes in broken tags, I thought it was a good idea to look only for the opening of a tag.
If we replace it with `$regExForHtmlTags = '/<\p{L}+.*>/usU';`, then it works, but I need to check what this changes for all the test cases.
EDIT: with this change, e.g. `<a title="\'<<>>">` is no longer detected :-/
⇾ https://regex101.com/r/jzMmDh/1 vs. https://regex101.com/r/uxFLyA/1
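The trade-off between the two patterns can be reproduced in isolation. The following is only a minimal sketch that runs the two quoted patterns against the inputs from this thread. `firstMatch` is a hypothetical helper written for this illustration and is not part of anti-xss, and the sketch does not reproduce how the library uses the match afterwards.

```php
<?php
// The two patterns quoted in this thread.
$openTagOnly = '/<\p{L}+.*+/us';  // matches from "<tag" to the end of the input
$closedTag   = '/<\p{L}+.*>/usU'; // U inverts greediness, so ".*" stops at the first ">"

// Inputs taken from this issue.
$nested = "<span>\nfoo=\"<span class=\"bar\">baz</span>\"\n</span>";
$broken = '<a title="\'<<>>">';

// Hypothetical helper (illustration only): return the first match or null.
function firstMatch(string $pattern, string $input): ?string
{
    return preg_match($pattern, $input, $m) === 1 ? $m[0] : null;
}

// The original pattern swallows the whole input as one "tag",
// so the inner foo="..." text looks like an attribute.
var_dump(firstMatch($openTagOnly, $nested) === $nested); // bool(true)

// The proposed pattern stops at the first ">", which fixes the nested case ...
echo firstMatch($closedTag, $nested), "\n"; // <span>

// ... but cuts off a tag whose attribute value itself contains ">".
echo firstMatch($closedTag, $broken), "\n"; // <a title="'<<>
```

The last line shows why the simpler replacement is not enough on its own: the match ends inside the `title` value, so the rest of the attribute is never seen.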
Maybe something like https://regex101.com/r/loBzsU/1?
Thanks for the fix :)
⇒ fixed in version 4.1.34