Unicode character escapes are encoded again by Encoder.forHtml()
indra2gurjar opened this issue · 2 comments
indra2gurjar commented
If the input string contains a Unicode-escaped character, e.g. "&#9989;" (✅),
the output is "&amp;#9989;":
the '&' character is encoded again.
Does the Encoder support Unicode-escaped characters, which would make this a bug, or is this not supported?
jmanico commented
The encoder deliberately encodes every dangerous character, including the '&' you are describing. This is not the right tool for you.
If you have HTML entities that you wish to preserve then your input is HTML. Consider using the OWASP HTML Sanitizer instead.
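To illustrate why the '&' gets encoded, here is a minimal sketch. It is a simplified, hypothetical stand-in (`HtmlEscapeSketch.escapeHtml` is not part of the OWASP Java Encoder), and it assumes only the five characters shown need escaping. With the real library the call would be `Encode.forHtml(input)`. The idea is the same: the encoder treats its input as plain text, so an entity such as "&#9989;" is just seven ordinary characters, one of which is '&'.

```java
public class HtmlEscapeSketch {

    // Hypothetical, simplified stand-in for an HTML-context encoder.
    // It is NOT the OWASP implementation; it only escapes five characters.
    public static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;"); break;
                case '<':  out.append("&lt;");  break;
                case '>':  out.append("&gt;");  break;
                case '"':  out.append("&#34;"); break;
                case '\'': out.append("&#39;"); break;
                default:   out.append(c);       // everything else is copied unchanged
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input is text that happens to look like an entity,
        // so its '&' is escaped like any other '&'.
        System.out.println(escapeHtml("&#9989;"));   // prints &amp;#9989;

        // Passing the character itself (U+2705) instead of an entity:
        // nothing in it needs escaping, so it is copied through as-is.
        System.out.println(escapeHtml("\u2705"));
    }
}
```

In other words, if the data is plain text, pass the actual character rather than an entity. If the data is already HTML whose entities must survive, a sanitizer is the better fit.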
Aloha,
--
Jim Manico
@manicode
… On May 28, 2018, at 4:50 AM, Indra Kumar Gurjar ***@***.***> wrote:
If the input string contains a Unicode-escaped character, e.g. "&#9989;" (✅),
the output is "&amp;#9989;":
the '&' character is encoded again.
Does the Encoder support Unicode-escaped characters, which would make this a bug, or is this not supported?
jmanico commented
This is not something we can fix; it's about proper use of the library.