OWASP/owasp-java-encoder

Encoding a supplementary character creates an issue

sudarshannavada opened this issue · 2 comments

Calling encoder.encodeForHtml on the Japanese character “𠮷” (U+20BB7) produces ��, and the HTML document does not recognize these code points.
The browser doesn't understand the surrogate pairs.
We are using org.owasp.esapi esapi 2.1.0.1, configured with:
ESAPI.Encoder=org.owasp.esapi.reference.DefaultEncoder

Any leads will be appreciated.
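
For context, a minimal sketch of the symptom described above, using only the Java standard library. It shows that “𠮷” is a single code point stored as two `char` values (a surrogate pair), and contrasts a numeric character reference built per `char` with one built per code point. The class `SurrogateDemo` and both helper methods are hypothetical names made up for this illustration. The per-`char` variant is only a guess at how the broken output could arise, not a confirmed description of what ESAPI does internally.

```java
// Illustration only: hypothetical helpers, not part of ESAPI or the OWASP Java Encoder.
final class SurrogateDemo {

    // Encodes each UTF-16 code unit separately. For a supplementary character
    // this emits two references to surrogate code points, which HTML parsers
    // treat as errors and replace with U+FFFD.
    static String toNumericReferencesPerChar(String input) {
        StringBuilder out = new StringBuilder();
        for (char c : input.toCharArray()) {
            out.append("&#x").append(Integer.toHexString(c)).append(';');
        }
        return out.toString();
    }

    // Encodes each Unicode code point, so a surrogate pair becomes one reference.
    static String toNumericReferences(String input) {
        StringBuilder out = new StringBuilder();
        input.codePoints().forEach(cp ->
                out.append("&#x").append(Integer.toHexString(cp)).append(';'));
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "\uD842\uDFB7"; // U+20BB7 written as its surrogate pair

        System.out.println(s.length());                      // 2 (UTF-16 code units)
        System.out.println(s.codePointCount(0, s.length())); // 1 (code point)
        System.out.println(toNumericReferencesPerChar(s));   // &#xd842;&#xdfb7;
        System.out.println(toNumericReferences(s));          // &#x20bb7;
    }
}
```

A browser renders `&#x20bb7;` as “𠮷”, while each of `&#xd842;` and `&#xdfb7;` on its own refers to a surrogate and is shown as a replacement character.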

Sorry for the mistake. I have now tried org.owasp.encoder.Encode.forHtmlContent(String input), whose Javadoc says that surrogate pairs are passed through if valid.
Anyway, I found the solution!
Thank you.
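
For readers who land here with the same problem, the "passed through if valid" behavior quoted from the Javadoc can be sketched in plain Java. This is not the OWASP Java Encoder's implementation. `HtmlContentSketch` and `escapeHtmlContent` are hypothetical names, the set of escaped characters (`&`, `<`, `>`) is an assumption for the sketch, and replacing an unpaired surrogate with a space is a choice made here for illustration.

```java
// Illustration only: a stand-in for the idea, not the library's code.
final class HtmlContentSketch {

    static String escapeHtmlContent(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            // codePointAt combines a valid surrogate pair into one code point;
            // an unpaired surrogate is returned as its own value.
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);

            if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                // Unpaired surrogate: replaced with a space (assumption of this sketch).
                out.append(' ');
            } else {
                // Everything else, including valid supplementary characters,
                // is written through unchanged.
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The supplementary character survives; only the markup is escaped.
        System.out.println(escapeHtmlContent("<b>\uD842\uDFB7</b>"));
    }
}
```

With the real library on the classpath, the equivalent call is `Encode.forHtmlContent(input)`. The page must also be served in an encoding that can represent the character, such as UTF-8, for it to display correctly.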