The InputSource.getByteStream() method returns a raw byte stream, which carries no character encoding information. Code that currently calls InputSource.getByteStream() assumes the bytes are UTF-8, but nothing guarantees that.
One approach: https://lingohub.com/blog/2014/07/ensuring-proper-java-character-encoding-of-byte-streams
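Independently of the linked post, here is a minimal sketch of one way to avoid hard-coding UTF-8. It assumes the caller is happy with this order of preference: an existing character stream, then the encoding declared on the InputSource via getEncoding(), then a byte order mark, and only then UTF-8 as a last resort. The class name InputSourceReaders and the method open are hypothetical and not part of this project. The sketch does not parse an XML declaration or handle UTF-32, so it is not a complete detector.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import org.xml.sax.InputSource;

/** Hypothetical helper, for illustration only. */
public final class InputSourceReaders {
    private InputSourceReaders() {}

    /** Opens a Reader for the source without assuming UTF-8 up front. */
    public static Reader open(InputSource source) throws IOException {
        // A character stream is already decoded, so prefer it when present.
        if (source.getCharacterStream() != null) {
            return source.getCharacterStream();
        }
        InputStream bytes = source.getByteStream();
        if (bytes == null) {
            throw new IOException("InputSource has no byte or character stream");
        }
        // Honour an encoding the producer declared explicitly.
        if (source.getEncoding() != null) {
            return new InputStreamReader(bytes, Charset.forName(source.getEncoding()));
        }

        // Otherwise peek at up to three bytes and look for a byte order mark.
        PushbackInputStream in = new PushbackInputStream(bytes, 3);
        byte[] head = new byte[3];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }

        Charset charset = StandardCharsets.UTF_8; // assumption: fallback of last resort
        int bomLength = 0;
        if (n >= 3 && (head[0] & 0xFF) == 0xEF && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            bomLength = 3; // UTF-8 BOM
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            charset = StandardCharsets.UTF_16BE;
            bomLength = 2;
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            charset = StandardCharsets.UTF_16LE; // UTF-32LE is not distinguished here
            bomLength = 2;
        }

        // Return the bytes after the BOM to the stream so no content is lost.
        if (n > bomLength) {
            in.unread(head, bomLength, n - bomLength);
        }
        return new InputStreamReader(in, charset);
    }
}
```

Call sites that currently wrap getByteStream() in a UTF-8 reader could call InputSourceReaders.open(source) instead. Whether a silent UTF-8 fallback is acceptable, or whether an undetectable encoding should be an error, is a design decision for this issue.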