Wrong base64 encode of unicode chars
rjcoelho opened this issue · 1 comment
rjcoelho commented
Base64 encode/decode of Unicode characters first tries to convert the input to UTF-8, which is the right idea, but the implementation is wrong.
(new Hashes.Base64()).encode('张')
"5Q=="
(new Hashes.Base64()).setUTF8(false).encode(unescape(encodeURIComponent('张')))
"5byg"
window.btoa(unescape(encodeURIComponent('张')))
"5byg"
Implementation-wise, utf8Encode() returns the same result as unescape(encodeURIComponent()), so the problem is elsewhere. See https://github.com/davidchambers/Base64.js/blob/master/base64.js for a window.atob() shim. I think len should be computed after the UTF-8 encode.
Also, Base64 decode is wrong.
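For reference, "5Q==" is what you get when only the first UTF-8 byte (0xE5) is encoded, which fits a length taken before the UTF-8 conversion. Here is a minimal sketch of the ordering I mean. It is not the library's code: `base64EncodeUtf8` and `B64_ALPHABET` are hypothetical names, and it assumes a runtime with a global `TextEncoder`.

```javascript
// Hypothetical helper, not part of jshashes.
const B64_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

function base64EncodeUtf8(str) {
  // Convert to UTF-8 bytes first...
  const bytes = new TextEncoder().encode(str);
  // ...and only then take the length, so multi-byte characters are fully covered.
  const len = bytes.length;
  let out = '';
  for (let i = 0; i < len; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < len ? bytes[i + 1] : 0;
    const b2 = i + 2 < len ? bytes[i + 2] : 0;
    // Pack three bytes into 24 bits, then emit four 6-bit groups.
    const triple = (b0 << 16) | (b1 << 8) | b2;
    out += B64_ALPHABET[(triple >> 18) & 63];
    out += B64_ALPHABET[(triple >> 12) & 63];
    // Pad with '=' where the input ran out of bytes.
    out += i + 1 < len ? B64_ALPHABET[(triple >> 6) & 63] : '=';
    out += i + 2 < len ? B64_ALPHABET[triple & 63] : '=';
  }
  return out;
}

console.log(base64EncodeUtf8('张')); // "5byg"
```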
(new Hashes.Base64()).decode('5byg')
"o("
(new Hashes.Base64()).setUTF8(false).decode('5byg')
"o("
decodeURIComponent(escape(window.atob("5byg")))
"张"
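The decode side needs the mirror-image order: Base64 to raw bytes first, then UTF-8 decode over the whole byte sequence. A rough sketch under the same assumptions (hypothetical names `base64DecodeUtf8` and `B64_CHARS`, a global `TextDecoder`, standard alphabet only, no whitespace handling):

```javascript
// Hypothetical helper, not part of jshashes.
const B64_CHARS =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

function base64DecodeUtf8(b64) {
  const clean = b64.replace(/=+$/, ''); // drop trailing padding
  const bytes = [];
  let buffer = 0; // bit accumulator
  let bits = 0;   // number of unread bits currently in the accumulator
  for (const ch of clean) {
    const value = B64_CHARS.indexOf(ch);
    if (value === -1) throw new Error('Invalid base64 character: ' + ch);
    // Append 6 bits; the mask keeps the accumulator small.
    buffer = ((buffer << 6) | value) & 0xffff;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((buffer >> bits) & 0xff);
    }
  }
  // Interpret the collected bytes as UTF-8 only after all of them are decoded.
  return new TextDecoder().decode(new Uint8Array(bytes));
}

console.log(base64DecodeUtf8('5byg')); // "张"
```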
h2non commented
I'll probably remove Base64 support in a future version since it's well-supported natively by JavaScript engines in the browser/node.js.
Alternatively you could use this implementation, which is more reliable:
https://gist.github.com/h2non/37a6d588271fc9c1e828
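For the native route in Node.js, a minimal sketch using the built-in `Buffer` could look like the following. The wrapper names `encodeNative` and `decodeNative` are hypothetical. In the browser, the `btoa`/`atob` calls shown in the report above play the same role.

```javascript
// Node.js only: Buffer does the UTF-8 conversion and the Base64 step together.
function encodeNative(str) {
  return Buffer.from(str, 'utf8').toString('base64');
}

function decodeNative(b64) {
  return Buffer.from(b64, 'base64').toString('utf8');
}

console.log(encodeNative('张'));    // "5byg"
console.log(decodeNative('5byg')); // "张"
```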