Invalid data is read in JSONGet when a struct with Unicode strings is passed into JSONSet.
Closed this issue · 1 comments
Describe the bug
JSONGet does not produce the correct output for Cyrillic text.
To Reproduce
Steps to reproduce the behavior:
- Use JSONSet to write a structure with string fields. Fill the fields with Unicode characters.
- Use res, err := JSONGet() to read the structure.
- Convert res into []byte, either manually (res.([]byte)) or using redigo.Bytes(res).
- Unmarshal the []byte result via json.Unmarshal() into the structure.
- Compare the values you have written with the values json.Unmarshal produced from the JSONGet result.
- The new structure will contain fields with different (seemingly random) characters.
Expected behavior
Fields in first structure (which we have written) and the second one (which was read) should match.
Additional context
The problem I found lies within the rjs.StringToBytes function, which is called from JSONGet. It contains the following lines (_lst is a string, by is a []byte):
for _, s := range _lst {
	by = append(by, byte(s))
}
Here, s is a rune, which is an alias for int32. When we convert it into a byte, we lose all but the least significant byte, so any character outside the single-byte range is corrupted. The fix is pretty straightforward: we just need to convert the string into []byte directly, without looping over each rune:
by = []byte(_lst)
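To make the difference concrete, here is a small sketch that compares the two conversions byte by byte. truncateRunes is a hypothetical helper that reproduces the loop quoted above; it is not the library's actual function.

```go
package main

import "fmt"

// truncateRunes reproduces the per-rune loop quoted above.
func truncateRunes(s string) []byte {
	var by []byte
	for _, r := range s {
		// byte(r) keeps only the low 8 bits of the code point.
		by = append(by, byte(r))
	}
	return by
}

func main() {
	// "ж" is U+0436 (Cyrillic), "ç" is U+00E7 (Latin with cedilla).
	for _, s := range []string{"ж", "ç"} {
		fmt.Printf("%q: per-rune % x, direct % x\n", s, truncateRunes(s), []byte(s))
	}
}
```

For "ж" the loop yields the single byte 0x36 (the ASCII digit 6), while the direct conversion yields the two-byte UTF-8 sequence d0 b6. For "ç" the loop yields the lone byte 0xe7, which is not valid UTF-8 on its own, while the direct conversion yields c3 a7.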
I've copied JSONGet into my own code and applied this fix, and my Unicode problem was solved.
I was having problems with Brazilian Portuguese accentuation and it worked!