JString is broken (UTF-8 to UTF-16 conversion and vice versa)
ChristopherHX opened this issue · 4 comments
Expected behavior:
- creating a JString with a UTF-8 string (char*)
- getting the string with getStringUTFChars returns a multibyte sequence
- getting the same string with getStringChars returns a UTF-16 sequence
Actual behaviour:
- creating a JString with a UTF-8 string (char*)
- getting the string with getStringUTFChars returns a multibyte sequence
- getting the same string with getStringChars returns the same multibyte sequence, merely cast to the UTF-16 type
The following code causes this behaviour.
One of these should convert the sequence, or remember its code page:
- fake-jni/src/jni/native/string.cpp, line 27 in 3be5c1c
- fake-jni/src/jni/native/string.cpp, line 53 in 3be5c1c
- Line 24 in 3be5c1c
- Line 28 in 3be5c1c
I'm aware that the JNI spec uses modified UTF-8 rather than standard UTF-8.
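As a rough illustration of the conversion this issue asks for, here is a minimal sketch of decoding UTF-8 into UTF-16 code units. The function name utf8ToUtf16 is hypothetical and not part of fake-jni. It handles standard UTF-8 only, so the JNI-specific modified encoding (which treats U+0000 and supplementary characters differently) is not covered, and overlong forms are not rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of fake-jni: decodes standard UTF-8 into UTF-16.
// Assumption: the input is standard UTF-8, not JNI's modified UTF-8.
// Overlong encodings and encoded surrogates are not rejected, to keep this short.
std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;   // decoded code point
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b0 < 0x80)                { cp = b0;        extra = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");

        if (cp < 0x10000) {
            // Basic Multilingual Plane: a single 16-bit code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary plane: encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With something like this, a constructor taking a char* could store real 16-bit code units, and getStringChars would no longer hand back reinterpreted bytes.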
This issue also applies to the minecraft-linux fork, which I need to fix to get Xbox Live working again.
getStringChars should be null-terminated, because Microsoft's new Xbox Live code depends on this property holding. I'm unsure whether it is guaranteed,
but Android's JVM must do it the same way.
- create a JString((jchar*)u"1.16.20.03", 10)
- getStringChars returns u"1.16.2" (not null-terminated)
- getStringLength returns 5, but the length passed to the constructor is 10
- memcpy has to copy 2*getStringLength bytes to copy a char16_t array
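The length handling in the steps above could be sketched as follows. The helper copyUtf16 is a hypothetical name for illustration only, not the fake-jni code. It assumes the caller passes the length in 16-bit code units, and it appends a terminator so that callers relying on null-termination keep working.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical helper, not part of fake-jni: copies `length` UTF-16 code units
// from `src` and appends a terminating u'\0'.
std::unique_ptr<char16_t[]> copyUtf16(const char16_t* src, std::size_t length) {
    // One extra code unit is reserved for the terminator.
    std::unique_ptr<char16_t[]> dst(new char16_t[length + 1]);
    // memcpy counts bytes, so the code-unit count is scaled by sizeof(char16_t).
    std::memcpy(dst.get(), src, length * sizeof(char16_t));
    dst[length] = u'\0';
    return dst;
}

// Counts code units up to the terminator, similar to strlen for char16_t.
std::size_t utf16Length(const char16_t* s) {
    return std::char_traits<char16_t>::length(s);
}
```

For the example string, copyUtf16(u"1.16.20.03", 10) keeps all 10 code units, and the reported length stays equal to the length passed in.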
The JNI docs are extremely unclear here, but some brief research indicates that getStringChars does not guarantee that the returned string is null-terminated. getStringUTFChars does guarantee null-termination for the modified UTF-8 strings it returns.
The implementation of JString here is very confusing! It inherits from JCharArray, which is an array of jchar, so 16-bit code units. However, as pointed out in the original comment, the constructor that takes a char* uses memcpy to copy those bytes into the internal array, which seems wrong:
Lines 22 to 25 in 3be5c1c
...and the constructor that takes JChar*, JInt size also uses memcpy, but uses size as the byte length of the string (it's not, it's the length of the string in 16-bit code units) and initializes the internal array to be 2*size in length, which is twice as large as it needs to be:
Lines 27 to 29 in 3be5c1c
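To make the char* problem concrete, here is a small sketch contrasting a byte-wise memcpy into 16-bit storage with an element-wise widening. Both function names are hypothetical, and packBytes only imitates the pattern described above rather than reproducing the actual fake-jni constructor. The widening version assumes pure ASCII input; anything beyond ASCII would need a real UTF-8 decoder.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Imitates the problematic pattern: raw bytes are copied into 16-bit storage,
// so two input bytes end up packed into each code unit.
std::vector<char16_t> packBytes(const char* s) {
    const std::size_t n = std::strlen(s);
    std::vector<char16_t> out((n + 1) / 2, 0);
    if (n != 0) {
        std::memcpy(out.data(), s, n);
    }
    return out;
}

// ASCII-only widening: each input byte becomes its own 16-bit code unit.
// Assumption: every byte in `s` is below 0x80.
std::vector<char16_t> widenAscii(const char* s) {
    const std::size_t n = std::strlen(s);
    std::vector<char16_t> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<char16_t>(static_cast<unsigned char>(s[i]));
    }
    return out;
}
```

For "abcd", packBytes yields two code units that each hold two packed bytes, while widenAscii yields the four code units u'a' through u'd'.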
Unfortunately, the prototype version of fake-jni is no longer maintained; see the README. A new version is on the way. It should serve as a near drop-in replacement for the prototype and does not have any of the bugs described in the issue tracker.