dukeify/fake-jni

JString is broken (utf8 to utf16 conversion and vice versa)

ChristopherHX opened this issue · 4 comments

Expected behavior:

  • creating a JString with a UTF-8 string (char*)
  • getStringUTFChars returns a multibyte sequence
  • getStringChars on the same string returns a UTF-16 sequence

Actual behaviour:

  • creating a JString with a UTF-8 string (char*)
  • getStringUTFChars returns a multibyte sequence
  • getStringChars on the same string returns the same multibyte sequence, merely cast to the UTF-16 type

The following code causes this behaviour.
One of these should convert the sequence, or remember its code page.

I'm aware that the JNI spec does not use standard UTF-8, but modified UTF-8.
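
A minimal sketch of the kind of conversion the char* constructor would need, assuming standard UTF-8 input (JNI's modified UTF-8 differs in how it encodes U+0000 and supplementary characters, which this sketch does not handle). The helper name utf8ToUtf16 is hypothetical and not part of fake-jni:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of fake-jni: decodes standard UTF-8 into
// UTF-16 code units. Malformed sequences become U+FFFD. Overlong encodings
// are not rejected, to keep the sketch short.
std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else                            { cp = 0xFFFD;      extra = 0; }
        ++i;

        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            // Stop at end of input or at a byte that is not 10xxxxxx.
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (!ok || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            cp = 0xFFFD;
        }

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With something like this, the 10-byte input "1.16.20.03" yields 10 UTF-16 code units (20 bytes) rather than 10 bytes reinterpreted as 5 code units.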

This issue also applies to the minecraft-linux fork, which I need to fix to get Xbox Live working again.

getStringChars should be null-terminated, because Microsoft's new Xbox Live code depends on this property being true.
So Android's JVM has to do it the same way.
I'm unsure.

  • create a JString((jchar*)u"1.16.20.03", 10)
  • getStringChars returns u"1.16.2" (not null-terminated)
  • getStringLength returns 5, but the length passed to the constructor is 10
  • memcpy has to copy 2*getStringLength bytes to copy a char16_t array

luser commented

The JNI docs are extremely unclear here, but some brief research indicates that getStringChars does not guarantee that the returned string is null-terminated. getStringUTFChars does guarantee null-termination for the modified UTF-8 strings it returns.
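
Since the terminator cannot be relied on, a caller can copy by explicit length instead. A minimal sketch, in which `chars` and `length` stand in for the results of getStringChars and getStringLength, and the helper name copyUtf16 is hypothetical:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of fake-jni or JNI: copies exactly `length`
// UTF-16 code units, so nothing is read past the end of the source buffer.
// The returned std::u16string keeps its own null terminator internally.
std::u16string copyUtf16(const char16_t* chars, std::size_t length) {
    return std::u16string(chars, length);
}
```

The copy is safe to hand to code that expects a null-terminated buffer via c_str(), whether or not the original buffer was terminated.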

luser commented

The implementation of JString here is very confusing! It inherits from JCharArray which is an array of jchar, so 16-bit code units. However, as pointed out in the original comment, the constructor that takes a char* uses memcpy to copy those bytes into the internal array, which seems wrong:

JString::JString(const char * str) : JString((JInt)strlen(str))
{
    memcpy(getArray(), str, (size_t)length);
}

...and the constructor that takes JChar*, JInt size also uses memcpy, but uses size as the byte length of the string (it's not, it's the length of the string in 16-bit code units) and initializes the internal array to be 2*size in length, which is twice as large as it needs to be:

JString::JString(const JChar * str, JInt size) : JString(size * 2) {
    memcpy(getArray(), (char *)str, (size_t)size);
}
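
For comparison, a sketch of the sizing this comment argues for: the buffer holds `size` elements and memcpy receives `size * sizeof(char16_t)` bytes. It uses a plain std::vector rather than the real JString/JCharArray classes, the function name copyCodeUnits is hypothetical, and the extra terminator slot is an assumption made here to match the null-termination request in the original report:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the JChar* constructor, not fake-jni code.
// `size` counts 16-bit code units, not bytes.
std::vector<char16_t> copyCodeUnits(const char16_t* str, std::size_t size) {
    // One element per code unit, plus one zeroed slot as a terminator
    // (an assumption of this sketch, not something JNI requires).
    std::vector<char16_t> buf(size + 1, u'\0');
    if (size > 0) {
        // memcpy takes a byte count, so scale by the element size.
        std::memcpy(buf.data(), str, size * sizeof(char16_t));
    }
    return buf;
}
```

Called with u"1.16.20.03" and 10, this copies all 20 bytes, so the full version string survives instead of being cut off after the first 10 bytes.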

Unfortunately, the prototype version of fake-jni is no longer maintained; see the README. A new version is on the way, which should serve as a near drop-in replacement for the prototype and does not have any of the bugs described in the issue tracker.