dolphinsmalltalk/DolphinVM

UnicodeStrings have single byte null terminator (should be two zero bytes)

Closed · 1 comment

Dolphin's VM has the concept of null-terminated byte objects, which carry an implicit null terminator. The terminator is allocated and zeroed, but is not included in the reported size (although it is in the #basicSize). However, it is only ever a single byte. UnicodeString instances (which are really UTF-16 strings) have two-byte elements, yet the VM still allocates only a single-byte null terminator. Intermittently this may cause a fault when reading from a UnicodeString: the second byte of what should be a two-byte terminator is never zeroed and may still hold a non-zero byte left over from a previous use of the memory. In that case the final two bytes are interpreted as a valid code unit rather than a null terminator, so a reader can run past the end of the string.
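
The failure mode described above can be modelled outside the VM. The following is a minimal Python sketch, not Dolphin VM code: `simulate_block` and `read_utf16z` are hypothetical helpers, and the stale byte value `0x41` and the 16-byte block size are assumptions chosen for illustration. It fills a reused block with stale bytes, writes the string followed by a zeroed terminator of one or two bytes, and then scans for a 16-bit zero the way a UTF-16 consumer would.

```python
def simulate_block(text: str, terminator_bytes: int,
                   stale: int = 0x41, block_size: int = 16) -> bytes:
    """Model a reused heap block (hypothetical, not the VM's allocator).

    The block starts out full of stale bytes. The UTF-16LE characters are
    written at the start, and only `terminator_bytes` bytes after them are
    zeroed.
    """
    block = bytearray([stale]) * block_size
    encoded = text.encode("utf-16-le")
    block[0:len(encoded)] = encoded
    end = len(encoded) + terminator_bytes
    block[len(encoded):end] = bytes(terminator_bytes)
    return bytes(block)


def read_utf16z(memory: bytes) -> str:
    """Read 16-bit code units until a zero unit (or the end of the block)."""
    pos = 0
    while pos + 2 <= len(memory):
        unit = int.from_bytes(memory[pos:pos + 2], "little")
        if unit == 0:
            break  # found a genuine two-byte null terminator
        pos += 2
    return memory[:pos].decode("utf-16-le", errors="replace")


# Two zero bytes: the reader stops exactly at the end of the string.
good = read_utf16z(simulate_block("abc", terminator_bytes=2))

# One zero byte: the terminator's second byte is stale (0x41), so the
# reader sees the code unit 0x4100 and keeps going.
bad = read_utf16z(simulate_block("abc", terminator_bytes=1))
```

In this model `good` is `"abc"`, while `bad` begins with `"abc"` and then continues with characters decoded from the stale bytes. In a real process the scan would only stop at the next 16-bit zero that happens to be in memory, which is why the symptom is intermittent.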

It is worth noting that there is no danger of memory corruption from writing off the end of a UnicodeString that is one byte too short. The block required to represent the characters is always of even length, and adding one null byte makes the requested allocation odd. The VM always rounds allocations up to a multiple of 4 or 8 bytes, so the block it actually allocates always extends at least two bytes past the character data. For example, suppose you have a 7-character UnicodeString. This requires 14 bytes for the characters, to which the VM adds 1 for the null terminator, making 15 bytes. It then rounds this up to a multiple of 4, giving 16 bytes.
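
The rounding argument can be checked with a few lines of arithmetic. This is a sketch under the assumptions stated in the paragraph above (two bytes per character, one terminator byte, rounding up to a multiple of 4 or 8); `rounded_allocation` is a hypothetical name, not a function in the VM.

```python
def rounded_allocation(char_count: int, terminator_bytes: int = 1,
                       granularity: int = 4) -> int:
    """Bytes actually allocated for a UTF-16 string of `char_count` characters.

    Hypothetical model: request 2 bytes per character plus the terminator,
    then round up to the next multiple of `granularity`.
    """
    requested = char_count * 2 + terminator_bytes
    return -(-requested // granularity) * granularity  # ceiling division


# The worked example from the text: 7 characters -> 14 + 1 = 15 -> 16 bytes.
example = rounded_allocation(7)

# Smallest slack past the character data over a range of lengths, for both
# rounding granularities mentioned in the text.
min_slack = min(
    rounded_allocation(n, granularity=g) - 2 * n
    for n in range(0, 1000)
    for g in (4, 8)
)
```

Under these assumptions `example` is 16 and `min_slack` is 2, so a two-byte terminator always fits inside the block that was already allocated.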

Fixed in VM 7.0.54