pawn-lang/sa-mp-fixes

Bugs in string functions

Daniel-Cortez opened this issue · 5 comments

Hello.

I recently found several bugs in string functions and fixed them in my Pawn fork, pawn-3.2-plus, thought the info about them would be useful here as well.

ispacked()

Returns invalid value (false) when a packed string starts with a symbol with code 128 and above (e.g. a ciryllic symbol).
Example:

static const str[] = !"абв";
for (new i = 0; i < sizeof(str) * (cellbits / charbits); ++i)
    printf("str{%d} = %02x", i, str{i});
printf("ispacked(str) = %d", ispacked(str));

Output:

str{0} = E0
str{1} = E1
str{2} = E2
str{3} = 00
ispacked(str) = 0

strfind()

The function is prone to OOB access when the search start index is negative.
Example:

static const str1[] = "0123456789";
static const str2[] = "abcdefghij";
#pragma unused str1
// normally strfind() should return -1 since there's no "0" in "abcdefghij"
new result = strfind(str2, "5", false, -10);
printf("result = %d", result);

Output:

result = -7

strdel()

The function is prone to OOB access when the index of the first character to remove is negative.
Example:

static str1[] = "1234567890";
static str2[] = "abcdefghij";
printf("[before] str1: %s", str1);
printf("[before] str2: %s", str2);
strdel(str2, -11, 1);
printf("[after] str1: %s", str1);
printf("[after] str2: %s", str2);

Output:

[before] str1: 1234567890
[before] str2: abcdefghij
[after] str1: bcdefghij
[after] str2: abcdefghij

valstr()

The function doesn't take the buffer size into account (it doesn't have a size argument).
Actually I'm not sure if this is even considered to be a bug in this repo; all of the abovementioned bugs were found by close code inspection, but this one is really obvious (IMHO, one look at the valstr() header in string.inc should be pretty much enough to notice it) - and yet it's still not fixed here.
Anyway, this bug should be really easy to work around by adding a size = sizeof(dest) argument into FIXES_valstr() and using it there.

VVWVV commented

ispacked function fix:

stock ispacked(const string[])
{
    #emit lref.s.pri string
    #emit const.alt  ucharmax
    #emit geq
    #emit retn
    return 0; /* make compiler happy. */
}

For valstr, that would be a code-breaking change (the change could stop code compiling), and updates to user code to deal with that would result in incompatibilities between fixes.inc code and non-fixes.inc code. That's also really a programmer error, in that they didn't pass in a large enough buffer, rather than a semantic error in the performance of the function. The question is what is the correct behaviour even in other circumstances? If the length was given, and was insufficient, should it truncate or fail?

@VVWVV Care to make a pull request for that?

For valstr, that would be a code-breaking change (the change could stop code compiling)

You mean for emit-related code or for "regular" code?

That's also really a programmer error, in that they didn't pass in a large enough buffer

IMHO, it's rather a design error. Of course a scripter should provide a buffer big enough to fit the resulting string, but the one who implements the function is also supposed to make sure it verifies the arguments and if it isn't prone to buffer overflow, NULL pointer dereference etc. But yeah, there's probably no point in proving my point if design errors aren't considered errors here.

The question is what is the correct behaviour even in other circumstances? If the length was given, and was insufficient, should it truncate or fail?

Well, in pawn-3.2-plus I made valstr truncate the result because that's exactly what the other string functions do in such situations.
Besides, you said it's a programmer error if the buffer isn't big enough anyway.

No, for regular code, adding an extra parameter default sizeof parameter is not backwards-compatible.