This is a (slightly edited) copy of two articles I posted to the comp.std.c Usenet newsgroup on Thu 2023-06-01.
Message-ID: <87fs7afpcw.fsf@nosuchdomain.example.com>
Message-ID: <87bkhyfnm7.fsf@nosuchdomain.example.com>
Quick summary:
- The new generic string functions in C23 (
strchr
,strpbrk
,strrchr
,strstr
) could break some code is valid in C17 and earlier, specifically code that passes avoid*
orconst void*
argument. Such code is probably contrived and unlikely to appear in the real world.. A minor update to C23 could address this. - The
memchr()
generic function has a similar issue that's more likely to show up in real-world code, when a pointer to a type other thanvoid
is passed. I do not offer a solution to this, but the issue should be noted. - The underlying issue is that the implicit type conversions that are applied to function arguments are not applied to the operands of a generic selection, but these generic functions are intended to work like functions and to replace ordinary functions with the same names from earlier editions of the C standard.
Subject: Can the new generic string functions accept void*
arguments?
The latest draft of the upcoming C23 standard is:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3096.pdf
It introduces several type-generic functions in <string.h>
, replacing
normal functions of the same names: memchr
, strchr
, strpbrk
, strrchr
,
strstr
.
I'll use strchr()
as an example; the same applies to the other str*()
generic functions (but not to memchr()
).
But see below for a discussion of memchr()
.
The problem this solves is that calling strchr()
with a const char*
argument yields a (non-const
) char*
result that points into the array.
For example:
#include <stdio.h>
#include <string.h>
int main(void) {
const char s[] = "hello";
char *p = strchr(s, 'h');
*p = 'J'; // Undefined behavior
printf("%s\n", s); // Likely to print "Jello"
}
This makes it possible to obtain a non-const pointer to a const object without a pointer cast.
The C23 strchr()
generic function returns a char*
if the first argument
is a char*
, or a const char*
if the first argument is a const char*
.
The stateless search functions in this section (
memchr
,strchr
,strpbrk
,strrchr
,strstr
) are generic functions. These functions are generic in the qualification of the array to be searched and will return a result pointer to an element with the same qualification as the passed array. If the array to be searched isconst
-qualified, the result pointer will be to aconst
-qualified element. If the array to be searched is notconst
-qualified, the result pointer will be to an unqualified element.
So far so good, and I definitely approve of this change. It does break
code that calls strchr()
with a const char*
argument and assigns the
result to a (non-const
) char*
object. That's IMHO a minor issue, and
arguably breaking such code is part of the point of the change. (Making
string literals const
would be similar, but I suppose that's still a
bridge too far.)
But I've thought of a way in which this could break some existing valid
code, namely code that passes a void*
or const void*
argument to
strchr()
.
Currently (in C17 and earlier), since void*
can be implicitly
converted to char*
and vice versa, such a call is valid. (I can't
think of a good reason to write such a call, but my imagination is
not unlimited.)
Question: Is this a valid call in C23? (It's valid in C17.)
char hello[] = "hello";
void *p = strchr((void*)hello, 'h');
An implementation of the generic strchr()
will presumably use a generic
selection in a macro definition. If the generic selection covers only
types char*
and const char*
, the call will violate a constraint. If it
also covers void*
and const void*
, the call will be valid.
The current wording in N3096 suggests that only char*
and const char*
are covered, implying that a call with a void*
or const void*
argument
is a constraint violation.
I suggest that the C23 standard should specify whether void*
arguments are valid or not. I have a slight preference for making
them valid. If so, the simplest approach would be for strchr()
to return a char*
given a char*
or void*
argument, or a const char*
given a const char*
or const void*
argument.
Here's the second article I posted to comp.std.c, after realizing
that there's also an issue with memchr()
.
Just after I posted the above, I thought of a potential issue with
memchr()
that just might affect real code.
In C17 and earlier, memchr()
has this declaration:
void *memchr(const void *s, int c, size_t n);
Given the implicit conversions between void*
and other object pointer
types, the first argument can be a pointer to any const
object type.
This is something that might plausibly be used in practice, unlike
(I think) passing a void
pointer to the str*()
functions.
It's probably impractical to fix this, since it would require
the generic selection to cover all possible object pointer types.
Any code that depends on the current behavior would have to add
(void*)
or (const void*)
casts to ensure that the type actually
matches.
For example, this (contrived) program is valid in C17 and earlier:
#include <stdio.h>
#include <string.h>
int main(void) {
const unsigned u = 0x12345678;
printf("u = 0x%x", u);
unsigned char *p = memchr(&u, 0x34, sizeof u);
if (p != NULL) printf(", p points to 0x%x", *p);
putchar('\n');
}
The output is:
u = 0x12345678, p points to 0x34
(Conceivably p
might be a null pointer if unsigned int
has padding
bits that cause 0x34
not to be stored in a single byte. That's not
particularly relevant to the point.)
A call to memchr()
with a char*
argument is, I suspect, more
likely to appear in real code.
The underlying issue is that the implicit conversions that happen with
function arguments do not happen with operands of a generic selection.
(The generic functions in <tgmath.h>
are defined in a way that this
isn't an issue, as far as I can tell, though I haven't looked closely
at the C23 updates.)