vvaltchev/tilck

A note on pointer-to-void to pointer-to-function cast

PetarKirov opened this issue · 2 comments

This is a follow-up to our discussion in [0] from the other day. I was interested in the details, so I did some research, and I wanted to share what I found, which I hope may be of interest to you as well.

I checked several answers on StackOverflow and Reddit and then went straight to the source, the C99 standard [1]. I didn't find any clause explicitly saying that casting between pointer to data and pointer to function may yield undefined behavior. It was just not listed explicitly as allowed in the main normative text:

6.3.2.3 Pointers
1. A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

It is here that pointer to function types are not mentioned and I think this paragraph is what causes some people to consider this undefined behavior.

[...]
5. An integer may be converted to any pointer type (Petar: including pointer-to-function?). Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

I interpret this to mean that if you know the address of a function, the compiler should happily implicitly convert the integer literal to the desired pointer to function type:

// Create a pointer to the function at address 128:
int (*fp)() = 128;

And as expected, compiling this with gcc with -Wall -std=c99 yielded no warnings for me.

6. Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

Given 5. and 6., one could conclude that if a given compiler provides an integer type large enough to hold a pointer value (i.e. intptr_t exists), then converting a pointer-to-data to an integer and then converting that number to a pointer-to-function should be allowed.
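A minimal sketch of that integer round trip, assuming an implementation that provides uintptr_t and uses the same width for function and data addresses. The names answer, int_fn and roundtrip_via_integer are made up for this illustration. Both conversions are written as explicit casts and both results are implementation-defined, so this shows what typically works on mainstream targets rather than something the standard guarantees:

```c
#include <stdint.h>

/* Hypothetical function, used only for this illustration. */
static int answer(void) { return 42; }

typedef int (*int_fn)(void);

/* Round-trips a function pointer through an integer.
 * Both steps are implementation-defined (C99 6.3.2.3 p5 and p6);
 * the standard's uintptr_t round-trip guarantee covers only void *. */
static int_fn roundtrip_via_integer(int_fn f)
{
    uintptr_t bits = (uintptr_t)f;  /* pointer -> integer */
    return (int_fn)bits;            /* integer -> pointer */
}
```

On a typical flat-address-space target, roundtrip_via_integer(answer) compares equal to answer and can be called normally.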

8. A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the pointed-to type, the behavior is undefined.

This essentially is saying that you can store a pointer-to-function of one type in a pointer-to-function of a different type, as long as you don't try to call it through the wrong type.
(Initially I thought that this property (which I'm sure that you're well aware of) could save you from that cast, and that's why I'm mentioning it, but that's not the case. Anyways...)
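To illustrate that paragraph, here is a small sketch in which a function pointer is parked in a generic function-pointer type and converted back to its real type before the call. The names generic_fn, add_one and call_after_restoring are hypothetical; only the round trip itself comes from the quoted text:

```c
/* Generic function-pointer type, used purely as storage. */
typedef void (*generic_fn)(void);

/* Hypothetical function whose real type is int (*)(int). */
static int add_one(int x) { return x + 1; }

/* Converts the stored pointer back to its original type before calling.
 * Calling through generic_fn directly would be undefined behavior,
 * because void(void) is not compatible with int(int). */
static int call_after_restoring(generic_fn stored, int arg)
{
    int (*f)(int) = (int (*)(int))stored;
    return f(arg);
}
```

Usage would look like `generic_fn g = (generic_fn)add_one;` followed by `call_after_restoring(g, 41)`. No void * is involved at any point, which is why this property does not help with dlsym-style interfaces that return void *.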

Going a bit further, the POSIX standard, which essentially operates on top of a subset of the C standard (I mean it requires things that in theory won't be true on all targets that C supports), effectively requires that you be able to convert between pointer-to-void and pointer-to-function in cases like dlsym [2]:

The ISO C standard does not require that pointers to functions can be cast back and forth to pointers to data. Indeed, the ISO C standard does not require that an object of type void * can hold a pointer to a function. Implementations supporting the XSI extension, however, do require that an object of type void * can hold a pointer to a function. The result of converting a pointer to a function into a pointer to another data type (except void *) is still undefined, however. Note that compilers conforming to the ISO C standard are required to generate a warning if a conversion from a void * pointer to a function pointer is attempted as in:

fptr = (int (*)(int))dlsym(handle, "my_function");

From [3]:

The standard defines two levels of conformance: POSIX conformance, which is a baseline set of interfaces required of a conforming system; and XSI Conformance, which additionally mandates a set of interfaces (the "XSI extension") which are only optional for POSIX conformance. XSI-conformant systems can be branded UNIX 03. (XSI conformance constitutes the Single UNIX Specification version 3 (SUSv3).)

And indeed, not only do most C compilers allow it for typical (i.e. non-exotic and non-embedded) targets, it is also mentioned in the C99 standard under Annex J (informative; Petar: note, not normative) Portability issues, J.5 Common extensions:

J.5.7 Function pointer casts

  1. A pointer to an object or to void may be cast to a pointer to a function, allowing data to be invoked as a function (6.5.4).
  2. A pointer to a function may be cast to a pointer to an object or to void, allowing a function to be inspected or modified (for example, by a debugger) (6.5.4).

So, in summary, I believe the reason the C standard doesn't explicitly mandate this behavior has nothing to do with the strict aliasing rules, which were a late, poorly executed effort to catch up with Fortran in performance, but rather with the fact that back then pure Harvard architectures [4] were something that C needed to support.

In some systems, there is much more instruction memory than data memory so instruction addresses are wider than data addresses.

[..]

Microcontrollers are characterized by having small amounts of program (flash memory) and data (SRAM) memory, and take advantage of the Harvard architecture to speed processing by concurrent instruction and data access. The separate storage means the program and data memories may feature different bit widths, for example using 16-bit-wide instructions and 8-bit-wide data. They also mean that instruction prefetch can be performed in parallel with other activities.

So in practice, what this means for Tilck is that as long as you are only targeting modified Harvard architecture systems (main memory holds both instructions and data, but the CPU may have different internal data paths for each) and you keep the ELF constructor section technique, I think it's reasonable to say that you should be OK with a naive cast. Also, I think that if you're targeting an architecture where functions and data really are in completely separate address spaces and use different bus widths, the current cast will be just as much undefined behavior as the naive one.
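One way to make that assumption explicit is a compile-time check next to the cast. The following is only a sketch, assuming a C11 compiler; ctor_fn, ctor_from_data_ptr, ran_flag and mark_ran are hypothetical names and are not taken from the Tilck sources:

```c
/* Hypothetical constructor signature, for illustration only. */
typedef void (*ctor_fn)(void);

/* Refuse to build on targets where code and data addresses differ in size.
 * This does not prove the cast is meaningful, it only catches the obvious
 * pure-Harvard case with mismatched pointer widths. */
_Static_assert(sizeof(void *) == sizeof(ctor_fn),
               "function and data pointers must have the same size");

/* The "naive" cast: relies on the common extension from C99 J.5.7. */
static ctor_fn ctor_from_data_ptr(void *p)
{
    return (ctor_fn)p;
}

/* Hypothetical constructor used to exercise the helper. */
static int ran_flag;
static void mark_ran(void) { ran_flag = 1; }
```

With this in place, `ctor_from_data_ptr((void *)mark_ran)()` calls mark_ran on a typical modified Harvard target, and a port to a target with mismatched pointer sizes fails at compile time instead of misbehaving at run time.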

I don't have any preference about the code style, so you can leave it as it is. I just wanted to share my findings, as I know that, like me, you appreciate understanding things deeply ;)

Cheers,
Petar

Hi Petar,
thanks for the detailed research.
Overall, it makes sense to me, even if I'm far from wanting/trying to be a "language lawyer" in order to further pursue this issue.

I'd just like to quote a comment from dlopen's man page:
http://man7.org/linux/man-pages/man3/dlopen.3.html

           cosine = (double (*)(double)) dlsym(handle, "cos");

           /* According to the ISO C standard, casting between function
              pointers and 'void *', as done above, produces undefined results.
              POSIX.1-2003 and POSIX.1-2008 accepted this state of affairs and
              proposed the following workaround:

                  *(void **) (&cosine) = dlsym(handle, "cos");

              This (clumsy) cast conforms with the ISO C standard and will
              avoid any compiler warnings.

              The 2013 Technical Corrigendum to POSIX.1-2008 (a.k.a.
              POSIX.1-2013) improved matters by requiring that conforming
              implementations support casting 'void *' to a function pointer.
              Nevertheless, some compilers (e.g., gcc with the '-pedantic'
              option) may complain about the cast used in this program. */

The last time I checked this, maybe around 2013, before the "2013 Technical Corrigendum" became a thing, I remember (hopefully correctly) that, in the same man page, the officially recommended way was still to use the tricky cast. After that, at some point, I believe the comment was updated, along with the code in the example, to use the "normal" cast instead of that monstrosity.

Anyway, that man page was the reason I got used to this counter-intuitive cast style over the years. I'm happy to hear now that it's not really necessary anymore. Probably, given my super-conservative nature, I'll continue to use this trick for a few more years, until even "pretty old" compilers are guaranteed not to treat this as "undefined behavior", but it's good to know that things have changed.

Thanks again for the research!
Vlad

P.S.

Thanks to https://web.archive.org, I was able to see how that same man page looked several years ago, back in 2012: https://web.archive.org/web/20120402182207/http://man7.org/linux/man-pages/man3/dlopen.3.html

In the same example, the code + comment were:

           /* Writing: cosine = (double (*)(double)) dlsym(handle, "cos");
              would seem more natural, but the C99 standard leaves
              casting from "void *" to a function pointer undefined.
              The assignment used below is the POSIX.1-2003 (Technical
              Corrigendum 1) workaround; see the Rationale for the
              POSIX specification of dlsym(). */

           *(void **) (&cosine) = dlsym(handle, "cos");

So, I'm happy that I remembered it correctly!
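For comparison, the two spellings from the man page can be put side by side, together with a memcpy variant that copies the representation without dereferencing a type-punned pointer. This is only a sketch: lookup_stub is a hypothetical stand-in for dlsym() so that the example needs no dynamic loader, twice and unary_fn are made-up names, and the memcpy variant is not something either man page version recommends. All three forms assume that void * and function pointers share size and representation, which POSIX requires but ISO C does not:

```c
#include <string.h>

typedef double (*unary_fn)(double);

/* Hypothetical function standing in for a looked-up symbol. */
static double twice(double x) { return 2.0 * x; }

/* Hypothetical stand-in for dlsym(): hands back a function address as void *. */
static void *lookup_stub(void) { return (void *)twice; }

_Static_assert(sizeof(void *) == sizeof(unary_fn),
               "void * and function pointers must have the same size");

/* Style used by the current man page: a plain cast. */
static unary_fn via_plain_cast(void *sym) { return (unary_fn)sym; }

/* Style used by the 2012 man page: store through a void ** alias. */
static unary_fn via_posix_workaround(void *sym)
{
    unary_fn f;
    *(void **)(&f) = sym;
    return f;
}

/* Alternative: copy the bytes of the pointer representation. */
static unary_fn via_memcpy(void *sym)
{
    unary_fn f;
    memcpy(&f, &sym, sizeof f);
    return f;
}
```

On a POSIX system all three helpers should return the same pointer for the same input; they differ only in which diagnostics a given compiler emits.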

Digging a bit further, it is kind of a mess.

  • C89 explicitly listed it as undefined behavior:

    A.6.2

    • A pointer to a function is converted to a pointer to an object or a pointer to an object is converted to a pointer to a function (3.3.4).

    Yet, it also lists it as a common extension:

    A.6.5.7 Function pointer casts

    A pointer to an object or to void may be cast to a pointer to a function, allowing data to be invoked as a function (3.3.4). A pointer to a function may be cast to a pointer to an object or to void, allowing a function to be inspected or modified (for example, by a debugger) (3.3.4).

  • C99 doesn't list it as undefined behavior and again only mentions it under common extensions:

    J.5.7 Function pointer casts

    1. A pointer to an object or to void may be cast to a pointer to a function, allowing data to be invoked as a function (6.5.4).
    2. A pointer to a function may be cast to a pointer to an object or to void, allowing a function to be inspected or modified (for example, by a debugger) (6.5.4).
  • In C++98/03 it is not undefined behavior but ill-formed, meaning that the compiler is required to diagnose it.

  • In C++11 it's conditionally supported:

    5.2.10 Reinterpret cast

    Converting a pointer to a function into a pointer to an object type or vice versa is conditionally-supported.
    The meaning of such a conversion is implementation-defined, except that if an implementation supports
    conversions in both directions, converting a prvalue of one type to the other type and back, possibly with
    different cv-qualification, shall yield the original pointer value.

  • And certain versions of POSIX even require it (in others it's mentioned only for dlsym):

    2.12.3 Pointer Types

    All function pointer types shall have the same representation as the type pointer to void. Conversion of a function pointer to void * shall not alter the representation. A void * value resulting from such a conversion can be converted back to the original function pointer type, using an explicit cast, without loss of information.

    Note:
    The ISO C standard does not require this, but it is required for POSIX conformance.