PMunch/futhark

[question] libclang behavior on macros

Closed this issue · 1 comments

Hi there,
I asked this the other day in #main on Nim's Discord, but I figure the question got buried.

  • Does libclang expand macros when parsing?
  • Is there a way to keep them from expanding, and still have their AST accessible in some way?

Context:
I want to do something similar to what futhark/opir is doing, but not exactly the same.
I'm looking for a way to document C code using a Nim-based generator, and to auto-generate documentation for both MinC and Nim, with output in a format that Docusaurus understands (i.e. md/mdx/React).

I would have asked on the libclang-nim repository, but issues are disabled there and I didn't know where else to ask.

The problem here is that C macros are plain text substitution. They are expanded by the preprocessor before the C parser ever runs. libclang gives you the name of a macro and whether it is "function like" or not (and maybe the argument count), but apart from that it doesn't give you anything. This is because it can't parse the body of a macro, which can technically contain anything. Consider the following example of perfectly valid, albeit very strange, C code:

#include <stdio.h>

#define WEIRDNESS argv) {

int main(int argc, char** WEIRDNESS
    printf("Hello world\n");
    return 0;
}

What would the AST of the WEIRDNESS macro even be in this case?

The way I deal with this in Futhark is that I grab the extent of the definition, then open the file and read the code myself. With the body of the macro in hand, I then try to parse some known entities, like number literals. This is by far the weakest point of Futhark and I'd love a better solution. Maybe Zig does something more clever, as it basically does the same thing as Futhark internally when importing C modules. But IIRC it just tries to take the body of the macro out into a new file and parse that to see what shows up.
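
To make the "try to parse some known entities" step concrete, here is a minimal sketch in C of checking whether a macro body is a bare integer literal. This is not Futhark's actual code: the function name `macro_body_as_int` is hypothetical, and it assumes the body text has already been read out of the source file using the extent of the definition. Suffix handling is deliberately loose.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper (an assumption for illustration, not part of Futhark
 * or libclang): tries to read a macro body as a single integer literal.
 * Returns true and stores the value in *out on success. */
bool macro_body_as_int(const char *body, long long *out)
{
    assert(body != NULL && out != NULL);

    /* Skip leading whitespace. An empty body (e.g. an include guard)
     * is not a number. */
    while (isspace((unsigned char)*body)) body++;
    if (*body == '\0') return false;

    errno = 0;
    char *end = NULL;
    /* Base 0 accepts decimal, octal (leading 0) and hex (leading 0x),
     * which matches the C integer literal prefixes. */
    long long value = strtoll(body, &end, 0);
    if (end == body || errno == ERANGE) return false;

    /* Accept integer suffix characters (u, l in either case).
     * The combination is not validated strictly in this sketch. */
    while (*end == 'u' || *end == 'U' || *end == 'l' || *end == 'L') end++;
    while (isspace((unsigned char)*end)) end++;

    /* Anything left over means the body is more than a bare literal. */
    if (*end != '\0') return false;

    *out = value;
    return true;
}
```

With this sketch, a body like `0x10` is recognised as an integer, while the body of `WEIRDNESS` above (`argv) {`) or an expression like `(1 << 3)` is rejected, so a wrapper generator would have to skip such macros or handle them some other way.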