A C Compiler written in C
- C standard library
- `as` (tested with gnu-as and darwin-as [now clang-as], but any should be fine)
- `ld` (any should be fine)
- A C89 compiler to build this compiler with
The compiler implements C89, C99 and C11 (controllable via `-std=c89/c99/c11`).
System libraries are fully supported, including ABI compatibility with constructs such as `va_list`.
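As a minimal sketch of what `va_list` compatibility with system libraries means in practice, the function below forwards its variadic arguments to the system libc's `vsnprintf`. The name `format_message` is hypothetical and chosen only for illustration; it is not part of ucc.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper, not part of ucc: formats into buf by handing the
 * caller's variadic arguments to the system libc as a va_list. This only
 * works if the compiler's va_list layout matches the one libc expects. */
int format_message(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int written;

	va_start(args, fmt);
	written = vsnprintf(buf, size, fmt, args);
	va_end(args);

	return written;
}
```

A call such as `format_message(buf, sizeof buf, "%s=%d", "x", 42)` should leave `x=42` in `buf`.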
There are some major additions, listed below:
- Features added:
  - Microsoft/Plan 9 struct extensions
  - Lambdas/Objective-C style blocks
  - Trailing return types on functions
  - Namespace checking
- Standard support (not exhaustive):
  - C89, C99
  - C11
    - `_Bool`, `_Noreturn`, `_Alignof`, `_Alignas`
    - `_Generic`, `_Static_assert`
- Feature support:
  - Floating point types
  - Unsigned integers
  - `char`, `short`, `long` and (on 64-bit only) `long long`
  - Wide characters, wide strings
  - `volatile` qualifier
  - `register` storage class
- Common language extensions
  - `__auto_type`
  - GNU extensions (see below for a complete list)
    - GNU omitted middle operand in `?:`
    - GNU keywords - `asm`, `inline` (in C89), `typeof`, `__extension__`, `__attribute__`, `__label__`, `__alignof__`, `__alignas__`
    - GNU Array-range designated initialiser
  - Computed `goto` / address-of-label (`&&label`)
  - GCC/Clang builtins (not an exhaustive list):
    - `__builtin_unreachable`, `__builtin_trap`
    - `__builtin_types_compatible_p`
    - `__builtin_constant_p`
    - `__builtin_expect`
    - `__builtin_choose_expr`
    - `__builtin_add_overflow`, `__builtin_sub_overflow`, `__builtin_mul_overflow`
    - `__builtin_frame_address`, `__builtin_return_address`
- Code-generation Support/Optimisations:
  - Function inlining (`__attribute__((always_inline/noinline))`)
  - Position independent code generation (`-fpic`)
  - Position independent executable generation, permitting ASLR (`-fpie` / `-pie`)
  - Overflow-trapping arithmetic (`-ftrapv`)
  - Undefined behaviour trapping (`-fsanitize=undefined`)
  - Stack protector (`-fstack-protector`, `-fstack-protector-all`)
  - Symbol visibility (`-fvisibility=default/protected/hidden` / `-f[no-]semantic-interposition` / `__attribute__((visibility(...)))`)
  - DWARF Debug Symbols (`-g` / `-gline-tables-only`)
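A few of the listed C11 keywords and builtins can be combined in a short sketch. The names `type_name` and `saturating_add` are hypothetical, invented for this example; only `_Static_assert`, `_Generic`, `__builtin_expect` and `__builtin_add_overflow` come from the list above.

```c
#include <limits.h>

/* C11: reject the build early if a basic assumption does not hold. */
_Static_assert(sizeof(int) >= 2, "int must be at least 16 bits");

/* C11: select a string at compile time based on the argument's type. */
#define type_name(x) _Generic((x), int: "int", double: "double", default: "other")

/* Hypothetical helper: adds two ints, clamping to INT_MAX/INT_MIN instead
 * of overflowing. __builtin_add_overflow reports whether the exact result
 * fits in sum; __builtin_expect marks the overflow path as unlikely. */
int saturating_add(int a, int b)
{
	int sum;

	if (__builtin_expect(__builtin_add_overflow(a, b, &sum), 0))
		return b > 0 ? INT_MAX : INT_MIN;

	return sum;
}
```

Unlike `-ftrapv`, which traps on overflow, the builtin lets the caller decide what happens; here the result is clamped.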
The syntax for lambdas/blocks is similar to that in Objective-C. Closing over external variables isn't implemented yet, nor is the `__block` keyword.

    ^(parameters, ...) { body }

Other forms, with explicit return types, omitted parameters, or both parameters and return types omitted, are allowed:

    ^T (parameters, ...) { body }
    ^T { body }
    ^ { body }

When the return type is omitted, it is inferred from the first return statement in the body, or is `void` if there are none.

The result of the expression is a function block pointer (`T (^)(Args...)`), explicitly convertible to a function pointer.
#pragma ucc namespace expr_
This ensures that any declarations after this pragma begin with `expr_`, allowing you to enforce a namespace exported by each translation unit (`.c` file).
See namespace.c for an example.
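As a rough sketch, independent of the namespace.c example, a translation unit that follows the rule could look like the one below. The function names are hypothetical. Compilers that don't know the pragma will generally ignore it, possibly with a warning.

```c
/* expr.c (hypothetical file): every declaration after the pragma is
 * expected to start with the expr_ prefix. */
#pragma ucc namespace expr_

int expr_add(int lhs, int rhs)
{
	return lhs + rhs;
}

int expr_negate(int value)
{
	return -value;
}

/* A declaration such as `int parse_number(const char *);` placed here
 * lacks the prefix, so it is the kind of name the check is meant to flag. */
```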
Summarised from the GNU C Extensions page.
Supported | Extension Name | Description |
---|---|---|
✅ Yes | Statement Exprs | Putting statements and declarations inside expressions. |
✅ Yes | Local Labels | Labels local to a block. |
✅ Yes | Labels as Values | Getting pointers to labels, and computed gotos. |
❌ No | Nested Functions | Nested function in GNU C. |
✅ Yes | Nonlocal Gotos | Nonlocal gotos. |
❌ No | Constructing Calls | Dispatching a call to another function. |
✅ Yes | Typeof | typeof: referring to the type of an expression. |
✅ Yes | Conditionals | Omitting the middle operand of a ?: expression. |
❌ No | __int128 | 128-bit integers: __int128. |
🖥️ 64-bit targets only | Long Long | Double-word integers: long long int. |
❌ No | Complex | Data types for complex numbers. |
❌ No | Floating Types | Additional Floating Types. |
❌ No | Half-Precision | Half-Precision Floating Point. |
❌ No | Decimal Float | Decimal Floating Types. |
✅ Yes | Hex Floats | Hexadecimal floating-point constants. |
❌ No | Fixed-Point | Fixed-Point Types. |
❌ No | Named Address Spaces | Named address spaces. |
✅ Yes | Zero Length | Zero-length arrays. |
✅ Yes | Empty Structures | Structures with no members. |
✅ Yes | Variable Length | Arrays whose length is computed at run time. |
❌ No | Variadic Macros | Macros with a variable number of arguments. #define f(a, b...) ... |
✅ Yes | Escaped Newlines | Slightly looser rules for escaped newlines. |
❌ No | Subscripting | Any array can be subscripted, even if not an lvalue. (This is intentionally not supported) |
✅ Yes | Pointer Arith | Arithmetic on void-pointers and function pointers. |
✅ Yes | Variadic Pointer Args | Pointer arguments to variadic functions. |
✅ Yes | Pointers to Arrays | Pointers to arrays with qualifiers work as expected. |
✅ Yes | Initializers | Non-constant initializers. |
✅ Yes | Compound Literals | Compound literals give structures, unions or arrays as values. |
✅ Yes | Designated Inits | Labeling elements of initializers. |
✅ Yes | Case Ranges | case 1 ... 9 and such. |
❌ No | Cast to Union | Casting to union type from any member of the union. |
✅ Yes | Mixed Declarations | Mixing declarations and code. |
🔎 Partial | Function Attributes | Declaring that functions have no side effects, or that they can never return. |
🔎 Partial | Variable Attributes | Specifying attributes of variables. |
🔎 Partial | Type Attributes | Specifying attributes of types. |
✅ Yes | Label Attributes | Specifying attributes on labels. |
✅ Yes | Enumerator Attributes | Specifying attributes on enumerators. |
❌ No | Statement Attributes | Specifying attributes on statements. __attribute__((fallthrough)); |
✅ Yes | Attribute Syntax | Formal syntax for attributes. |
✅ Yes | Function Prototypes | Prototype declarations and old-style definitions. |
✅ Yes | C++ Comments | C++ comments are recognized. |
✅ Yes | Dollar Signs | Dollar sign is allowed in identifiers. |
✅ Yes | Character Escapes | \e stands for the character ESC. |
✅ Yes | Alignment | Determining the alignment of a function, type or variable. |
✅ Yes | Inline | Defining inline functions (as fast as macros). |
✅ Yes | Volatiles | What constitutes an access to a volatile object. |
🛠️ asm WIP | Using Assembly Language with C | Instructions and extensions for interfacing C with assembler. |
✅ Yes | Alternate Keywords | __const__ , __asm__ , etc., for header files. |
✅ Yes | Incomplete Enums | enum foo; , with details to follow. |
✅ Yes | Function Names | Printable strings which are the name of the current function. |
✅ Yes | Return Address | Getting the return or frame address of a function. |
❌ No | Vector Extensions | Using vector instructions through built-in functions. |
✅ Yes | Offsetof | Special syntax for implementing offsetof. |
❌ No | __sync Builtins | Legacy built-in functions for atomic memory access. |
❌ No | __atomic Builtins | Atomic built-in functions with memory model. |
🔎 Partial | Integer Overflow Builtins | Built-in functions to perform arithmetics and arithmetic overflow checking. |
❌ No | x86 specific memory model extensions for transactional memory | x86 memory models. |
❌ No | Object Size Checking | Built-in functions for limited buffer overflow checking. |
🔎 Partial | Other Builtins | Other built-in functions. |
❌ No | Target Builtins | Built-in functions specific to particular targets. |
❌ No | Target Format Checks | Format checks specific to particular targets. |
❌ No | Pragmas | Pragmas accepted by GCC. |
✅ Yes | Unnamed Fields | Unnamed struct/union fields within structs/unions. |
🛠️ TLS WIP | Thread-Local | Per-thread variables. |
✅ Yes | Binary constants | Binary constants using the 0b prefix. |
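Several of the extensions marked as supported can be illustrated in one short sketch: statement expressions with typeof, the omitted middle operand, case ranges, array-range designated initialisers with binary constants, and labels as values. All function, macro and variable names here are hypothetical, and `run_program` assumes its input contains only opcodes 0 to 2 and ends with opcode 0.

```c
/* Statement expression + typeof: each argument is evaluated exactly once. */
#define max_of(a, b) ({ __typeof__(a) a_ = (a); __typeof__(b) b_ = (b); a_ > b_ ? a_ : b_; })

/* Omitted middle operand: yields value if it is non-zero, else fallback. */
int value_or_default(int value, int fallback)
{
	return value ?: fallback;
}

/* Case range: matches every character from '0' to '9' inclusive. */
int is_digit(char c)
{
	switch (c) {
	case '0' ... '9':
		return 1;
	default:
		return 0;
	}
}

/* Array-range designated initialiser with binary constants:
 * indices 0..7 are set to 1, the remaining entries default to 0. */
static const int low_half_flags[16] = { [0b0000 ... 0b0111] = 1 };

int in_low_half(unsigned value)
{
	return low_half_flags[value & 0b1111];
}

/* Labels as values: a tiny interpreter dispatching through label addresses.
 * Opcodes: 0 = halt, 1 = increment, 2 = decrement. */
int run_program(const unsigned char *program)
{
	void *dispatch[] = { &&halt, &&increment, &&decrement };
	int accumulator = 0;

	goto *dispatch[*program++];

increment:
	accumulator++;
	goto *dispatch[*program++];

decrement:
	accumulator--;
	goto *dispatch[*program++];

halt:
	return accumulator;
}
```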
`ucc` generates `x86_64` assembly. It also has partial support for `MIPS`, which is currently unmaintained. There are plans to add `arm` too.
The code generator can target Linux-, Cygwin- and Darwin-based toolchains (handling differences in PLT calls, leading underscores, stack alignment, etc.).
Constant folding and a small amount of optimisation are done, but nothing heavy (the `feature/ir` branch plans to change this).
The ABI matches GCC and Clang's, or more specifically, the System V x86-64 psABI (modulo bugs, of which there is currently one - see nested_ret.c).
`ucc` can also dump its AST, similarly to clang, with `-emit=dump`.
make
If you plan on building the shim libc, or customising `CFLAGS`:
./configure [CC=...] [CFLAGS=...] [LDFLAGS=...]
`ucc` doesn't have a `make install` target yet. When run locally, `ucc` will use its own include files for `stdarg.h`, etc., but otherwise will use system includes and libraries.
`ucc` accepts the standard POSIX `cc` arguments, plus many additions; see `./ucc --help` for details.
- By-value argument passing for structure/union types (`feature/1st-class-struct-args`)
- `long long` types on 32-bit archs
- `long double` type
- Complex types (`_Complex`)
- Atomic types (`_Atomic`)
- Thread local storage (`_Thread_local`, `__thread`)
- The preprocessor will fail to expand the latter of several function macros all on the same line (such as glibc's `tgmath.h`)
- The preprocessor can't handle function macros that cross several lines
- `__asm__` statements are incomplete (see branch `feature/asm`)
./ucc -o hello hello.c
- preprocess, compile, assemble and link hello.c into hello
./ucc -o- -S test.c
- output assembly code for test.c
./ucc -o- -S -emit=dump test.c
- show the abstract parse tree
./ucc -c test.c
- (preprocess) compile and assemble test.c -> test.o
./ucc -c test.s
- assemble test.s -> test.o (preprocessing and compilation are skipped)
./ucc test.c a.o -o out b.a
- preprocess + compile test.c, and link with a.o and b.a to form the executable out
./ucc a.o b.c -E
- preprocess b.c - a.o is ignored since it's not linked with