
Compiler Project at Uppsala University

This project was made by Anton Weber and Davide Berdin for the Compiler Project course. The AyeC compiler understands a subset of the C language called uC.

The uC language

uC is a small subset of C, corresponding to a typical imperative, procedural language. The following sections describe in more detail which language elements are present.

Every uC program is also a valid C program. The syntax and semantics of uC are the same as those of full C, within the restrictions described here.

Lexical elements

  • Decimal integer constants and character literals. A character literal contains either a single printable character, or the \n escape sequence (line break). A character literal denotes an integer constant whose value is the representation code of the character.
  • Alphanumeric identifiers: non-empty sequences of letters or digits starting with a letter. An underscore is treated as a letter.
  • Keywords: char, else, if, int, return, void, and while.
  • Special symbols: !=, !, &&, (, ), *, +, , (comma), -, /, ;, <=, <, ==, =, >=, >, [, ], {, }.
  • White space characters: blank (32), newline (10), carriage return (13), form feed (12), and tab (9). The numbers are the ASCII representation codes for the characters.

  • Comments: /* followed by anything and terminated by */, and // followed by anything and terminated by end of line.
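
The identifier and keyword rules above can be sketched as a small classifier. This is only an illustration of the rules as stated here, not code from AyeC; the names word_kind, is_uc_letter and classify_word are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical word classes; the names are illustrative, not taken from AyeC. */
enum word_kind { WORD_INVALID, WORD_KEYWORD, WORD_IDENT };

/* uC treats an underscore as a letter. */
static int is_uc_letter(int c)
{
    return isalpha(c) || c == '_';
}

/* Classify a complete word as a keyword, an identifier, or neither. */
enum word_kind classify_word(const char *s)
{
    static const char *const keywords[] = {
        "char", "else", "if", "int", "return", "void", "while"
    };
    size_t i;

    /* An identifier is non-empty and starts with a letter. */
    if (!is_uc_letter((unsigned char)s[0]))
        return WORD_INVALID;

    /* The remaining characters are letters or digits. */
    for (i = 1; s[i] != '\0'; i++)
        if (!is_uc_letter((unsigned char)s[i]) && !isdigit((unsigned char)s[i]))
            return WORD_INVALID;

    /* Keywords are reserved and take priority over identifiers. */
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(s, keywords[i]) == 0)
            return WORD_KEYWORD;

    return WORD_IDENT;
}
```

A lexer would typically scan the longest run of letters and digits first and then pass that run to a function like classify_word, so that a word such as while1 is treated as an identifier rather than a keyword followed by a constant.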


  • Primary expressions: constants, identifiers, function calls, array indexing, and expressions within parentheses.

  • Unary expressions with the ! and - unary operators.

  • Binary expressions with the +, -, *, /, <, >, <=, >=, ==, !=, &&, and = operators.

  • Statements: expression statements, the empty statement, if statements with or without else, while statements, return statements, and compound statements (blocks), i.e., statements enclosed in { }.

    Local variable declarations are only permitted at the top-level function body block, not in nested blocks.

  • Variable declarations: base type followed by variable name, and for arrays followed by the array size (an integer constant) in square brackets.

    Multi-dimensional arrays, pointers, and structures are not included.

    Initializers in variable declarations are not included.

  • Function definitions: return type or void, function name, parameter list, and body (compound statement) in that order.

    The parameter list is either void, meaning no parameters, or a sequence of variable declarations separated by , (comma). An array parameter in a function head is written without array size, i.e., with only the brackets.

    An external (library) function can be declared by writing a function definition without body, terminated with a ; (semi-colon).

    Variable-arity functions are not included.
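
As a rough illustration of the declaration rules above, the following fragment stays within the restrictions listed: no initializers, local declarations only at the top of the function body, an array parameter written with empty brackets, and an external function declared without a body. The names total, data and count_positive are made up for this sketch.

```c
void putint(int i);   /* external (library) function: declaration without body */

int total;            /* global scalar; uC has no initializers */
int data[4];          /* array size is an integer constant */

/* The array parameter is written with only the brackets, no size. */
int count_positive(int a[], int n)
{
    int i;            /* locals are declared at the top of the body only */
    int c;

    i = 0;
    c = 0;
    while (i < n) {
        if (a[i] > 0)
            c = c + 1;
        i = i + 1;
    }
    return c;
}
```

Because every uC program is also a valid C program, a fragment like this can be checked with an ordinary C compiler before it is fed to a uC compiler.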

Program execution

  • Execution starts at the user-defined function main which takes no parameters and returns int. Execution ends when main returns.
  • The standard library is uC-specific: uC excludes variable-arity functions, which rules out printf- and scanf-like functions. To use the library, include the following declarations at the start of your uC source file:
        void putint(int i);       // prints to stdout
        void putstring(char s[]); // prints to stdout
        int getint(void);         // reads from stdin
        void getstring(char s[]); // reads from stdin


  • Array parameters to functions are passed by reference, as in full C, but the formal parameter still behaves like an array variable and not as a pointer variable. For example:
    void f(int a[], int b[])
    {
        a[3] = 27;  // legal
        a = b;      // illegal in uC, legal in full C
    }


/* This is an example uC program. */
void putint(int i);

int fac(int n)
{
    if (n < 2)
        return n;
    return n * fac(n - 1);
}

int sum(int n, int a[])
{
    int i;
    int s;

    i = 0;
    s = 0;
    while (i < n) {
        s = s + a[i];
        i = i + 1;
    }
    return s;
}

int main(void)
{
    int a[2];

    a[0] = fac(5);
    a[1] = 27;
    putint(sum(2, a)); // prints 147
    return 0;
}

Informal grammar for uC

This is an informal context-free grammar for uC:

  • The start symbol is program.
  • Keywords and special symbols are written within double-quotes.
  • /empty/ denotes the empty string.
  • intconst and ident denote classes of lexical elements.
  • Associativity and precedence for expression operators are not expressed.
  • The grammar has not been adjusted to fit any particular parsing method.
program         ::= topdec_list
topdec_list     ::= /empty/ | topdec topdec_list
topdec          ::= vardec ";"
                  | funtype ident "(" formals ")" funbody
vardec          ::= scalardec | arraydec
scalardec       ::= typename ident
arraydec        ::= typename ident "[" intconst "]"
typename        ::= "int" | "char"
funtype         ::= typename | "void"
funbody         ::= "{" locals stmts "}" | ";"
formals         ::= "void" | formal_list
formal_list     ::= formaldec | formaldec "," formal_list
formaldec       ::= scalardec | typename ident "[" "]"
locals          ::= /empty/ | vardec ";" locals
stmts           ::= /empty/ | stmt stmts
stmt            ::= expr ";"
                  | "return" expr ";" | "return" ";"
                  | "while" condition stmt
                  | "if" condition stmt else_part
                  | "{" stmts "}"
                  | ";"
else_part       ::= /empty/ | "else" stmt
condition       ::= "(" expr ")"
expr            ::= intconst
                  | ident | ident "[" expr "]"
                  | unop expr
                  | expr binop expr
                  | ident "(" actuals ")"
                  | "(" expr ")"
unop            ::= "-" | "!"
binop           ::= "+" | "-" | "*" | "/"
                  | "<" | ">" | "<=" | ">=" | "!=" | "=="
                  | "&&"
                  | "="
actuals         ::= /empty/ | expr_list
expr_list       ::= expr | expr "," expr_list

Expression operator precedence table

Prefix unary operators

14: - !

Infix operators

13L: * /

12L: + -

10L: < > <= >=

9L: == !=

5L: &&

2R: =

The numbers to the left indicate precedence; larger numbers indicate higher precedence. L indicates left-associative operators and R indicates right-associative operators. The table only contains the C operators that are included in uC.
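
One common way to apply such a table is precedence climbing. The sketch below is not taken from AyeC: it assumes the input is already split into tokens, does no error reporting, and emits postfix notation so that the grouping chosen by the parser is visible. The names binop_prec, parser, emit and to_postfix are hypothetical, as are the output words neg and not used for the prefix operators.

```c
#include <stddef.h>
#include <string.h>

/* Precedence and associativity copied from the table above.
   Returns 0 when the token is not a binary operator. */
static int binop_prec(const char *op, int *right_assoc)
{
    static const struct { const char *op; int prec; int right; } table[] = {
        {"*", 13, 0}, {"/", 13, 0}, {"+", 12, 0}, {"-", 12, 0},
        {"<", 10, 0}, {">", 10, 0}, {"<=", 10, 0}, {">=", 10, 0},
        {"==", 9, 0}, {"!=", 9, 0}, {"&&", 5, 0}, {"=", 2, 1}
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(op, table[i].op) == 0) {
            *right_assoc = table[i].right;
            return table[i].prec;
        }
    return 0;
}

struct parser {
    const char **toks;  /* pre-split tokens (an assumption of this sketch) */
    int n, pos;
    char out[256];      /* postfix output, space separated */
};

/* Append one word to the output; silently drops it if the buffer is full. */
static void emit(struct parser *p, const char *s)
{
    size_t used = strlen(p->out);

    if (used + strlen(s) + 2 > sizeof p->out)
        return;
    if (used > 0)
        strcat(p->out, " ");
    strcat(p->out, s);
}

static void parse_expr(struct parser *p, int min_prec);

/* Operand, parenthesized expression, or prefix operator (precedence 14). */
static void parse_unary(struct parser *p)
{
    const char *t;

    if (p->pos >= p->n)
        return;
    t = p->toks[p->pos++];
    if (strcmp(t, "-") == 0) {
        parse_unary(p);
        emit(p, "neg");
    } else if (strcmp(t, "!") == 0) {
        parse_unary(p);
        emit(p, "not");
    } else if (strcmp(t, "(") == 0) {
        parse_expr(p, 1);
        if (p->pos < p->n)
            p->pos++;           /* skip the closing ")" */
    } else {
        emit(p, t);
    }
}

/* Parse binary operators whose precedence is at least min_prec. */
static void parse_expr(struct parser *p, int min_prec)
{
    parse_unary(p);
    while (p->pos < p->n) {
        int right = 0;
        int prec = binop_prec(p->toks[p->pos], &right);
        const char *op;

        if (prec == 0 || prec < min_prec)
            break;
        op = p->toks[p->pos++];
        /* Left-associative: the right operand must bind tighter.
           Right-associative: the same precedence may recur on the right. */
        parse_expr(p, right ? prec : prec + 1);
        emit(p, op);
    }
}

const char *to_postfix(struct parser *p, const char **toks, int n)
{
    p->toks = toks;
    p->n = n;
    p->pos = 0;
    p->out[0] = '\0';
    parse_expr(p, 1);
    return p->out;
}
```

With the tokens a - b - c this yields a b - c -, reflecting left associativity, while a = b = c yields a b c = =, reflecting the right associativity of assignment.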