AyeC-Compiler

Compiler Project at Uppsala University

This project was made by Anton Weber and Davide Berdin for the Compiler Project course. The AyeC compiler accepts a subset of the C language called uC.

Credit for the language description below goes to Alexandra Jimborean.

The uC language

uC is a small subset of C, corresponding to a typical imperative, procedural language. The following sections describe in more detail which language elements are present.

Every uC program is also a valid C program. The syntax and semantics of uC are the same as those of full C, within the restrictions described here.

Lexical elements

  • Decimal integer constants and character literals. A character literal contains either a single printable character, or the \n escape sequence (line break). A character literal denotes an integer constant whose value is the representation code of the character.
  • Alphanumeric identifiers: non-empty sequences of letters or digits starting with a letter. An underscore is treated as a letter.
  • Keywords: char, else, if, int, return, void, and while.
  • Special symbols: !=, !, &&, (, ), *, +, , (comma), -, /, ;, <=, <, ==, =, >=, >, [, ], {, }.
  • White space characters: blank (32), newline (10), carriage return (13), form feed (12), and tab (9). The numbers are the ASCII representation codes for the characters.

  • Comments: /* followed by anything and terminated by */, and // followed by anything and terminated by the end of the line.
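
The character classes listed above can be sketched as a few C predicates of the kind a hand-written scanner might use. The function names (uc_is_space, uc_is_letter, uc_is_digit, uc_is_keyword, uc_is_identifier) are hypothetical and are not taken from the AyeC sources, and the letter ranges assume an ASCII-compatible character set.

```c
#include <string.h>

/* White space: blank, newline, carriage return, form feed, tab
   (ASCII codes 32, 10, 13, 12, 9). */
int uc_is_space(int c)
{
    return c == 32 || c == 10 || c == 13 || c == 12 || c == 9;
}

/* An underscore is treated as a letter. Assumes ASCII letter ranges. */
int uc_is_letter(int c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

int uc_is_digit(int c)
{
    return c >= '0' && c <= '9';
}

/* Returns 1 if s is one of the seven uC keywords, 0 otherwise. */
int uc_is_keyword(const char *s)
{
    static const char *const keywords[] = {
        "char", "else", "if", "int", "return", "void", "while"
    };
    size_t i;

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(s, keywords[i]) == 0)
            return 1;
    return 0;
}

/* Returns 1 if s is a non-empty sequence of letters or digits that
   starts with a letter and is not a keyword, 0 otherwise. */
int uc_is_identifier(const char *s)
{
    size_t i;

    if (!uc_is_letter((unsigned char)s[0]))
        return 0;
    for (i = 1; s[i] != '\0'; i++)
        if (!uc_is_letter((unsigned char)s[i]) &&
            !uc_is_digit((unsigned char)s[i]))
            return 0;
    return !uc_is_keyword(s);
}
```

With these definitions, "_tmp1" is accepted as an identifier, while "1abc" (starts with a digit), "while" (a keyword), and the empty string are rejected.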

Syntax

  • Primary expressions: constants, identifiers, function calls, array indexing, and expressions within parentheses.

  • Unary expressions with the ! and - unary operators.

  • Binary expressions with the +, -, *, /, <, >, <=, >=, ==, !=, &&, and = operators.

  • Statements: expression statements, the empty statement, if statements with or without else, while statements, return statements, and compound statements (blocks), i.e., statements enclosed in { }.

    Local variable declarations are only permitted at the top-level function body block, not in nested blocks.

  • Variable declarations: base type followed by variable name, and for arrays followed by the array size (an integer constant) in square brackets.

    Multi-dimensional arrays, pointers, and structures are not included.

    Initializers in variable declarations are not included.

  • Function definitions: return type or void, function name, parameter list, and body (compound statement) in that order.

    The parameter list is either void, meaning no parameters, or a sequence of variable declarations separated by , (comma). An array parameter in a function head is written without array size, i.e., with only the brackets.

    An external (library) function can be declared by writing a function definition without a body, terminated with a ; (semicolon).

    Variable-arity functions are not included.
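
As an illustration of the syntax rules above, the following fragment stays within the uC subset. It is only a sketch: the names calls, line, max_of, and reset are made up for this example, and putint is declared as an external function but not called.

```c
/* External (library) function: a definition without a body. */
void putint(int i);

/* Top-level variable declarations: no initializers, and arrays have a
   single dimension given by an integer constant. */
int calls;
char line[16];

/* An array parameter is written with empty brackets. */
int max_of(int n, int a[])
{
    /* Locals are declared at the top of the function body only. */
    int i;
    int m;

    m = a[0];
    i = 1;
    while (i < n) {
        if (a[i] > m)
            m = a[i];
        else
            ;                   /* the empty statement */
        i = i + 1;
    }
    return m;
}

/* A parameter list of void means the function takes no parameters. */
int reset(void)
{
    calls = 0;
    line[0] = '\n';             /* character literal with the \n escape */
    return calls;
}
```

Because every uC program is also a valid C program, a fragment like this can be checked with an ordinary C compiler before it is given to a uC compiler.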

Program execution

  • Execution starts at the user-defined function main which takes no parameters and returns int. Execution ends when main returns.
  • The standard library is uC-specific: uC excludes variable-arity functions, which rules out printf- and scanf-like functions. To use the library, include the following declarations at the start of your uC source file:
        void putint(int i);       // prints to stdout
        void putstring(char s[]); // prints to stdout
        int getint(void);         // reads from stdin
        void getstring(char s[]); // reads from stdin

Notes

  • Array parameters to functions are passed by reference, as in full C, but the formal parameter still behaves like an array variable and not as a pointer variable. For example:
    void f(int a[], int b[])
    {
        a[3] = 27;  // legal
        a = b;      // illegal in uC, legal in full C
    }
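
The pass-by-reference behaviour of array parameters can be seen in a small sketch like the one below. The functions fill, clear, and demo are hypothetical names chosen for this illustration; the code stays within the uC subset.

```c
/* Writes through the array parameter, so the caller's array changes. */
void fill(int n, int a[], int value)
{
    int i;

    i = 0;
    while (i < n) {
        a[i] = value;
        i = i + 1;
    }
}

/* A scalar is passed by value, so this assignment is not visible
   to the caller. */
void clear(int x)
{
    x = 0;
}

int demo(void)
{
    int v[3];
    int k;

    k = 5;
    fill(3, v, 7);      /* v becomes {7, 7, 7} */
    clear(k);           /* k is still 5 */
    return v[0] + v[1] + v[2] + k;
}
```

Here demo returns 26 (7 + 7 + 7 + 5): the writes made by fill are visible in v, while clear leaves k unchanged.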
    

Example

/* This is an example uC program. */
void putint(int i);

int fac(int n)
{
    if (n < 2)
        return n;
    return n * fac(n - 1);
}

int sum(int n, int a[])
{
    int i;
    int s;

    i = 0;
    s = 0;
    while (i < n) {
        s = s + a[i];
        i = i + 1;
    }
    return s;
}

int main(void)
{
    int a[2];

    a[0] = fac(5);
    a[1] = 27;
    putint(sum(2, a)); // prints 147
    return 0;
}

Informal grammar for uC

This is an informal context-free grammar for uC:

  • The start symbol is program.
  • Keywords and special symbols are written within double-quotes.
  • /empty/ denotes the empty string.
  • intconst and ident denote classes of lexical elements.
  • Associativity and precedence of expression operators are not expressed.
  • The grammar has not been adjusted to fit any particular parsing method.
program         ::= topdec_list
topdec_list     ::= /empty/ | topdec topdec_list
topdec          ::= vardec ";"
                  | funtype ident "(" formals ")" funbody
vardec          ::= scalardec | arraydec
scalardec       ::= typename ident
arraydec        ::= typename ident "[" intconst "]"
typename        ::= "int" | "char"
funtype         ::= typename | "void"
funbody         ::= "{" locals stmts "}" | ";"
formals         ::= "void" | formal_list
formal_list     ::= formaldec | formaldec "," formal_list
formaldec       ::= scalardec | typename ident "[" "]"
locals          ::= /empty/ | vardec ";" locals
stmts           ::= /empty/ | stmt stmts
stmt            ::= expr ";"
                  | "return" expr ";" | "return" ";"
                  | "while" condition stmt
                  | "if" condition stmt else_part
                  | "{" stmts "}"
                  | ";"
else_part       ::= /empty/ | "else" stmt
condition       ::= "(" expr ")"
expr            ::= intconst
                  | ident | ident "[" expr "]"
                  | unop expr
                  | expr binop expr
                  | ident "(" actuals ")"
                  | "(" expr ")"
unop            ::= "-" | "!"
binop           ::= "+" | "-" | "*" | "/"
                  | "<" | ">" | "<=" | ">=" | "!=" | "=="
                  | "&&"
                  | "="
actuals         ::= /empty/ | expr_list
expr_list       ::= expr | expr "," expr_list

Expression operator precedence table

Prefix unary operators

    14     - !

Infix operators

    13L    * /
    12L    + -
    10L    < > <= >=
    9L     == !=
    5L     &&
    2R     =

The numbers to the left indicate precedence; larger numbers indicate higher precedence. L indicates left-associative operators and R indicates right-associative operators. The table only contains the C operators that are included in uC.
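
One way to apply such a table is precedence climbing. The sketch below is an assumption-laden illustration, not the AyeC parser: it evaluates constant expressions directly instead of building a syntax tree, it leaves out assignment (2R) because it has no variables, it does not short-circuit &&, and it assumes well-formed input with non-zero divisors. The names pos, peek_binop, parse_operand, parse_expr, apply, and uc_eval are hypothetical.

```c
#include <string.h>

/* Cursor into the text being evaluated. */
static const char *pos;

static void skip_blanks(void)
{
    while (*pos == ' ')
        pos++;
}

/* If an infix operator starts at pos, copy it to op and return its
   precedence from the table; otherwise return 0. */
static int peek_binop(char op[3])
{
    skip_blanks();
    op[0] = pos[0];
    op[1] = '\0';
    op[2] = '\0';
    if ((pos[0] == '<' || pos[0] == '>' || pos[0] == '=' || pos[0] == '!')
        && pos[1] == '=') {
        op[1] = '=';
        return (pos[0] == '<' || pos[0] == '>') ? 10 : 9;
    }
    if (pos[0] == '&' && pos[1] == '&') {
        op[1] = '&';
        return 5;
    }
    switch (pos[0]) {
    case '*': case '/': return 13;
    case '+': case '-': return 12;
    case '<': case '>': return 10;
    default:            return 0;
    }
}

static int parse_expr(int min_prec);

/* Operand: integer constant, parenthesized expression, or a prefix
   operator (precedence 14) applied to an operand. */
static int parse_operand(void)
{
    int v;

    skip_blanks();
    if (*pos == '-') { pos++; return -parse_operand(); }
    if (*pos == '!') { pos++; return !parse_operand(); }
    if (*pos == '(') {
        pos++;
        v = parse_expr(1);
        skip_blanks();
        if (*pos == ')')
            pos++;
        return v;
    }
    v = 0;
    while (*pos >= '0' && *pos <= '9') {
        v = v * 10 + (*pos - '0');
        pos++;
    }
    return v;
}

static int apply(const char *op, int a, int b)
{
    if (strcmp(op, "*") == 0)  return a * b;
    if (strcmp(op, "/") == 0)  return a / b;    /* assumes b != 0 */
    if (strcmp(op, "+") == 0)  return a + b;
    if (strcmp(op, "-") == 0)  return a - b;
    if (strcmp(op, "<") == 0)  return a < b;
    if (strcmp(op, ">") == 0)  return a > b;
    if (strcmp(op, "<=") == 0) return a <= b;
    if (strcmp(op, ">=") == 0) return a >= b;
    if (strcmp(op, "==") == 0) return a == b;
    if (strcmp(op, "!=") == 0) return a != b;
    return a && b;                              /* "&&" */
}

/* Consume operators whose precedence is at least min_prec. All operators
   handled here are left-associative, so the right operand is parsed
   with min_prec = prec + 1. */
static int parse_expr(int min_prec)
{
    char op[3];
    int prec;
    int lhs;
    int rhs;

    lhs = parse_operand();
    while ((prec = peek_binop(op)) >= min_prec) {
        pos += strlen(op);
        rhs = parse_expr(prec + 1);
        lhs = apply(op, lhs, rhs);
    }
    return lhs;
}

int uc_eval(const char *text)
{
    pos = text;
    return parse_expr(1);
}
```

For example, uc_eval("1 + 2 * 3") groups as 1 + (2 * 3) and yields 7, and uc_eval("10 - 4 - 3") groups as (10 - 4) - 3 and yields 3. A right-associative operator such as = would instead parse its right operand with min_prec = prec.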