/oryx

The Oryx programming language

Primary LanguageCBSD Zero Clause License0BSD

                       ──────────────────────────────────
                         Oryx — Programming Made Better
                       ──────────────────────────────────

Oryx is named after the oryx animal.  This means that when referring to
Oryx the programming language in languages other than English you should
use the given language’s translation of the animal’s name (e.g. ‘Órix’ in
Portuguese or ‘Όρυξ’ in Greek) as opposed to using the English name.

Oryx is intended to be a sane programming language for serious software
development.  To be more specific Oryx aims to be the ideal language for
general-purpose application development for modern systems.  We do not
waste our time attempting to support or perform well on legacy systems or
on your dishwasher.

Oryx assumes that the programmer is competent, and allows the programmer
to do what the programmer wishes to do without getting in their way.
Oryx rejects the notion that your tools need to be actively defensive,
and assume that you are a web developer that lacks real programming
skills.

Oryx also aims to be a very simple language.  Learning most of the
languages useful features should be possible within a day of
experimentation, and language features and syntax should be as consistent
and common-sense as possible.


                             ──────────────────────
                               Build Instructions
                             ──────────────────────

Building the Oryx compiler is rather trivial.  The steps are as follows:

1.  Install the LLVM libraries and -headers.  They should be available
    through your systems package manager.  Do note that as of 17/07/2024
    the version of LLVM being utilized is 18.1.x.  The compiler may work
    with other versions, but it isn’t guaranteed.

2.  Install Gperf.  It should be available in your systems package
    manager, and if not then you can easily find instructions online to
    build from source.  Gperf 3.1 is the oldest version that is actively
    tested on.

3.  Clone the compiler repository.

    $ git clone https://github.com/Mango0x45/oryx.git

4.  Bootstrap and run the build script.

    $ cc -o make make.c
    $ ./make  # See below for more details

If you followed the above steps, you should find the compiler located in
the root directory of the git repository under the name ‘oryx’.

The build script takes a few optional parameters that might be of
interest.  They are as follows:

    -F  Force rebuild the compiler and its dependencies in vendor/.
    -f  Force rebuild the compiler but not its dependencies in vendor/.
    -r  Build a release build with optimizations enabled.
    -S  Do not build with the GCC sanitizer.  This option is not required
        if -r was specified.

The build script also accepts some subcommands.  They are as follows:

    clean       Delete all build artifacts and compiled binaries.
    distclean   Delete all build artifacts and compiled binaries, as well
                as those creates by any dependencies in vendor/.
    test        Run the tests in test/.  This subcommand should only be
                run after a regular invocation of the build script so
                that the tests get compiled.


                         ──────────────────────────────
                           Existing Language Features
                         ──────────────────────────────

1.  The following datatypes are supported.  The unsized integer types
    default to the systems word size (typically 64 bits).  The rune type
    is an alias for the i32 type and serves a purely semantic purpose.
    In the future it will be a distinct type.

        /* Integer types */
        i8, i16, i32, i64, i128,  int
        u8, u16, u32, u64, u128, uint
        rune

        /* Floating-point types */
        f16, f32, f64, f128

2.  C-style block comments.  Line comments are intentionally not
    included.

3.  Declaration of mutable variables with optional type-inference.  The
    syntax is simple and consistent regardless of if type-inference is
    used or not.  Variables are also zero-initialized unless ‘…’
    (U+2026 HORIZONTAL ELLIPSIS) or ‘...’ is given as a value.

        x: int;       /* Declare a zero-initialized integer */
        x: int = 69;  /* Declare an integer and set it to 69 */
        x:     = 69;  /* Same as above but infer the type */
        x := 69;      /* Recommended style when inferring types */
        x: int = …;   /* Declare an uninitialized integer */
        x: int = ...; /* Same as above when Unicode is not possible */

    When declaring an uninitialized variable, the recommended style is to
    use U+2026 HORIZONTAL ELLIPSIS.  If you cannot bind that codepoint to
    your keyboard, you should investigate the key-remapping faculties of
    your text editor.  For example, (Neo)Vim users may try the following:

        inoremap ... …
        " or if you don’t like the above…
        inoremap <C-.> …

4.  Declaration of constant variables with optional type-inference
    including constants of arbitrary precision.  The syntax is
    intentionally designed to be consistent with mutable variable
    declaration.

    Constants are unordered, meaning that a constant may refer to another
    constant that is declared later in the source file.

        FOO: u8 : BAR
        BAR: u8 : 69;

        REALLY_BIG :: 123'456'789'876'543'210;

        pub my_func :: () int {
            return BAR;
        }

5.  Constants of arbitrary precision (overflow is not possible), with ‘'’
    (U+0027 APOSTROPHE) as an optional digit seperator.

        REALLY_BIG :: 123'456'789'876'543'210;

6.  No implicit type conversions between types.  This includes between
    different integer types which may have the same size (i.e. int and
    int64)

        pub my_func :: () {
            x: int = 69;
            y: i64 = x;  /* Compile-time error */
        }

7.  Nested functions are supported, but not closures.  Closures will
    never be supported in the language.

        /* Recall that constants (including functions!) can be declared
           in any order.  This lets us define inner *after* it gets
           called by the assignment to ‘x’. */
        outer :: () {
            x := inner(5);

            inner :: (x: int) int {
                return x;
            }
        }

8.  No increment/decrement operators.  The following functions both
    return 42 as the return values are parsed as (+ (+ 42)) and
    (- (- 42)) respectively.

        x := 42;

        returns_42 :: () int {
            return ++x;
        }

        returns_42′ :: () int {
            return --x;
        }

9.  Assignment statements (not expressions).  Unlike in C, you cannot put
    an assignment inside of an expression.

        return_42 :: () int {
            x := 4;
            y := 2;
            x = x*10 + y;
            return x;
        }

    Due to quirks of the language grammar identifiers may be wrapped in
    (arbitrary levels of) parenthesis, however assignments are only
    permitted if the left-hand side with parenthesis removed is a lone
    identifier.  The rationale behind this is to allow in the future
    assignments to expressions that return pointers.

        ((x))          = x*10 + y;  /* legal */
        (true ? x : y) = x*10 + y;  /* illegal */

10. Static local variables allow for block-scoped global variables.  This
    is useful for having function state persist across multiple calls.

        iota :: () int {
            static x := -1;
            x = x + 1;
            return x;
        }

        pub main :: () {
            zero  := iota();
            one   := iota();
            two   := iota();
            three := iota();
        }