spl-jvm

A JVM bytecode compiler for the SPL (Simple Programming Language) designed at the THM University of Applied Sciences

SPL on the JVM

This compiler for the Simple Programming Language (SPL) translates programs into Java Virtual Machine (JVM) bytecode. SPL is used in the Compiler Construction course at the THM University of Applied Sciences as a simple – yet complete – programming language. Students are usually required to develop a compiler that translates SPL programs into machine code for the ECO32 RISC architecture. In this project, the input program is translated into JVM bytecode instead.

This project mainly serves as a learning experience for myself to better understand JVM bytecode and the JVM architecture.

Features

All of SPL's language features are supported. These are:

  • 32-bit integers and fixed-size arrays of integers (or of other arrays).
  • globally scoped types and routines.
  • a single scope for stack-allocated variables per routine.
  • pass-by-value and pass-by-reference parameters.
  • basic control structures.
  • recursion.

While most of these features can be expressed in JVM bytecode in a comparatively straightforward manner, the JVM has no native support for passing primitive values (such as integers) by reference. This problem is solved by allocating a pool of mutable references on the stack, which are passed instead of primitives where required. When an integer value is to be passed by reference, it is promoted to a reference value by packing it into an available pool reference. After the routine returns, the value is unpacked from the reference again and the local variable is updated.
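The pack, call, and unpack steps described above can be sketched in Java source form. This is only an illustration of the idea, not the code the compiler emits: the nested IntRef class is a stand-in for the generated IntRef.class, and its field name value, the procedure increment, and the helper callIncrement are all assumptions made for this example. The sketch also creates a fresh reference for each call where the compiler would reuse one from its pool.

```java
class ByReferenceSketch {
    // Hypothetical stand-in for the generated IntRef class.
    // The field name `value` is an assumption made for this sketch.
    static final class IntRef {
        int value;
    }

    // Models an SPL procedure with a single by-reference integer parameter.
    // The callee reads and writes the integer through the mutable reference.
    static void increment(IntRef n) {
        n.value = n.value + 1;
    }

    // Models the call site for a local integer variable `i`.
    static int callIncrement(int i) {
        IntRef ref = new IntRef(); // the compiler would take this from its pool
        ref.value = i;             // pack: promote the integer to a reference
        increment(ref);            // pass the reference instead of the primitive
        i = ref.value;             // unpack: update the local variable
        return i;
    }

    public static void main(String[] args) {
        System.out.println(callIncrement(41)); // prints 42
    }
}
```

Because the local variable is only updated after the call returns, the caller observes the new value once the unpack step has run.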

Usage

Invoking the compiler is as simple as:

java -jar spl-jvm-1.0.jar \
  input.spl \
  output.jar

This compiles an SPL input file called input.spl into an executable JAR file. When executed using java -jar output.jar, the translated SPL program is executed on the JVM.

By default, the static method Spl.main() is the only public member of the generated code. All other SPL procedures are translated into private static methods. If Java interop is desired, the --public flag can be passed to the compiler. All SPL procedures are then translated into public static methods, which can be called from other Java programs.

The executable JAR generated by the compiler contains three CLASS files:

  1. Spl.class – the translated source program as a standalone class.
  2. SplLib.class – which contains a Java implementation of SPL's standard library and the program entry point.
  3. IntRef.class – which is the class used to promote integer arguments for by-reference calls.

The advantage of generating an executable JAR is that the product of the compiler is a ready-to-use executable. If you only want the generated CLASS file, pass the --class flag to the compiler. The compiler then outputs a single CLASS file translated from the SPL input program. This file contains neither a program entry point, nor the SPL standard library, nor the class used for reference parameters.

Limitations

There are two limitations when using this compiler or its output in order to execute SPL programs or use them as a Java software library.

  1. While SPL has fixed-size arrays, the length of a Java array is not part of its type. This is no problem in SPL, as the compiler verifies arrays passed to procedures at compile time as part of SPL's type system. However, the expected array size is neither present in a translated procedure's signature nor validated at runtime. Be cautious when calling generated SPL procedures from Java code.
  2. None of the graphics procedures in SPL's standard library are implemented in the Java version of the library. This limitation stems from the amount of time (or absence thereof) I am willing to invest into this project. The focus of this project lies on the translation of SPL into JVM bytecode.
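To illustrate the first limitation, here is a small Java sketch of what can go wrong. It does not use real compiler output: sumOfThree is a hypothetical hand-written method that models a translated SPL procedure whose parameter is an array of three integers, and failsAtRuntime is a helper added only for this example.

```java
class ArraySizeSketch {
    // Models a translated SPL procedure that expects an array of 3 integers.
    // The Java signature only says int[]; the length 3 appears nowhere.
    static int sumOfThree(int[] values) {
        return values[0] + values[1] + values[2];
    }

    // Reports whether a call with the given array fails on an index check.
    static boolean failsAtRuntime(int[] values) {
        try {
            sumOfThree(values);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Correct size: behaves as the SPL type system would guarantee.
        System.out.println(sumOfThree(new int[] {1, 2, 3}));
        // Too short: accepted by javac, fails only when an element is accessed.
        System.out.println(failsAtRuntime(new int[] {1, 2}));
        // Too long: no failure at all, the extra element is silently ignored.
        System.out.println(failsAtRuntime(new int[] {1, 2, 3, 4}));
    }
}
```

A Java caller that wants the guarantee SPL provides would have to check the array length itself before calling into generated code.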

(C) 2022, Niklas Deworetzki