/lc3-vm

Little Computer 3 virtual machine written in go.

Primary LanguageGoApache License 2.0Apache-2.0

LC3 Virtual Machine implemented in Go.

Little Computer 3, or LC-3, is a type of computer educational programming language, an assembly language, which is a type of low-level programming language. It features a relatively simple instruction set, but can be used to write moderately complex assembly programs, and is a viable target for a C compiler. The language is less complex than x86 assembly but has many features similar to those in more complex languages. These features make it useful for beginning instruction, so it is most often used to teach fundamentals of programming and computer architecture to computer science and computer engineering students. The LC-3 was developed by Yale N. Patt at the University of Texas at Austin and Sanjay J. Patel at the University of Illinois at Urbana–Champaign. Their specification of the instruction set, the overall architecture of the LC-3, and a hardware implementation can be found in the second edition of their textbook.[1] Courses based on the LC-3 and Patt and Patel's book are offered in many computer engineering and computer science departments.

Architectural specification

The LC-3 specifies a word size of 16 bits for its registers and uses a 16-bit addressable memory with a 216-location address space. The register file contains eight registers, referred to by number as R0 through R7. All of the registers are general-purpose in that they may be freely used by any of the instructions that can write to the register file, but in some contexts (such as translating from C code to LC-3 assembly) some of the registers are used for special purposes. Instructions are 16 bits wide and have 4-bit opcodes. The instruction set defines instructions for fifteen of the sixteen possible opcodes, though some instructions have more than one mode of operation. Individual instructions' execution is regulated by a state machine implemented with a control ROM and microsequencing unit. The architecture supports the use of a keyboard and monitor to regulate input and output; this support is provided through memory mapped I/O abstractions. In simulation, these registers can be accessed directly, and the architectural specification describes their contents. Higher-level I/O support is also provided through the use of the TRAP instruction and a basic operating system. The operating system provides functions to read and echo characters from the keyboard, print individual characters to the monitor, print entire strings in both packed and unpacked forms, and halt the machine. All data in the LC-3 is assumed to be stored in a two's complement representation; there is no separate support for unsigned arithmetic. The I/O devices operate on ASCII characters. The LC-3 has no native support for floating-point numbers. The hardware implementation given in the Patt and Patel text is not pipelined or otherwise optimized, but it is certainly possible to create a fast implementation using more advanced concepts in computer architecture.

Instruction set

The LC-3 instruction set implements fifteen types of instructions, with a sixteenth opcode reserved for later use. The architecture is a load-store architecture; values in memory must be brought into the register file before they can be operated upon. Arithmetic instructions available include addition, bitwise AND, and bitwise NOT, with the first two of these able to use both registers and sign-extended immediate values as operands. These operations are sufficient to implement a number of basic arithmetic operations, including subtraction (by negating values) and bitwise left shift (by using the addition instruction to multiply values by two). The LC-3 can also implement any bitwise logical function, because NOT and AND together are logically complete. Memory accesses can be performed by computing addresses based on the current value of the program counter (PC) or a register in the register file; additionally, the LC-3 provides indirect loads and stores, which use a piece of data in memory as an address to load data from or store data to. Values in memory must be brought into the register file before they can be used as part of an arithmetic or logical operation. The LC-3 provides both conditional and unconditional control flow instructions. Conditional branches are based on the arithmetic sign (negative, zero, or positive) of the last piece of data written into the register file. Unconditional branches may move execution to a location given by a register value or a PC-relative offset. Three instructions (JSR, JSRR, and TRAP) support the notion of subroutine calls by storing the address of the code calling the subroutine into a register before changing the value of the program counter. The LC-3 does not support the direct arithmetic comparison of two values. Computing the difference of two register values requires finding the negated equivalence of one register value and then, adding the negated number to the positive value in the second register. The difference of the two registers would be stored in one of the 8 registers available for the user. Because there is no room left in the LC-3 instruction set for dedicated port-mapped I/O instructions, hardware implementations typically reserve part of the memory map for memory-mapped I/O.[2]

Programming language support

While it has not been implemented on a physical chip, the LC-3 can be used in simulation on Linux/Unix, Mac OS X and Windows environments. The simulation tools include an assembler with support for computerized offset computation with labels and the insertion of constants, strings, and blank memory locations into a block of assembly code. There is also a convention for using the C language on the LC-3. A sample assembler, compiler, and simulator are provided by McGraw-Hill.[1]

C and the LC-3

The calling convention for C functions on the LC-3 is similar to that implemented by other systems, such as the x86 ISA. When running C programs, the architecture maintains a memory model that includes space for a call stack and dynamic memory allocation. In this model, four of the processor's eight general purpose registers take on special roles: R4 is used as a base register for loading and storing global data, R5 is used to point to the current function's area on the call stack, and R6 is used as a stack pointer. Additionally, R7 is usually reserved for storage of return addresses from function calls; the JSR, JSRR, and TRAP instructions automatically store return addresses in this register during their execution. When a C function is called under this model, the function's parameters are pushed onto the stack right to left. Space is then made on the stack for the return value of the function being called, the address of the instruction in the caller to return to, and the caller's value of R5. Local variables in the function being called are pushed onto the stack in the order that they are declared. Note that the LC-3 does not have native PUSH and POP instructions, so addition and memory storage instructions must be used separately to implement the stack.

to be continue...