Datbac thesis

This is my thesis, it uses RMarkdown and a very customized pandoc template.

Abstract

There are many approaches to teaching operating systems. Some courses have students implement a system on real hardware; others may settle for simulating components such as schedulers or virtual memory management. Both approaches have advantages and disadvantages.

The first approach has the advantage that it can accurately reflect the challenge of implementing a real operating system, because it is a real operating system. The disadvantage is that students are required to deal with low-level complexities of memory management, not only for processes, but also for the operating system itself.

The latter approach can avoid this by using a higher level of abstraction and a higher-level language. This results in a tradeoff however; reduced complexity in one place (managing kernel resources) regrettably leads to reduced complexity in other interactions.

A hybrid approach is to emulate the execution of processes and implement operating system components in a language and environment that is user-friendly. With this approach, students can build components with access to all system calls and the libraries they may want, while also exploring interactions between components and processes.

Using a user-mode emulator aside a high-level language to teach operating systems is not a novel concept, but popular implementations use only a single core. This is unfortunate, as multicore operating systems face other challenges than their single core counterparts.

State synchronisation of virtualised resources is an example of such a challenge. More specifically, synchronisation of Translation Lookaside Buffers (TLBs) is a challenge that simply does not arise on single core systems.

This project delivers Gotos; a framework that provides the necessary complexity that operating systems naturally deal with, but without excluding the intended audience: third-semester students at UiS. A multicore RISC-V emulator is included to provide the required complexity of managing actual processes with resources. However, the emulator only executes in user/application mode, delegating traps to the surrounding system written in a high-level, garbage-collected programming language, executing on real hardware.

Implementing OS components in a high-level language simplifies kernel resource management, allowing students to focus on more fundamental concepts of processor virtualisation, limited direct execution, memory virtualisation, and protection. The emulator enables students to explore how their implementations affect, and are affected by real workloads.

We contribute empirical proof that a multicore emulated architecture is also possible; one that can provide the added complexity of writing a multicore operating system while staying simple enough that undergraduate students can realistically understand the fundamentals within a semester or two.

Compiling this project

You will need a LaTeX environment on your system. If you’re on Linux, the easiest way to accomplish this is to install texlive-full or whatever variation is available for your system. On my system (Fedora 35), the package is named texlive-schema-full and I install it by running sudo dnf install texlive-schema-full.

You will likely also need pandoc. On my system, I run sudo dnf install pandoc.

If you happen to be using an older version of pandoc, you may also need pandoc-citeproc.

You will need to have R installed on your system as well as a few packages:

In R: sudo R
install.packages("rmarkdown")
install.packages("bookdown")
install.packages("kableExtra")
install.packages("tufte")

It is likely that the final line will give you some errors.

These additional commands should install the missing dependencies:

RUN dnf install libcurl
RUN dnf install libcurl-devel
RUN dnf install libxml2-devel
RUN dnf install openssl
RUN dnf install openssl-devel

You should now be able to run make from the source directory. The resultant pdf-file should be generated as output/_main.pdf. This does not include the glossary however. If you want the glossaries, run make pdf-final. This performs two passes over the document to include the glossary list.

Using Docker (currently broken)

Note
For preservation only, not recommended for regular use.

The docker solution is dirty, but in return, it should work on any system. It uses a lot of space and the image has to be rebuilt for each time the source is changed. Thanks to caching, only the last 2 commands in the dockerfile consume much time after the initial image is built.

To minimize time spent on building, it may help to have a cached version of the image where all the packages are installed:

docker build . --tag base-doc # warning, this takes a very long time

As long as this base-doc image is kept, future runs of docker build should take much less time.

docker build . --tag tmp --build-arg GITHUB_TOKEN=${GITHUB_TOKEN} # warning, this takes a very long time on first run.
docker run tmp > output.pdf # Should clone the latest version of the repo and make it