/hello-os

Opening my GitHub account by writing a simple operating system in Assembly Language.

Primary LanguageAssemblyMIT LicenseMIT

Hello Operating System - Part I

Creating a 16-Bit Real Mode Application Using Assembly Language

Introduction

Ever since Brian Kernighan published Programming in C - A Tutorial in 1974, the "Hello, World!" program has become the Lorem Ipsum of the computer world. Therefore, it would make sense to open this repository by saying, "Hello", as well.

I've written operating systems in Linux using tools such as dd, qemu, and bochs, but I haven't found much literature for writing and running an OS in Windows. So, using Nick Blundell's wonderful paper, Writing a Simple Operating System - from Scratch, as a template, we'll write an OS, using Windows tools, that says, "Hello, World!" upon booting up. Let's get started!

NOTE - In order to create this operating system, you must have a basic knowledge of assembly language. If not, I suggest searching the Internet for "assembly language tutorial" or purchasing an introductory text from Amazon. Unfortunately, my favorite book, Peter Norton's Assembly Language Book for the IBM PC, is out of print.

Step 1 - Understanding the Boot Process

All x86-based computers start in Real Address Mode, which allows programs to directly access the computer's addressable memory (up to 1 MB) and input/output (I/O) ports. When you press the computer's power button, the Power Supply Unit (PSU) wakes up the computer's built-in Basic I/O System (BIOS). The BIOS uses Real Mode to test the computer's memory and peripherals devices and then looks for a bootloader. The purpose of a bootloader is to collect information about the computer and pass it on to the operating system, which will take control of the computer from the BIOS. The operating system, which runs in Protected Mode, "manages" calls between applications and hardware, preventing errors that may occur when applications directly access memory or hardware.

Step 2 - Getting the Tools

Real Mode uses a 16-bit data bus, which means it uses 16-bit registers, which means we will have to write our program in 16-bit assembly language, as opposed to 32- or 64-bit. While many assemblers are capable of compiling 16-bit code, we are going to use Tatham and Co.'s Netwide Assembler (NASM) for our project. We will need Qbix and Co.'s DOSBox as well. Please click on the hyperlinks below to download and install these programs now:

Step 3 - Writing the Code

Open the Run dialog by pressing the Windows Key ⊞ and R simultaneously. Type cmd in the box that appears and hit Enter.

Run Dialog Box

Create a folder named hello_os on your local disk. We will store our code in this directory, and it will also act as a local drive for DOSBox later on.

Create the hello_os Directory

Type in notepad hello.asm and hit Enter. If you are prompted to create a new file, click Yes:

Create New File Dialog

Enter in the text below in the Notepad window:

BITS 16                   ;Let NASM know to use 16 bit mode
org 0x100                 ;.com files always start 256 bytes into the segment
mov si,msg                ;Move the address of msg into the SI register
mov ah,0x0e               ;Teletype function code for INT 10
startloop:                ;Loop start point label
lodsb                     ;Load the char at the SI address into AL and go to
                          ;the next char
cmp al,0x00               ;Compare AL to 0
je endloop                ;Jump to endloop if AL equals 0
int 0x10                  ;Call INT 10 to print the char in AL to the screen
jmp startloop             ;Jump back to the loop start point
endloop:                  ;End of the loop label
ret                       ;Quit the program
msg: db 'hello, world',0  ;Bytes to print

Notice that we are using INT 10H and a loop to print one character at a time to the screen, instead of using INT 21 or INT 80 to print the whole "hello, world" string. The reason is that INT 21H is a call to the MS-DOS API and INT 80H is a call to the UNIX API, neither of which are available to the BIOS, unlike INT 10H.

Save the and close Notepad. Now, input in the following command:

c:\nasm\nasm.exe -f bin -o hello.com hello.asm

NASM will compile and link our code into an executable COM file. Once NASM is complete, you should see the following list of files if you input dir:

NASM Results

Step 4 - Running the Program

Input hello.com at the prompt and you should get an error, similar to the one below:

Unsupported 16-Bit Application Error

This is because modern Windows runs in 32-bit Protected Mode or greater. Click OK and exit the command prompt by typing in exit. Start DOSBox and input mount c c:\hello_os at the Z prompt and hit Enter:

Mounting the hello_os Directory to a Drive in DOSBox

Input c: to get to the hello_os directory, and then input dir to see the files and their sizes. Now, when you input hello, you will see our greeting!

Running the Application in DOSBox

Notice the size of hello.com. It is only 28 bytes; smaller than hello.asm. That is because NASM removed all the comments and left only the machine code:

BE 0F 01 B4 0E AC 3C 00 74 04 CD 10 EB F7 C3 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64 00

Which means:

0000:0100  BE0F01                      MOV    SI,010F
0000:0103  B40E                        MOV    AH,0E
0000:0105  AC                          LODSB
0000:0106  3C00                        CMP    AL,00
0000:0108  7404                        JE     E
0000:010A  CD10                        INT    10
0000:010C  EBF7                        JMP    5
0000:010E  C3                          RET
0000:010F  68656C6C6F2C20776F726C6400  hello,world

For those of you new to assembly language programming, note the commands at addresses 0108 and 010C. 0x74 is the machine code for Jump-If-Equal (JE). If the following hexadecimal number is less than 0x80, the processor jumps forwards. If the number is greater than or equal to 0x80, the processor jumps backwards. Since 0x04 is less than 0x80, the processor will jump forward 4 bytes FROM THAT POINT, not from 0108, i.e., address 0108 + 2 bytes for JE = Jump 4 bytes from 010A = address 010E.

At 010C, EB is the machine code for Jump. Since 0xF7 is greater than 0x80, the jump will be backwards. However, if 0x00 is 0, then 0xFF is -1, so F7 is -9. Therefore, the processor will jump 9 bytes backwards FROM THAT POINT, not 010C, i.e., address 010C + 2 bytes for JMP = Jump backwards 9 bytes from 010E = address 0105. For more information, check out Daniel Sedory's great explanation of this process.

Input exit to leave DOSBox.

Yay! You have created and executed a 16-bit program using Windows tools! Our next step will be to use this code as a basis for our bootloader, but we'll save that for Hello, World - Part II