Bare metal arm ARMv7 and aarch64 ARMv8 UART minimal examples that run on QEMU or gem5. Uses the C standard library via Newlib built from Crosstool-NG. GDB setup just works. Tested on Ubuntu 18.04.
Clone, build and run on QEMU:
git clone https://github.com/************/newlib-examples cd newlib-examples git submodule update --recursive ./build-ctng && ./build-qemu && ./build-samples && ./run
Outcome: QEMU starts, go to the terminal and type the keys:
-
'a'
-
'b'
and watch QEMU reply as you type with:
97: a new alloc of 1 bytes at address 0x1A2E0 Heap end = 0x1B000 98: b new alloc of 2 bytes at address 0x1A2E0 Heap end = 0x1B000
To exit enter:
Ctrl-A X
This example illustrates the following Newlib syscall implementations:
-
read
to get user UART input -
write
to give UART output -
sbrk
formalloc
Use ARMv8 instead of ARMv7:
./build-ctng -aA && ./build-qemu -aA && ./build-samples -aA && ./run -aA
Shell 1:
./run -d
Shell 2:
./rungdb
Now QEMU is at the first instruction of startup_arm.S, and you can step debug as usual.
The address is 0x40000000
, which is what we set the ENTRY
to in link.ld. If you change that, it will still work: QEMU just parses the ELF entry field, places the code there in memory, and sets PC to it.
Run on gem5
instead of QEMU:
./build-gem5 && ./build-samples -g && ./run -g
On another shell:
./gem5-shell
The rebuild of the examples is necessary because the UART and entry addresses are different for the available gem5 machines, and we do not have magic memory loaded DTBs to take care of that for us.
Trace executed instructions:
./run -T Exec,-ExecSymbol -g less out/arm/m5out/trace.txt
GDB step debug:
./run -g ./rungdb -g
The default machine is VExpress_GEM5_V1
, but that can be changed with:
./build-samples -g -m RealViewPBX ./build-samples -g -m VExpress_GEM5_V1 ./run -g -m RealViewPBX ./run -g -m VExpress_GEM5_V1
A separate sample rebuild is required for each machine type.
TODO blows up with:
build/ARM/cpu/simple/atomic.cc:495: virtual Fault AtomicSimpleCPU::writeMem(uint8_t*, unsigned int, Addr, Request::Flags, uint64_t*): Assertion `!pkt.isError()' failed.
The trace contains just:
0: system.cpu T0 : 0x10000 : adcle r0, r0, #1048576 : IntAlu : Predicated False 500: system.cpu T0 : 0x10004 : ldrle r1, [r8, #-65] : MemRead : Predicated False 1000: system.cpu T0 : 0x10008 : strle r3, [r3, #-4063] : MemWrite : Predicated False
which is at the expected address, but not the first expected instruction from startup_aarch64, which must mean that the image is not being loaded properly into memory.
TODO: our example is printing newlines without automatic carriage return \r
as in:
enter a character got: a
We use m5term
by default, and if we try telnet
instead:
telnet localhost 3456
it does add the carriage returns automatically.
TODO: any advantage over QEMU? I doubt it, mostly using it as as toy for now:
Without runnint ./run
, do directly:
./rungdb -s
Then inside GDB:
load starti
and now you can debug normally.
Enabled with the crosstool-ng configuration:
CT_GDB_CROSS_SIM=y
which by grepping crosstool-ng we can see does on GDB:
./configure --enable-sim
Those are not set by default on gdb-multiarch
in Ubuntu 16.04.
Bibliography:
Since I had this compiled, I also decided to try it out on userland.
I was also able to run a freestanding Linux userland example on it: https://github.com/************/arm-assembly-cheat/blob/cd232dcaf32c0ba6399b407e0b143d19b6ec15f4/v7/linux/hello.S
It just ignores the swi
however, and does not forward syscalls to the host like QEMU does.
Then I tried a glibc example: https://github.com/************/arm-assembly-cheat/blob/cd232dcaf32c0ba6399b407e0b143d19b6ec15f4/v7/mov.S
First it wouldn’t break, so I added -static
to the Makefile
, and then it started failing with:
Unhandled v6 thumb insn
Doing:
help architecture
shows ARM version up to armv6
, so maybe armv6
is not implemented?
If you are really lazy, and don’t want to wait 10 minutes for the toolchain or QEMU to build, you can also do just:
sudo apt-get install gcc-arm-none-eabi qemu-system-arm ./build-samples -p arm-none-eabi- ./run -h
Shame on you.
gem5:
./build-gem5 -e debug -t debug ./run -e debug -D -g -t debug
Usually, when you have to explain something, it is already not cool, but here goes in any case.
This allows you to run C programs without an operating system, directly on bare metal, and use a subset of the C standard library.
This allows you to run possibly unmodified C programs directly on bare metal.
Furthermore, we build a completely pristine GCC from source via crosstool-ng, therefore dispensing any distro provided blobs.
Contains crosstool-ng defconfigs. To generate those, do:
# Generates the base config. ./build-ctng cd crosstool-ng ./ct-ng menuconfig ./ct-ng savedefconfig cp defconfig ../../ctng_defconfig/<yourname>
It is nice when thing just work.
But you can also learn a thing or two from how I actually made them work in the first place.
Enter the QEMU console:
Ctrl-X C
Then do:
info mtree
And look for pl011
:
0000000009000000-0000000009000fff (prio 0, i/o): pl011
On gem5, it is easy to find it on the source. We are using the machine RealView_PBX
, and a quick grep leads us to: https://github.com/gem5/gem5/blob/a27ce59a39ec8fa20a3c4e9fa53e9b3db1199e91/src/dev/arm/RealView.py#L615
class RealViewPBX(RealView): uart = Pl011(pio_addr=0x10009000, int_num=44)
Inside startup_aarch64.S there is a chunk of code called "NEON setup".
Without that, the printf
:
printf("got: %c\n", c);
compiled to a:
str q0, [sp, #80]
which uses NEON registers, and goes into an exception loop.
It was a bit confusing because there was a previous printf
:
printf("enter a character\n");
which did not blow up because GCC compiles it into puts
directly since it has no arguments, and that does not generate NEON instructions.
The last instructions ran was found with:
while(1) stepi end
or by hacking the QEMU CLI to contain:
-D log.log -d in_asm
I could not find any previous NEON instruction executed so this led me to suspect that some NEON initialization was required:
-
http://infocenter.arm.com/help/topic/com.arm.doc.dai0527a/DAI0527A_baremetal_boot_code_for_ARMv8_A_processors.pdf "Bare-metal Boot Code for ARMv8-A Processors"
-
https://community.arm.com/processors/f/discussions/5409/how-to-enable-neon-in-cortex-a8
-
https://stackoverflow.com/questions/19231197/enable-neon-on-arm-cortex-a-series
We then tried to copy the code from the "Bare-metal Boot Code for ARMv8-A Processors" document:
// Disable trapping of accessing in EL3 and EL2. MSR CPTR_EL3, XZR MSR CPTR_EL3, XZR // Disable access trapping in EL1 and EL0. MOV X1, #(0x3 << 20) // FPEN disables trapping to EL1. MSR CPACR_EL1, X1 ISB
but it entered an exception loop at MSR CPTR_EL3, XZR
.
We then found out that QEMU starts in EL1, and so we kept just the EL1 part, and it worked. Related: