Chapter 3: Bare-bones scaffolding

kmain.cc

The easiest piece of this component is our basic kernel code. At first, all we need it to do is spin forever. This will go in src/kmain.cc.

We need to make the kmain name visible to the bootloader; C++ name mangling means that otherwise, kmain will be linked as something like _Z5kmainv. Rather than trying to figure out what the mangled name will be and making sure that name is known to the bootloader, we can just tell the compiler to turn off mangling for this function by giving it C linkage with extern "C".

extern "C" {
    void kmain();
}

void
kmain()
{
        while(true) ;
}

boot.S

The bootloader starts at the beginning of the text segment.

.section ".text.boot"

.globl _start

_start:

To make things easier, we’re going to tell the last three cores to halt. To do that, we need to read the multiprocessor affinity register (mpidr_el1) to figure out which processor we are. There are only four processors, so we AND with 3 to clear out any other bits. Then, if the processor isn’t #0, jump on over to the halt label.

This is discussed on page 4-92 of [CA72TRM].

mrs        x1,     mpidr_el1
and        x1,     x1, #3
cmp        x1,     #0
bne        halt
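As a quick sanity check of the masking logic, here is a small host-side C++ sketch. The helper names core_id and should_halt are mine and not part of the kernel, and the sample register values are assumptions for illustration (bit 31 of MPIDR_EL1 is reserved and reads as one, so real values have high bits set that the mask throws away).

```cpp
#include <cstdint>

// Hypothetical helper mirroring `and x1, x1, #3`: keep only the low two
// bits of an MPIDR_EL1 value, which is enough to tell four cores apart.
constexpr std::uint64_t core_id(std::uint64_t mpidr)
{
    return mpidr & 3u;
}

// Mirrors `cmp x1, #0` followed by `bne halt`: every core except #0 parks.
constexpr bool should_halt(std::uint64_t mpidr)
{
    return core_id(mpidr) != 0;
}

// Assumed sample values: reserved bit 31 set, core number in the low bits.
static_assert(core_id(0x80000000u) == 0, "core 0 keeps booting");
static_assert(!should_halt(0x80000000u), "core 0 is not parked");
static_assert(should_halt(0x80000003u), "core 3 is parked");
```

The mask only works because the core number fits in two bits; a part with more than four cores would need a wider mask.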

Now, set up our stack at 0x80000, which is the address where the kernel will be loaded in memory. The stack grows downward, so it occupies the memory just below the kernel. The load address is set by the Raspberry Pi bootloader.

mov        sp,     #0x80000

The BSS segment (.bss) holds any statically allocated variables that have been declared, but not initialized. So, we should initialize them to zero. We do that by starting at the beginning of the BSS section, and continuing until we’ve iterated __bss_size times.

        ldr        x1,     =__bss_start
        ldr        w2,     =__bss_size
zero_bss:

str is the mnemonic for Store Register. We store the contents of the xzr [1] register at the address pointed to by x1, which we initialized with the start of the BSS section. The post-index form then adds 8 bytes (64 bits) to x1, and we subtract one from the remaining count in w2. If this is not zero, we keep going. If it is, we’ll branch with link (bl) to the label kmain. The bl instruction is an unconditional jump that also saves the return address in the link register (x30), so it’s the usual way to call a routine. The expectation here is that we never return; if kmain ever did, execution would fall through into the halt block. Note that the count is decremented before it’s tested, so this loop assumes __bss_size is at least one.

str        xzr,    [x1],   #8
sub        w2,     w2,     #1
cbnz       w2,     zero_bss

bl kmain
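To convince myself of what the loop does, here is a host-side model of it in C++. This is only a sketch: zero_words is a hypothetical name, the pointer stands in for x1 and the count for w2, and the early return for a zero count is something I’ve added here that the assembly above does not have.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side model of the zero_bss loop.
// `p` plays the role of x1, `words` the role of w2 (__bss_size).
void zero_words(std::uint64_t *p, std::uint32_t words)
{
    assert(p != nullptr);
    if (words == 0) {
        return;           // guard added in this sketch; see the note below
    }
    do {
        *p++ = 0;         // str xzr, [x1], #8  (store, then advance 8 bytes)
        --words;          // sub w2, w2, #1
    } while (words != 0); // cbnz w2, zero_bss
}
```

Without the guard, a count of zero would wrap around to 0xFFFFFFFF on the first decrement and the loop would keep storing zeros for a very long time, which is why a cbz check before the loop is worth considering in the assembly too.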

The halt block is an endless loop that waits for events (wfe).

halt:
        wfe
        b halt

The linker script

This is honestly taken almost directly from the Raspberry Pi forums. I’ll take a stab at explaining it, and hope that this forces me to understand it. This script has three parts to it: a pointer to the entry point (i.e. where in boot.S it should begin execution), a block describing what memory is available and how much of it there is, and then a block describing the sections of memory.

We start by setting the entrypoint to the _start symbol defined previously.

ENTRY(_start)

The memory region definition follows. Quoting from the documentation [2]:

The syntax for 'MEMORY' is:
 MEMORY
   {
     NAME [(ATTR)] : ORIGIN = ORIGIN, LENGTH = LEN
     ...
   }

The NAME is a name used in the linker script to refer to the region.
The region name has no meaning outside of the linker script.

I don’t know if the memory section is strictly necessary here; the reference in the SECTIONS block could be replaced with a hardcoded number. The Raspberry Pi bootloader firmware (in the EEPROM) loads the kernel at memory location 0x80000, so that’ll get marked as the load address. I have 8 GB of RAM in the Pi 4, but for the sake of starting the kernel, I’ll mark the lowest standard amount of memory, which I think is 1 GB. We also shouldn’t need to write to this memory, so it’s marked as read-execute only.

MEMORY
{
        LOAD (rx)  : ORIGIN = 0x80000, LENGTH = 1024M
}
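To make the region concrete, here is a small C++ sketch of the address range it describes. The Region type and the contains helper are hypothetical names of mine; the origin is the one used above, and the length is 1 GB written as 1 << 30.

```cpp
#include <cstdint>

// Hypothetical model of a linker MEMORY region: an origin plus a length.
struct Region {
    std::uint64_t origin;
    std::uint64_t length;
};

// True when addr falls inside [origin, origin + length).
constexpr bool contains(Region r, std::uint64_t addr)
{
    return addr >= r.origin && addr - r.origin < r.length;
}

// The LOAD region from the script: starts at 0x80000 and spans 1 GB.
constexpr Region kLoad{0x80000, 1ull << 30};

static_assert(contains(kLoad, 0x80000), "the first kernel byte is inside");
static_assert(!contains(kLoad, 0x7fff8), "the stack below 0x80000 is outside");
static_assert(!contains(kLoad, 0x80000 + (1ull << 30)), "the end is exclusive");
```

One thing this makes visible is that the stack, which grows down from 0x80000, sits outside the region the linker knows about.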

Now we need to define our memory sections; we previously talked about one of them (BSS), but there are some others we’ll need to set up. The guiding principles here are to put symbols in the same region if they need to be initialized or if there’s a specific reason they need to be grouped together.

BSS is one such grouping: static variables that need to be zeroed at startup. Another grouping is static variables that have been initialized and must be loaded from the image. On microcontrollers (e.g. Cortex-M series systems), that might be from flash, and you would also pay attention to making sure constant data and code live in read-only memory. That matters when you don’t have a lot of RAM, but for kOS, I’m not going to worry about it; here, everything is loaded from the SD card image into SDRAM. We do have to mark certain areas for heaps. The common sections I’ve seen are

  • .text for code,

  • .bss for uninitialized data,

  • .stack for the stack, and

  • .data for initialized data.

It’s worth noting that there’s a spec; appendix 1 covers the reserved names.

SECTIONS
{

The current point in memory, aka the first block of memory, is set to the start of the LOAD memory region from our previous definitions. Every section that follows continues from here. The KEEP directive ensures that the .text.boot section is never discarded by the linker, even when unused sections are being garbage collected; listing it first is what puts it at the beginning of .text. Since our bootloader (boot.S) starts at .text.boot, we want that to be where our memory starts. The .gnu.linkonce patterns match sections that may appear in several object files but should be linked in only once.

. = ORIGIN(LOAD);
.text :
{
        KEEP(*(.text.boot))
        *(.text .text.* .gnu.linkonce.t*)
}

.rodata :
{
        *(.rodata .rodata.* .gnu.linkonce.r*)
}

PROVIDE(_data = .);

.data :
{
        *(.data
        .data.*
        .gnu.linkonce.d*)
}

The bss section is reserved, but there’s nothing to load from the image (because we are going to initialize it to zero). This section is accordingly marked as NOLOAD. It needs to be aligned to 16 bytes as per the spec.

        .bss (NOLOAD) :
        {
                . = ALIGN(16);
                __bss_start = .;
                *(.bss .bss.*)
                *(COMMON)
                __bss_end = .;
        }
        _end = .;

        /DISCARD/ : { *(.comment) *(.gnu*) *(.note*) *(.eh_frame*) }
}
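The ALIGN(16) in the bss block rounds the location counter up to the next 16-byte boundary. A minimal C++ sketch of that rounding, assuming a power-of-two alignment (align_up is a hypothetical helper name, and the addresses are made up for the example):

```cpp
#include <cstdint>

// Round addr up to the next multiple of alignment.
// Assumes alignment is a power of two, as it is for ALIGN(16).
constexpr std::uint64_t align_up(std::uint64_t addr, std::uint64_t alignment)
{
    return (addr + alignment - 1) & ~(alignment - 1);
}

// Made-up example: the previous section ended at 0x80044.
static_assert(align_up(0x80044, 16) == 0x80050, "rounded up to a boundary");
// An address that is already aligned stays where it is.
static_assert(align_up(0x80050, 16) == 0x80050, "already aligned");
```

Any gap this opens up between the previous section and __bss_start is just padding.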

Finally, we define a __bss_size that we’ll use when initializing the BSS.

__bss_size = (__bss_end - __bss_start)>>3;
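The shift by 3 divides the byte length by 8, so __bss_size counts 64-bit words rather than bytes, matching the 8-byte stores in boot.S. Here is a C++ sketch of that arithmetic with made-up addresses; bss_words and bss_words_rounded_up are hypothetical names, and the rounded-up variant is an alternative of mine and not what the script above does.

```cpp
#include <cstdint>

// Same arithmetic as the linker script: byte length divided by 8, rounded down.
constexpr std::uint64_t bss_words(std::uint64_t start, std::uint64_t end)
{
    return (end - start) >> 3;
}

// Alternative (an assumption, not in the script): round up so that a
// trailing partial word is still counted.
constexpr std::uint64_t bss_words_rounded_up(std::uint64_t start, std::uint64_t end)
{
    return (end - start + 7) >> 3;
}

// 64 bytes of BSS is exactly 8 words.
static_assert(bss_words(0x1000, 0x1040) == 8, "64 bytes is 8 words");
// 68 bytes is still 8 words, so the last 4 bytes would not be zeroed.
static_assert(bss_words(0x1000, 0x1044) == 8, "trailing bytes are dropped");
static_assert(bss_words_rounded_up(0x1000, 0x1044) == 9, "rounded up");
```

Rounding up means the zeroing loop can write up to 7 bytes past __bss_end, which is only safe if nothing else lives there; padding the end of the section to 8 bytes would be another way around it.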

After building kospi64.elf, we can use objdump to view the sections - I’ve shortened the program header addresses to make them fit, but they are 64-bit addresses.

kyle@midgard:~/src/kospi64$ make
mkdir build
aarch64-none-elf-g++ -o build/boot.o -O2 -Wall -Werror -ffreestanding
>-march=armv8-a+crc -nostartfiles -nostdinc -nostdlib -Wno-unused-
>command-line-argument -Iinc -c -I src src/boot.S
aarch64-none-elf-g++ -o build/main.o -c -std=c++17 -O2 -Wall -Werror
>-ffreestanding -march=armv8-a+crc -nostartfiles -nostdinc -nostdlib
>-Wno-unused-command-line-argument -Iinc -I src src/main.cc
aarch64-none-elf-ld -o build/kospi64.elf -nostdlib --no-undefined
>build/boot.o build/main.o -Map build/kernel8.map -T pi4.ld
aarch64-none-elf-objcopy build/kospi64.elf -O binary build/kernel8.img
kyle@midgard:~/src/kospi64$ aarch64-none-elf-objdump -x build/kospi64.elf

build/kospi64.elf:     file format elf64-littleaarch64
build/kospi64.elf
architecture: aarch64, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0000000000008000

Program Header:
    LOAD off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**16
         filesz 0x00008044 memsz 0x00008044 flags r-x
    LOAD off    0x00000000 vaddr 0x1f000000 paddr 0x1f000000 align 2**16
         filesz 0x00000000 memsz 0x00000000 flags rw-
private flags = 0x0:

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000044  0000000000008000  0000000000008000  00008000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .bss          00000000  000000001f000000  000000001f000000  00010000  2**0
                  ALLOC
SYMBOL TABLE:
0000000000008000 l    d  .text      0000000000000000 .text
000000001f000000 l    d  .bss       0000000000000000 .bss
0000000000000000 l    df *ABS*      0000000000000000 boot.o
000000000000802c l       .text      0000000000000000 halt
000000000000801c l       .text      0000000000000000 zero_bss
0000000000000000 l    df *ABS*      0000000000000000 main.cc
0000000000000000 g       *ABS*      0000000000000000 __bss_size
000000001f000000 g       .bss       0000000000000000 __bss_end
0000000000008000 g       .text      0000000000000000 _start
000000001f000000 g       .bss       0000000000000000 __bss_start
000000001f000000 g       .bss       0000000000000000 _end
0000000000008040 g     F .text      0000000000000004 kmain


kyle@midgard:~/src/kospi64$

The future

This will be enough to bootstrap a simple kernel. Later on, we’ll want to load a kernel from external media, whether the SD card, an SSD, or NVMe drive. With that in mind, the definitions above should be enough to get this basic functionality working (in particular, the memory region is overkill).

At this point we have a kernel that boots, but no way to tell that it’s booting (except maybe with a JTAG probe), so the next step should be getting a serial console working.

Footnotes