Version 1.0.0 · AArch64 (ARMv8-A) · Pure assembly (~1.5k SLOC) · MIT
A small bare-metal microkernel for AArch64 written entirely in GNU as
syntax. Targets QEMU virt + Cortex-A72 today; designed to port to
Raspberry Pi 4 with a small platform overlay.
HEPK is suitable for education, research, hobby OS development, and bare-metal experimentation on AArch64 development boards. It is not certified, formally verified, or independently audited.
Do not deploy HEPK as-shipped in any safety-critical or life-critical system — including but not limited to missile guidance, UAV flight control, automotive ADAS, medical devices, or industrial machinery. Such systems require certification under regimes like DO-178C, ISO 26262, IEC 61508, or equivalent, which HEPK has not undergone and is not designed for at this stage.
The MIT license disclaims all warranty. Please read it before using this code anywhere it could hurt someone.
| Subsystem | Status | Notes |
|---|---|---|
| AArch64 boot path (EL3 → EL2 → EL1) | ✅ | Single-core; other cores WFE |
| PL011 UART driver | ✅ | QEMU virt MMIO base hardcoded |
| GICv2 (Distributor + CPU interface) | ✅ | Single-core view |
| MMU (1 GiB block, identity map) | ✅ | I-cache + D-cache enabled |
| D-cache invalidate-by-set/way | ✅ | DC ISW iteration per CLIDR/CCSIDR |
| EL1 vector table + panic dump | ✅ | ESR/ELR/FAR printed for sync faults |
| Generic Timer (CNTPCT, CNTFRQ) | ✅ | Monotonic-ns API + busy-wait |
| 1 ms IRQ-driven tick | ✅ | CNTP_NS via PPI 14 (INTID 30) |
| Round-robin preemptive scheduler | ✅ | Up to 8 tasks, integer context only |
| NEON math (vec3, quat, mat4) | ✅ | Branch-free, FMA-based, fixed latency |
| FP/SIMD context switch | ❌ | v1.1 (tasks must be integer-only today) |
| Cooperative yield (hepk_yield) | ❌ | v1.1 |
| W^X enforcement at the page-table level | ❌ | v1.1 (PT_LOAD intent already in linker) |
| RPi4 port (BCM2711 UART/GIC bases) | ❌ | v1.1 |
| SMP | ❌ | not on the near-term roadmap |
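
The "PPI 14 (INTID 30)" note in the tick row follows from GIC numbering: INTIDs 0-15 are SGIs and 16-31 are PPIs, so PPI n maps to INTID 16 + n. The following is a minimal host-side C sketch of that arithmetic, not HEPK source. The helper names are hypothetical; the only hardware facts assumed are the GICv2 INTID ranges and the GICD_ISENABLERn base offset of 0x100.

```c
#include <assert.h>
#include <stdint.h>

/* GICv2 INTID ranges: SGIs 0-15, PPIs 16-31, SPIs from 32 up. */
enum { GIC_PPI_BASE = 16, GIC_SPI_BASE = 32 };

/* GICD_ISENABLERn (set-enable) registers start at distributor offset 0x100. */
enum { GICD_ISENABLER_OFFSET = 0x100 };

/* Hypothetical helper: map a PPI number (0-15) to its INTID. */
static uint32_t ppi_to_intid(uint32_t ppi)
{
    assert(ppi < 16);
    return GIC_PPI_BASE + ppi;
}

/* Byte offset of the GICD_ISENABLERn word that covers this INTID. */
static uint32_t isenabler_offset(uint32_t intid)
{
    return GICD_ISENABLER_OFFSET + 4u * (intid / 32u);
}

/* Bit mask for this INTID within that word. */
static uint32_t isenabler_mask(uint32_t intid)
{
    return UINT32_C(1) << (intid % 32u);
}
```

For the non-secure physical timer this gives INTID 30, enabled through bit 30 of the word at distributor offset 0x100 (GICD_ISENABLER0, which is banked per core for SGIs and PPIs).
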
- GNU AArch64 bare-metal toolchain (`aarch64-none-elf-`). Grab from <https://developer.arm.com/Tools and Software/GNU Toolchain>; any 13.x or newer release works.
- Either `make` or just `bash` (the project ships both).
- For boot-testing: `qemu-system-aarch64` 7.0+.

# Edit build.sh's TOOLCHAIN_BIN to point at your toolchain
bash build.sh
# Or, if you have make:
make

Output:
- `hepk.elf`: ELF for QEMU's `-kernel` and gdb.
- `hepk.bin`: flat binary for raw boot media.

qemu-system-aarch64 -M virt -cpu cortex-a72 -nographic -kernel hepk.elf

Expected first ten seconds (counters on the last lines grow over time):
[HEPK] High Efficiency Protocol Kernel v0.1.0
[HEPK] EL1 vector table installed at VBAR_EL1.
[HEPK] MMU + I/D cache active (identity, 1GB blocks).
[HEPK] CNTFRQ_EL0 = 0x0000000003b9aca0 Hz
[HEPK] 100ms busy-wait delta = 0x00000000005f6b58 ticks
[HEPK] vec3_dot((1,2,3),(4,5,6)) = 0x0000000000000020 (expect 0x20)
[HEPK] mat4*vec4 lane0 bits = 0x000000003f800000 (expect 0x3F800000)
[HEPK] GICv2 distributor + CPU interface online.
[HEPK] CNTP_NS armed at 1ms; IRQ unmasked.
[HEPK] tick delta over 250ms = 0x00000000000000fa (expect 0xFA = 250)
[HEPK] scheduler started: 3 tasks (idle + work_a + work_b).
[HEPK] Orchestrator idle. Awaiting interrupts.
[HEPK] work_a counter = 0x000000000b62a502 work_b counter = 0x000000000ceefb04 ticks = 0x00000000000002ee
[HEPK] work_a counter = 0x0000000016e2076c work_b counter = 0x0000000019a5f1d1 ticks = 0x00000000000004e3
...
To exit QEMU: Ctrl-A x.
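
The timer lines in the log above can be sanity-checked by hand: 0x3b9aca0 is 62,500,000 Hz, so 100 ms should be about 6,250,000 counts, and the logged 0x5f6b58 is 6,253,400. The sketch below is a host-side C model of the tick/nanosecond conversion, not the code behind hepk_timer_ns; the function names are hypothetical and the split-multiply approach is an assumption about one overflow-safe way to do it.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC UINT64_C(1000000000)

/* Convert counter ticks to nanoseconds.
 * Splitting into whole seconds plus a remainder keeps the intermediate
 * product (rem * 1e9) inside 64 bits, provided freq_hz is below roughly
 * 1.8e10 (UINT64_MAX / 1e9). */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    assert(freq_hz != 0);
    uint64_t secs = ticks / freq_hz;
    uint64_t rem  = ticks % freq_hz;
    return secs * NS_PER_SEC + (rem * NS_PER_SEC) / freq_hz;
}

/* Convert nanoseconds to counter ticks, rounding down.
 * Same split, same bound on freq_hz. */
static uint64_t ns_to_ticks(uint64_t ns, uint64_t freq_hz)
{
    uint64_t secs = ns / NS_PER_SEC;
    uint64_t rem  = ns % NS_PER_SEC;
    return secs * freq_hz + (rem * freq_hz) / NS_PER_SEC;
}
```

With the logged values, `ticks_to_ns(0x5f6b58, 0x3b9aca0)` is 100,054,400 ns, so the 100 ms busy-wait overshot by about 54 µs in that run.
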
HEPK/
├── README.md CHANGELOG.md LICENSE
├── build.sh shell builder (portable)
├── Makefile classic builder (mirrors build.sh)
├── link.ld three PT_LOAD layout (RX / R / RW)
├── include/
│ ├── hepk.inc version, TCB layout, public-API index
│ ├── mmu.inc MAIR / TCR / SCTLR + block-descriptor recipes
│ ├── gic.inc GICv2 register offsets, INTIDs
│ └── neon_math.inc AAPCS64 contract for math primitives
└── src/
├── boot/boot.asm
├── drivers/
│ ├── uart.asm (PL011)
│ └── gic.asm (GICv2)
├── core/
│ ├── cpu.asm (FP/SIMD enable)
│ ├── cache.asm (DC ISW invalidate-all)
│ ├── mmu.asm (identity-map bring-up)
│ ├── vectors.asm (VBAR_EL1 + IRQ dispatch + panic)
│ ├── timer.asm (CNTP / busy-wait / 1 ms tick)
│ └── sched.asm (round-robin preemptive)
├── math/neon_math.asm
└── kernel.asm (boot orchestration + 3-task demo)
All routines follow AAPCS64. Pointer arguments in x0, x1, x2;
scalar return in x0 or s0.
// Time
hepk_timer_freq() -> x0
hepk_timer_ticks() -> x0 (CNTPCT_EL0)
hepk_timer_ns() -> x0 (monotonic ns)
hepk_timer_busy_wait_ns(x0 = ns)
hepk_tick_init(x0 = period_ns)
// Scheduling
hepk_task_create(x0=idx, x1=entry, x2=stack_top, x3=name_or_0)
hepk_sched_start(x0 = num_tasks)
hepk_tick_counter // global u64, IRQ-driven
// I/O
uart_init(); uart_putc(x0); uart_puts(x0); uart_put_hex64(x0)
// Math (see include/neon_math.inc for the full set)
vec3_add, vec3_sub, vec3_dot, vec3_cross, vec3_scale, vec3_norm_sq
quat_mul, mat4_mul, mat4_mul_vec4
Full surface and call contracts: include/hepk.inc and include/neon_math.inc.
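
When checking the math primitives it can help to have a scalar reference to compare against. The sketch below is a host-side C model, not part of HEPK: the `vec3` struct, the `ref_` names, and passing by value are assumptions made for illustration (the real routines take pointers in x0..x2, and the in-memory layout is defined in include/neon_math.inc). Results from FMA-based code can differ from this reference in the last bit for general inputs; small integers such as the boot-log test vectors are exact either way.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: three packed floats. */
typedef struct { float x, y, z; } vec3;

/* Scalar dot product. */
static float ref_vec3_dot(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Scalar cross product (right-handed). */
static vec3 ref_vec3_cross(vec3 a, vec3 b)
{
    vec3 r = {
        a.y * b.z - a.z * b.y,
        a.z * b.x - a.x * b.z,
        a.x * b.y - a.y * b.x,
    };
    return r;
}

/* Raw IEEE-754 bit pattern of a float, comparable to the
 * "lane0 bits" value printed in the boot log. */
static uint32_t float_bits(float f)
{
    static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

For the boot-log inputs, `ref_vec3_dot` on (1,2,3) and (4,5,6) gives 32 (0x20), and `float_bits(1.0f)` is 0x3F800000.
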
HEPK is built with hard real-time targets in mind even though v1.0.0 itself is not certified. Patterns the codebase commits to:
- Branch-free, fixed-instruction-count primitives. Worst-case execution time equals typical-case for every NEON math routine.
- Single-walk MMU (1 GiB blocks). No translation-table walks beyond level 1 for any kernel address. TLB pressure is negligible.
- CVAL-anchored tick. The next tick deadline is `CVAL += period`, so a delayed tick does not bleed jitter into the next period.
- Inline IRQ context save. No stacked function call overhead between vector entry and `irq_dispatch`; the saved frame is a fixed 272 bytes.
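
To see why the CVAL-anchored rule matters, compare it with re-arming relative to the current counter value. The sketch below is a host-side C simulation, not HEPK's IRQ handler; the function names and the fixed per-tick latency are assumptions chosen to make the effect visible.

```c
#include <assert.h>
#include <stdint.h>

/* Anchored re-arm: the next deadline derives from the previous deadline. */
static uint64_t rearm_anchored(uint64_t cval, uint64_t period)
{
    return cval + period;
}

/* Relative re-arm (for comparison): the next deadline derives from "now",
 * so handler latency is added to every later deadline. */
static uint64_t rearm_relative(uint64_t now, uint64_t period)
{
    return now + period;
}

/* Simulate n ticks where each handler starts `latency` counts after its
 * deadline. Returns the deadline armed after the n-th tick.
 * anchored != 0 selects the CVAL += period rule. */
static uint64_t simulate(uint64_t period, uint64_t latency,
                         unsigned n, int anchored)
{
    assert(latency < period);             /* otherwise ticks would be missed */
    uint64_t cval = period;               /* first deadline */
    for (unsigned i = 0; i < n; i++) {
        uint64_t now = cval + latency;    /* handler entry time */
        cval = anchored ? rearm_anchored(cval, period)
                        : rearm_relative(now, period);
    }
    return cval;
}
```

With a period of 62,500 counts (1 ms at 62.5 MHz) and 100 counts of latency per tick, 250 anchored ticks end exactly on a multiple of the period, while the relative scheme has drifted by 250 × 100 = 25,000 counts (0.4 ms). Anchoring only holds while the handler latency stays below one period.
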
Measured on QEMU virt + cortex-a72 (which itself is not a precise
hard-RT environment): 1 ms tick over 250 ms is 248–250 (mostly 249).
Real Cortex-A72 silicon should hit exactly 250 absent SError/SMP noise.
v1.1 (next)
- FP/SIMD context save/restore (Q0..Q31) — unblocks math-using tasks.
- Cooperative `hepk_yield()` for sub-tick scheduling.
- Per-section MMU attributes: actual W^X enforcement, kernel/user split.
- Raspberry Pi 4 port: BCM2711 UART/GIC base addresses behind a single `include/platform_*.inc` overlay selected at build time.
- Optional `-icount` build target for repeatable QEMU timing.

Out-of-scope for the foreseeable future
- SMP, file systems, networking, dynamic loading, user-space ABI stability. HEPK is intentionally a kernel skeleton — bring your own application layer.
Issues and patches welcome. Requirements before merging:
- Builds cleanly under `aarch64-none-elf-` 13.x with no new warnings.
- Boots in `qemu-system-aarch64 -M virt -cpu cortex-a72 -nographic -kernel hepk.elf` and reaches the scheduler demo.
- New primitives in `src/math/` must be branch-free and fixed-cycle.
- New IRQ paths must respect the "no `bl` between SP-save and SP-load in `hepk_schedule`" invariant.

MIT — see LICENSE. The disclaimer at the top of this README applies in addition to (not in place of) the MIT warranty disclaimer.