M1-16 Path B: BIOS real-mode I/O closes trampoline gap #1

Merged
UnbreakableMJ merged 10 commits into main from m1-16-path-b on Apr 28, 2026
@UnbreakableMJ UnbreakableMJ commented Apr 28, 2026

Collaborator

Summary

  • M1-16 Path B: every BIOS interrupt call now runs in real
    mode, before CR0.PE is set. Protected-mode kmain consumes a
    BootDataBundle at phys 0x1000 and never returns to real
    mode. The legacy call_bios_int 32→real→32 trampoline that
    returned AH=0x01 is gated behind the legacy_trampoline
    Cargo feature for regression comparison.
  • Re-enables bios-boot-smoke in the boot-smoke suite
    (deferred since db68d69 while M1-16 was incomplete). CI's
    qemu-smoke job now boots the real BIOS chain (stage1 MBR +
    stage2 + FAT32 partition) end-to-end and asserts the
    ZAMAK / LIMINE_PROTOCOL_OK serial sentinels alongside
    the existing UEFI case.
  • Closes the last `[~]` `boot-smoke` item in `TODO.md`
    (`155 → 156` done; `3 → 2` partial). Deletes the working
    design doc `M1-16-PATH-B.md`.

Known follow-up: `zamak_core::config::parse` infinite-loops
in the Path B i686 codegen even with the memset/memcpy
intrinsics fixed. Until that's diagnosed the BIOS path uses
a hard-coded `/kernel.elf` entry (TODO in `kmain`); UEFI is
unaffected.

Test plan

  • `qemu-smoke` CI job: `bios-boot-smoke` PASSes (new) and
    `uefi-boot-smoke` still PASSes.
  • `qemu-smoke` CI job: `asm-wrapper-state-check` +
    `uefi-linux-boot` still PASS (regression canaries).
  • `clippy` job clean against both default and
    `--features legacy_trampoline`.
  • `fmt` job clean.
  • `cargo test -p zamak-core` clean — new
    `ram_fat32::tests` 9 cases plus the existing 221.
  • `deny` / `audit` jobs clean.
  • `cross` job: AArch64 / RISC-V / LoongArch builds still
    clean.

Locally verified: all four QEMU cases PASS under QEMU 9.x in
a `nix-shell -p gcc -p mtools -p util-linux -p binutils -p
qemu -p OVMF` environment.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added FAT32 filesystem walker (zamak-core::ram_fat32) with long-filename support
  • Bug Fixes

    • Fixed BIOS memory operations to eliminate recursive calls
    • Fixed HHDM paging sizing to properly handle non-usable memory entries and cap at 16 GiB
  • Tests

    • Re-enabled BIOS boot end-to-end smoke testing
  • Chores

    • Introduced optional legacy_trampoline feature for backward compatibility
    • Restructured BIOS boot flow with improved BootDataBundle handoff to kernel

UnbreakableMJ and others added 10 commits April 25, 2026 02:31
Introduce the feature flag and split `entry.rs` so the `call_bios_int`
32→real→32 trampoline plus `esp_save_ptr` storage live behind
`#[cfg(feature = "legacy_trampoline")]`. The default build no longer
emits that asm block. Gate the matching `BiosRegs` struct, extern
declaration, and `disk` / `mmap` / `vbe` modules behind the same
feature.

A stub `kmain(bundle_phys: u32)` replaces the current legacy body on
the default path — drops 'K' on COM1 and halts. The legacy body is
renamed and gated; both builds (default + `--features
legacy_trampoline`) compile cleanly on nightly against the
`i686-zamak.json` target. UEFI regression canary still green.

Phase 0 of M1-16-PATH-B.md; no behavior change yet on the working
legacy path.
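
The feature split described above follows the usual `#[cfg(feature = ...)]` pattern. A minimal sketch, assuming a hypothetical `active_boot_path` function that stands in for the two gated `kmain` bodies (the real `entry.rs` gates asm blocks and modules, not a string):

```rust
/// Hypothetical stand-in for the legacy body, compiled only when the
/// `legacy_trampoline` feature is enabled.
#[cfg(feature = "legacy_trampoline")]
pub fn active_boot_path() -> &'static str {
    "legacy call_bios_int trampoline"
}

/// Hypothetical stand-in for the default (Path B) body. Exactly one of the
/// two definitions exists in any given build, so callers need no cfg of
/// their own.
#[cfg(not(feature = "legacy_trampoline"))]
pub fn active_boot_path() -> &'static str {
    "Path B (real-mode I/O, no trampoline)"
}
```

Building once with default features and once with `--features legacy_trampoline` exercises both definitions, which is what the commit's "both builds compile cleanly" check amounts to.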

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New `boot_bundle` module defines the real-mode → kmain handoff
record: `BootDataBundle` at fixed phys `0x0000_1000`, magic
`ZBDL_MAGIC` (`0x4C44_425A` — design doc's `0x4C42_445A` would have
decomposed to "ZDBL"; the constant here spells "ZBDL" in LE bytes).
128 E820 entries, 4 KiB config buffer, VBE mode info, RSDP / SMBIOS
phys, kernel phys+len.

`E820Entry` and `VbeModeInfo` move here from `mmap` and `vbe` so both
legacy and Path B paths share the BIOS-layout records. Compile-time
asserts verify sizes, offsets, magic LE bytes, and that the bundle
stays under 12 KiB — well below the 0x4000 mark where the FAT32
bounce buffer will sit.

`bundle()` / `bundle_mut()` helpers (marked `#[allow(dead_code)]`
until Phase 5 / Phase 6) encapsulate the fixed-address cast. Host
tests live as `const _: () = { assert!(...) }` because the bin
crate can't host `#[cfg(test)]`.
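
The magic-value decomposition and the `const _: () = { assert!(...) }` style of layout check can be sketched as follows. Only the two magic values, the 128-entry E820 table, the 4 KiB config buffer, and the 12 KiB bound come from the commit message; the struct fields and the 24-byte E820 record size are assumptions made for illustration, not the real `boot_bundle` layout.

```rust
use std::mem::size_of;

/// Magic from the commit message; its little-endian bytes spell "ZBDL".
pub const ZBDL_MAGIC: u32 = 0x4C44_425A;
/// Value from the design doc, which decomposes to "ZDBL" instead.
pub const DESIGN_DOC_MAGIC: u32 = 0x4C42_445A;

/// Assumed BIOS-layout E820 record (base, length, type, ACPI attributes).
#[repr(C, packed)]
#[derive(Clone, Copy)]
pub struct E820Entry {
    pub base: u64,
    pub length: u64,
    pub kind: u32,
    pub acpi_attrs: u32,
}

/// Hypothetical, reduced bundle: the real struct carries more fields
/// (VBE mode info, RSDP / SMBIOS phys, kernel phys+len).
#[repr(C, packed)]
pub struct BootDataBundle {
    pub magic: u32,
    pub e820_count: u32,
    pub e820: [E820Entry; 128],
    pub config: [u8; 4096],
}

// Compile-time checks: a violated assert fails the build, so no test
// harness is needed in a bin crate.
const _: () = {
    let b = ZBDL_MAGIC.to_le_bytes();
    assert!(b[0] == b'Z' && b[1] == b'B' && b[2] == b'D' && b[3] == b'L');
    assert!(size_of::<E820Entry>() == 24);
    assert!(size_of::<BootDataBundle>() < 12 * 1024);
};
```

With the bundle based at 0x1000, a size under 12 KiB keeps its end below 0x4000.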

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New `rm_io` module with six 16-bit asm wrappers linked into
`.entry` and callable by near-ret from the `_start` orchestration:

  rm_disk_read_ext   INT 13h AH=42h Extended Read
  rm_e820_next       INT 15h AX=E820h (one entry per call)
  rm_vbe_info        INT 10h AX=4F00h (controller info)
  rm_vbe_mode_info   INT 10h AX=4F01h (per-mode info)
  rm_vbe_set_mode    INT 10h AX=4F02h (activate mode)
  rm_outb_com1       outb to 0x3F8 for serial breadcrumbs

Each routine uses a register-based ABI documented at its header
rather than Rust's `extern "C"` cdecl — rustc on `i686-zamak`
doesn't emit 16-bit-safe prologues, so the orchestration calls them
from asm with matching conventions.

Objdump of the built binary confirms each routine decodes as
16-bit code with the correct BIOS opcode sequence. The bare `ret`
mnemonic under Intel-syntax `.code16` produces `66 c3` (32-bit
retd) — documented in the module header; each routine ends with
an explicit `.byte 0xC3` instead so real-mode stack discipline
is preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extend `rm_io` with two more 16-bit wrappers:

  rm_unreal_enter     one-shot PE-on → load flat FS → PE-off, leaving
                      FS with a 4 GiB descriptor cache after the CPU
                      is back in real mode.
  rm_memcpy_to_high   rep movsb to a 32-bit destination > 1 MiB,
                      achieved by transiently flipping ES to flat for
                      the duration of the copy and restoring real-mode
                      ES via push/pop around it.

Neither routine touches DS, so subsequent INT 13h / 15h / 10h calls
against `[DS:SI]` still see the caller-expected real-mode descriptor.
The existing GDT's selector 0x10 is reused as the flat 32-bit data
descriptor — no new table.

The destination-side flat cache is populated by a separate PE round
trip inside `rm_memcpy_to_high` rather than a `fs:` prefix on movsb,
because on x86 string instructions a segment-override prefix changes
only the source segment (DS:SI); the destination is always ES:DI.

Objdump confirms the Phase 3 bytes: `0f 01 16 bc 80` lgdt, `0f 22 c0`
mov cr0 eax, `67 f3 a4` addr32 rep movsb.
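
For readers unfamiliar with what a "flat 32-bit data descriptor" looks like, here is a sketch of the standard 8-byte x86 segment descriptor encoding. The PR text does not show the actual GDT contents, so the access byte (0x92) and flags (0xC) below are the conventional values for a flat read/write data segment, assumed rather than taken from the source; `encode_descriptor` and `selector_index` are illustrative names.

```rust
/// Pack base, 20-bit limit, access byte, and 4-bit flags into the legacy
/// 8-byte x86 segment descriptor layout.
pub const fn encode_descriptor(base: u32, limit: u32, access: u8, flags: u8) -> u64 {
    let base = base as u64;
    let limit = (limit & 0x000F_FFFF) as u64;
    (limit & 0xFFFF)                          // limit bits 0..15
        | ((base & 0x00FF_FFFF) << 16)        // base bits 0..23
        | ((access as u64) << 40)             // access byte
        | ((limit >> 16) << 48)               // limit bits 16..19
        | (((flags & 0x0F) as u64) << 52)     // G, D/B, L, AVL
        | ((base >> 24) << 56)                // base bits 24..31
}

/// Assumed flat data segment: base 0, limit 0xFFFFF with 4 KiB granularity
/// (4 GiB), present, ring 0, writable data, 32-bit.
pub const FLAT_DATA: u64 = encode_descriptor(0, 0xF_FFFF, 0x92, 0xC);

/// A selector's GDT index is the selector with its low three bits
/// (TI and RPL) shifted out.
pub const fn selector_index(selector: u16) -> u16 {
    selector >> 3
}
```

Selector 0x10 therefore names GDT entry 2. Loading it into a segment register while CR0.PE is set caches the 4 GiB limit, and that cached limit survives the return to real mode, which is the effect unreal mode relies on.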

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Simplify the Phase 4 delivery from a full 16-bit-asm FAT32 walker to
a bulk partition preload + protected-mode parse. The `_start`
orchestration will use `rm_load_chunk` to pull ≤16 sectors per
BIOS call into the bounce buffer at phys 0x5000 and then
`rm_memcpy_to_high` into a linear destination; kmain unpacks the
resulting byte slice via `RamDisk`, which adapts `BlockDevice` over
plain memory so the existing `Fat32` parser in `zamak-bios/src/fat32.rs`
runs unchanged.

`rm_load_chunk` takes DL=drive, EBX=LBA, AX=count (1..=16),
EDI=dest, returns AL=0 on success or the BIOS error code. The outer
loop (sector count, advancement) lives in the Phase 5 orchestration
so the asm stays short and stack-discipline simple.
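
The outer loop that the orchestration owns (sector counting and advancement around `rm_load_chunk`) can be modelled on the host as below. This is a sketch in Rust rather than the 16-bit asm the commit uses; `Chunk` and `plan_chunks` are hypothetical names, and the 512-byte sector size is an assumption.

```rust
/// Assumed sector size for BIOS INT 13h extended reads.
pub const SECTOR_SIZE: u32 = 512;
/// Upper bound per call, matching the AX=count (1..=16) contract.
pub const MAX_CHUNK_SECTORS: u32 = 16;

/// Arguments for one `rm_load_chunk`-style call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Chunk {
    pub lba: u32,
    pub count: u16,
    pub dest: u32,
}

/// Split a read of `total_sectors` starting at `start_lba` into calls of at
/// most 16 sectors, advancing the LBA and the linear destination together.
pub fn plan_chunks(start_lba: u32, total_sectors: u32, dest: u32) -> Vec<Chunk> {
    let mut chunks = Vec::new();
    let mut done = 0u32;
    while done < total_sectors {
        let count = (total_sectors - done).min(MAX_CHUNK_SECTORS);
        chunks.push(Chunk {
            lba: start_lba + done,
            count: count as u16,
            dest: dest + done * SECTOR_SIZE,
        });
        done += count;
    }
    chunks
}
```

Under these assumptions an 8 MiB preload is 16384 sectors, or 1024 calls of 16 sectors each.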

`BootDataBundle` grows `partition_image_phys` + `partition_image_len`
(both u64) so kmain knows where the preload landed.

`ram_disk::RamDisk` returns `Error::IoError` on over-read rather than
padding with zeros, preventing the FAT32 walker from misreading a
truncated directory cluster as containing a free-slot marker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The orchestration in `rm_phaseb_orchestrate` populates the
BootDataBundle at phys 0x1000 before the CR0.PE transition: zero
the bundle region, enter unreal mode, walk INT 15h E820 into
`bundle.e820[..count]`, INT 13h Extended Read the MBR, scan the
partition table for a FAT32 / Linux entry, bulk-load 8 MiB of
the boot partition into phys 0x0200_0000 via repeated
rm_load_chunk calls, scan the BIOS vendor region for the RSDP,
and stamp ZBDL_MAGIC last so kmain detects partial init.
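
The partition-table scan step can be illustrated with a host-side sketch. The classic MBR layout (four 16-byte entries at offset 0x1BE, signature 0x55 0xAA at offset 510) is standard; the choice of type bytes 0x0B / 0x0C for FAT32 and 0x83 for Linux is an assumption about what "FAT32 / Linux entry" means here, and `Partition` and `find_boot_partition` are hypothetical names.

```rust
/// Location of a candidate boot partition, taken from an MBR entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Partition {
    pub kind: u8,
    pub start_lba: u32,
    pub sectors: u32,
}

/// Return the first non-empty MBR entry whose type byte is FAT32
/// (0x0B, 0x0C) or Linux (0x83). The accepted type set is an assumption.
pub fn find_boot_partition(mbr: &[u8; 512]) -> Option<Partition> {
    // A valid MBR ends with the 0x55 0xAA boot signature.
    if mbr[510] != 0x55 || mbr[511] != 0xAA {
        return None;
    }
    for i in 0..4 {
        let entry = &mbr[0x1BE + i * 16..0x1BE + (i + 1) * 16];
        let kind = entry[4];
        if !matches!(kind, 0x0B | 0x0C | 0x83) {
            continue;
        }
        let start_lba = u32::from_le_bytes([entry[8], entry[9], entry[10], entry[11]]);
        let sectors = u32::from_le_bytes([entry[12], entry[13], entry[14], entry[15]]);
        if sectors != 0 {
            return Some(Partition { kind, start_lba, sectors });
        }
    }
    None
}
```

The returned `start_lba` is what a bulk preload would use as its first LBA.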

`_start` now saves DL into [0x0401] **before** the 'Z'
breadcrumb's `mov dx, 0x3F8` overwrites it, then calls the
orchestration as a 16-bit near call (raw `0xE8 + .word rel16`
bytes — Intel-syntax `call <symbol>` under `.code16` inside
rustc's global_asm! emits a 32-bit `calll`, mismatched against
the routines' 16-bit `ret` and corrupting SS:SP per call).
`init_32` pushes 0x1000 (the bundle phys) instead of the drive
ID and hands it to kmain.

Live QEMU smoke is green up to the stub kmain: the serial trace
`ZUEMLlRkPK` confirms that every orchestration phase runs and that
the real-mode → protected-mode transition reaches kmain through the
bundle pointer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ramework

Replace the stub kmain with a Path-B kmain that consumes the
`BootDataBundle` populated by the real-mode orchestration: validate
ZBDL magic, convert the BIOS E820 entries into Limine `MemmapEntry`
records, and emit a recognizable checkpoint breadcrumb. Adds an
explicit `cli` at both `init_32` and kmain entry — without an IDT
loaded, any hardware IRQ (PIT timer at vector 0x08) triple-faults.

QEMU smoke green through the Phase-6 checkpoint: serial trace
`ZUEMLlRkPKBE.` confirms every prior phase plus the bundle handoff
and the E820 → Limine conversion. The FAT32 parse + kernel load
+ Limine fulfillment + long-mode entry are deferred to a follow-up
commit; first pass hangs in the inlined `Fat32::parse` path under
nightly's i686 codegen and needs a different parser layout (likely
a non-trait-object FAT32 walker that operates directly on the
partition image bytes).
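
The E820 → Limine conversion mentioned above maps BIOS region types onto Limine memory-map types. A host-side sketch, assuming simplified `E820Entry` and `MemmapEntry` structs (the project's own definitions are not shown in the PR text) and using a `Vec`, which the real `no_std` stage 2 would presumably replace with a fixed buffer:

```rust
/// Simplified BIOS E820 record (type 1 = usable, 2 = reserved,
/// 3 = ACPI reclaimable, 4 = ACPI NVS, 5 = bad memory).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct E820Entry {
    pub base: u64,
    pub length: u64,
    pub kind: u32,
}

/// Simplified Limine memory-map entry (base, length, type as u64).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MemmapEntry {
    pub base: u64,
    pub length: u64,
    pub kind: u64,
}

// Limine memory-map type values.
pub const LIMINE_MEMMAP_USABLE: u64 = 0;
pub const LIMINE_MEMMAP_RESERVED: u64 = 1;
pub const LIMINE_MEMMAP_ACPI_RECLAIMABLE: u64 = 2;
pub const LIMINE_MEMMAP_ACPI_NVS: u64 = 3;
pub const LIMINE_MEMMAP_BAD_MEMORY: u64 = 4;

/// Convert E820 entries, dropping zero-length records, treating unknown
/// types as reserved, and sorting by base address.
pub fn convert_e820(entries: &[E820Entry]) -> Vec<MemmapEntry> {
    let mut out: Vec<MemmapEntry> = entries
        .iter()
        .filter(|e| e.length != 0)
        .map(|e| MemmapEntry {
            base: e.base,
            length: e.length,
            kind: match e.kind {
                1 => LIMINE_MEMMAP_USABLE,
                3 => LIMINE_MEMMAP_ACPI_RECLAIMABLE,
                4 => LIMINE_MEMMAP_ACPI_NVS,
                5 => LIMINE_MEMMAP_BAD_MEMORY,
                _ => LIMINE_MEMMAP_RESERVED,
            },
        })
        .collect();
    out.sort_by_key(|e| e.base);
    out
}
```

Treating unknown E820 types as reserved is the conservative choice, since a kernel must never allocate from a region the firmware did not mark usable.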

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires the BIOS Path B kmain end-to-end: parses zamak.conf out of the
pre-loaded partition image with a new concrete-type FAT32 walker, loads
the kernel ELF segments into a fixed phys window, fulfills the Limine
requests, builds the high-half page tables, and jumps into long mode.
QEMU smoke prints "ZAMAK\nLIMINE_PROTOCOL_OK", confirming the boot-smoke
test kernel runs.

* zamak-core::ram_fat32 — non-trait-object FAT32 walker over &[u8] with
  LFN reassembly. 9/9 host unit tests; replaces the trait-object
  Fat32 path that hung in nightly i686 codegen.
* zamak-bios::utils — replaces the Rust-level memset/memcpy bodies
  (which lowered back into themselves and recursed forever) with raw
  rep stos / rep movs blocks.
* zamak-bios::paging — caps HHDM at 16 GiB and ignores non-USABLE
  e820 entries so QEMU's 1 TiB MMIO hole doesn't blow the bump heap.
* zamak-bios::kmain (Path B) — end-to-end pipeline including a
  hand-rolled minimal ELF64 parser (goblin's Elf::parse hung in this
  codegen) and an explicit segment copy to KERNEL_LOAD_PHYS so the
  kernel runs from its declared virtual base. Stores info.entry at
  the well-known 0x5FF0 scratch the init_64 stub reads.
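
The HHDM sizing rule from the `zamak-bios::paging` bullet above (consider only usable E820 entries, then cap at 16 GiB) can be sketched as follows. `HHDM_MAX_BYTES` is named in the review summary of this PR; the `Region` struct and `hhdm_limit` function are hypothetical.

```rust
/// Cap on the higher-half direct map, per the commit message.
pub const HHDM_MAX_BYTES: u64 = 16u64 << 30;
/// E820 type for usable RAM.
pub const E820_USABLE: u32 = 1;

/// Simplified E820 region for illustration.
#[derive(Debug, Clone, Copy)]
pub struct Region {
    pub base: u64,
    pub length: u64,
    pub kind: u32,
}

/// Highest end address among usable regions, capped at `HHDM_MAX_BYTES`.
/// Reserved, ACPI, and MMIO regions are ignored, so a high MMIO hole does
/// not inflate the number of page tables that must be allocated.
pub fn hhdm_limit(regions: &[Region]) -> u64 {
    regions
        .iter()
        .filter(|r| r.kind == E820_USABLE)
        .map(|r| r.base.saturating_add(r.length))
        .max()
        .unwrap_or(0)
        .min(HHDM_MAX_BYTES)
}
```

With 2 GiB of RAM and a reserved region at 1 TiB, the limit stays at 2 GiB rather than growing to cover the MMIO hole.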

The stage 2 binary shrank from 122 to 25 sectors (goblin + config::parse
removed from the BIOS path). config::parse still hangs separately in
this codegen, so the BIOS path uses a hard-coded /kernel.elf entry until
that's diagnosed (TODO in kmain).
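
A hand-rolled minimal ELF64 parser of the kind mentioned in the `kmain` bullet needs little more than the entry point and the PT_LOAD program headers. The sketch below is not the PR's parser; it uses the standard ELF64 header offsets, and `LoadSegment`, `ElfInfo`, and `parse_elf64` are illustrative names. It runs on the host with `Vec`, which a `no_std` loader would replace.

```rust
/// One PT_LOAD program header, reduced to the fields a loader needs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LoadSegment {
    pub offset: u64,
    pub vaddr: u64,
    pub filesz: u64,
    pub memsz: u64,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ElfInfo {
    pub entry: u64,
    pub segments: Vec<LoadSegment>,
}

// Little-endian readers; callers pass slices already bounds-checked.
fn le_u16(b: &[u8], off: usize) -> u16 {
    u16::from_le_bytes([b[off], b[off + 1]])
}
fn le_u32(b: &[u8], off: usize) -> u32 {
    u32::from_le_bytes([b[off], b[off + 1], b[off + 2], b[off + 3]])
}
fn le_u64(b: &[u8], off: usize) -> u64 {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&b[off..off + 8]);
    u64::from_le_bytes(raw)
}

/// Parse a little-endian ELF64 image and collect its PT_LOAD segments.
/// Returns `None` on any malformed or out-of-bounds field.
pub fn parse_elf64(image: &[u8]) -> Option<ElfInfo> {
    let ehdr = image.get(..64)?;
    // Magic, EI_CLASS = 2 (64-bit), EI_DATA = 1 (little-endian).
    if !ehdr.starts_with(b"\x7fELF") || ehdr[4] != 2 || ehdr[5] != 1 {
        return None;
    }
    let entry = le_u64(ehdr, 0x18);
    let phoff = le_u64(ehdr, 0x20) as usize;
    let phentsize = le_u16(ehdr, 0x36) as usize;
    let phnum = le_u16(ehdr, 0x38) as usize;
    if phnum > 0 && phentsize < 56 {
        return None;
    }
    let mut segments = Vec::new();
    for i in 0..phnum {
        let start = phoff.checked_add(i.checked_mul(phentsize)?)?;
        let phdr = image.get(start..start.checked_add(56)?)?;
        if le_u32(phdr, 0) != 1 {
            continue; // not PT_LOAD
        }
        let seg = LoadSegment {
            offset: le_u64(phdr, 0x08),
            vaddr: le_u64(phdr, 0x10),
            filesz: le_u64(phdr, 0x20),
            memsz: le_u64(phdr, 0x28),
        };
        // File-backed bytes must lie inside the image and fit in memsz.
        let end = seg.offset.checked_add(seg.filesz)?;
        if end > image.len() as u64 || seg.filesz > seg.memsz {
            return None;
        }
        segments.push(seg);
    }
    Some(ElfInfo { entry, segments })
}
```

A loader would then copy `filesz` bytes from `offset` for each segment and zero the remaining `memsz - filesz` bytes.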

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…smoke

Now that the Path B kmain runs end-to-end, the BIOS boot path no
longer needs the trait-object FAT32 stack at default-feature time.

* Re-enables `bios-boot-smoke` in `zamak-test`'s `boot-smoke` suite;
  it was deferred in db68d69 while M1-16 was incomplete. Local QEMU
  passes both `bios-boot-smoke` and `uefi-boot-smoke`.
* Gates `fat32` and `input` modules behind `legacy_trampoline` —
  Path B reads from a `RamFat32` walker over the bundle's partition
  image and skips the BIOS-keyboard TUI for now.
* Gates `find_rsdp` behind `legacy_trampoline`. Path B kmain reads
  `bundle.rsdp_phys`, populated by `rm_phaseb_orchestrate` in real
  mode.
* Deletes the unused `ram_disk` adapter — superseded by the
  byte-slice walker.
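
For context on the `bundle.rsdp_phys` bullet above: an RSDP scan looks for the 8-byte signature `"RSD PTR "` on 16-byte boundaries and accepts a candidate only if its first 20 bytes sum to zero modulo 256, as the ACPI specification describes. The PR does this in 16-bit asm; the sketch below is a host-side Rust equivalent, and this `find_rsdp` is an illustrative function over a byte slice, not the gated function of the same name.

```rust
/// Scan `region` for an ACPI RSDP and return its offset within the slice.
/// A candidate must start on a 16-byte boundary, begin with "RSD PTR ",
/// and have an ACPI 1.0 checksum (first 20 bytes) of zero.
pub fn find_rsdp(region: &[u8]) -> Option<usize> {
    let mut off = 0;
    while off + 20 <= region.len() {
        let candidate = &region[off..off + 20];
        if candidate.starts_with(b"RSD PTR ") {
            let sum = candidate.iter().fold(0u8, |acc, &b| acc.wrapping_add(b));
            if sum == 0 {
                return Some(off);
            }
        }
        off += 16;
    }
    None
}
```

The physical address is the region's base plus the returned offset. The conventional BIOS search window is 0xE0000 to 0xFFFFF, which is presumably what the earlier commit calls the BIOS vendor region.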

`cargo build -p zamak-bios --features legacy_trampoline` still
compiles clean against -D warnings, so the legacy path remains
available for regression comparisons.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Path B is green: bios-boot-smoke + uefi-boot-smoke both pass under
QEMU. The trampoline path is gated behind legacy_trampoline, the
real-mode orchestration owns every BIOS call, and the kernel
runs end-to-end through long mode.

* TODO.md: flip M1-16 [~] → [✓]; partial count 3 → 2; total 155 → 156.
  Update TEST-5 note now that the BIOS leg of boot-smoke is back.
* CHANGELOG.md: [Unreleased] entry covering the Path B refactor,
  the memset/memcpy recursion fix, the HHDM-sizing cap, and the
  bios-boot-smoke re-enable.
* Delete M1-16-PATH-B.md — the working design doc has now been
  executed; superseded by the commits on this branch and the
  follow-up notes in TODO.md / CHANGELOG.md.

The two remaining [~] items (M6-1 LoongArch UEFI target, M6-3 hardware
perf baseline) are both blocked outside the workspace (rustc upstream
and bare-metal hardware respectively).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 28, 2026


Caution

Review failed

Pull request was closed or merged during review

Walkthrough

This pull request implements the M1-16 Path B BIOS restructuring: the boot flow now performs all BIOS I/O in real mode before entering protected mode and passes the results to kmain via a BootDataBundle at fixed physical address 0x1000. It adds new real-mode orchestration routines, a FAT32 filesystem walker, and inline-assembly memory utilities. The legacy trampoline is now gated behind the optional legacy_trampoline feature.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation<br/>`CHANGELOG.md`, `TODO.md`, `M1-16-PATH-B.md` | Recorded completion of M1-16 Path B BIOS restructuring, updated test coverage status, and deleted the now-implemented design document. |
| Build Configuration<br/>`zamak-bios/Cargo.toml` | Added optional legacy_trampoline Cargo feature to conditionally include legacy BIOS trampoline code. |
| Boot Data Structure<br/>`zamak-bios/src/boot_bundle.rs` | New ABI-focused module defining BootDataBundle layout at physical 0x1000 with packed E820Entry, VbeModeInfo, and bundle structs, plus unsafe accessors and compile-time layout assertions. |
| Real-Mode BIOS Orchestration<br/>`zamak-bios/src/rm_io.rs` | New 1000+ line module embedding 16-bit assembly wrappers for BIOS interrupts (INT 13h extended disk read, INT 15h E820 enumeration, INT 10h VBE modes), unreal-mode transitions, and phase B orchestration that populates BootDataBundle at 0x1000. |
| Boot Entry Point<br/>`zamak-bios/src/entry.rs` | Refactored to call real-mode rm_phaseb_orchestrate before entering protected mode; kmain now receives BootDataBundle physical address instead of boot drive ID; legacy call_bios_int trampoline moved to conditional legacy_trampoline feature block. |
| Kernel Entry Logic<br/>`zamak-bios/src/main.rs` | New non-legacy kmain(bundle_phys: u32) path consuming BootDataBundle, performing E820→Limine conversion, mounting FAT32, loading kernel ELF via custom minimal parser, fulfilling Limine requests, setting up paging, and jumping to long mode. Conditional compilation gates legacy trampoline dependencies (disk, mmap, vbe, fat32, input). |
| FAT32 Filesystem Walker<br/>`zamak-core/src/ram_fat32.rs` | New borrowed-slice FAT32 parser supporting directory traversal, VFAT long-filename reconstruction with case-insensitive matching, and cluster-chain file reads, with synthesized test images validating parsing and edge cases. |
| Shared Data Structures<br/>`zamak-bios/src/mmap.rs`, `zamak-bios/src/vbe.rs` | Refactored to import E820Entry and VbeModeInfo from crate::boot_bundle instead of local definitions, eliminating duplication and ensuring consistent packed layout across boot phases. |
| Memory Utilities<br/>`zamak-bios/src/utils.rs` | Replaced Rust pointer-based memset/memcpy/memmove with explicit inline rep assembly instructions to avoid recursive self-calls; updated memcmp to indexed loop to prevent libcall lowering. |
| Paging Setup<br/>`zamak-bios/src/paging.rs` | Fixed HHDM sizing to derive limit from highest usable E820 entries only (ignoring reserved/ACPI/MMIO) and capped at new HHDM_MAX_BYTES (16 GiB) to constrain page-table allocation. |
| FAT32 Cleanup<br/>`zamak-bios/src/fat32.rs` | Removed extraneous whitespace in Fat32::parse. |
| Module Exports<br/>`zamak-bios/src/main.rs`, `zamak-core/src/lib.rs` | Added pub mod boot_bundle and pub mod rm_io to BIOS exports; added pub mod ram_fat32 to core exports; added conditional feature-gated module declarations. |
| CI Test Activation<br/>`zamak-test/src/main.rs` | Re-enabled bios-boot-smoke end-to-end test case, validating BIOS boot chain against same serial sentinel patterns (ZAMAK, LIMINE_PROTOCOL_OK) as UEFI path. |

Sequence Diagram(s)

sequenceDiagram
    participant RM as Real Mode
    participant BIOS as BIOS Interrupts
    participant BD as BootDataBundle<br/>(0x1000)
    participant MEM as System Memory<br/>(E820)
    participant DISK as Disk Controller
    
    RM->>RM: _start: preserve boot drive<br/>enable interrupts
    RM->>RM: Call rm_phaseb_orchestrate
    
    RM->>BIOS: INT 15h AX=E820h<br/>(enumerate memory)
    BIOS-->>RM: E820 entries
    RM->>BD: Store E820 table
    
    RM->>DISK: INT 13h AH=42h<br/>(read MBR)
    DISK-->>RM: MBR sector
    RM->>RM: Scan partition table<br/>for FAT32/Linux
    
    RM->>RM: Enter unreal mode<br/>(flat addressing)
    RM->>DISK: INT 13h AH=42h<br/>(chunk reads)
    DISK-->>RM: Kernel partition image
    RM->>MEM: Copy to high memory<br/>(via rm_memcpy_to_high)
    
    RM->>MEM: Scan BIOS vendor region<br/>(for RSDP)
    RM->>BD: Store RSDP pointer
    
    RM->>BD: Stamp ZBDL_MAGIC last
    RM->>RM: Disable interrupts<br/>Enter protected mode
    RM-->>RM: Return (0x1000 ready)
sequenceDiagram
    participant PM as Protected Mode
    participant Bundle as BootDataBundle<br/>(0x1000)
    participant FS as FAT32 Walker<br/>(ram_fat32)
    participant ELF as ELF Loader
    participant MM as Paging System
    participant LM as Long Mode
    
    PM->>PM: init_32: set stack (0x8000)<br/>call kmain(bundle_phys=0x1000)
    PM->>Bundle: Read BootDataBundle
    Bundle-->>PM: E820 entries, boot drive,<br/>kernel partition loaded
    
    PM->>FS: Parse FAT32 boot sector<br/>(from loaded partition)
    FS-->>PM: FAT32 metadata
    PM->>FS: find_path("/kernel.elf")
    FS-->>PM: Directory entry facts
    PM->>FS: read_file() into fixed window<br/>(0x2000000)
    FS-->>PM: Kernel ELF buffer
    
    PM->>ELF: Parse ELF64 header
    ELF-->>PM: PT_LOAD segments
    PM->>ELF: Copy segments to load window
    ELF-->>PM: Kernel image ready
    
    PM->>PM: Scan ELF for Limine requests
    PM->>MM: Fulfill Limine requests<br/>(memory map, framebuffer, etc.)
    PM->>MM: Build identity/HHDM paging
    MM-->>PM: Page tables set
    
    PM->>PM: Store long-mode entry (0x5FF0)
    PM->>LM: Call enter_long_mode
    LM-->>PM: init_64 (64-bit kernel exec)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Huzzah! The Path B bunny hops,
Real-mode BIOS never stops,
Boot data bundled, 0x1000 deep,
FAT32 walks where clusters leap,
No trampoline—just pure, swift flow! 🎯

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: implementing M1-16 Path B architecture where BIOS real-mode I/O runs before protected mode, closing the gap left by the legacy trampoline. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@UnbreakableMJ UnbreakableMJ merged commit b8899fb into main Apr 28, 2026
10 of 11 checks passed
UnbreakableMJ added a commit that referenced this pull request Apr 28, 2026
Re-flows the trailing comments and line wraps that the CI fmt job
flagged on PR #1. No semantic changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>