Add initial H extension support (Without Full AIA support) #120

Open
inochisa wants to merge 96 commits into SpinalHDL:dev from project-inochi:ext-h

@inochisa
Contributor

@inochisa inochisa commented Feb 21, 2026

Add basic support for the H extension.

This PR includes the following parts:

  1. Introduce a new method to remap S-mode CSR accesses made in VS-mode to the corresponding VS CSRs, and add the necessary API for HS-mode permission checks.
  2. Add basic hypervisor CSR support.
  3. Adjust TrapPlugin to support VS-level interrupts.
  4. Adapt sstc/smcsrind/sscsrind to the hypervisor extension.
  5. G-stage MMU support.

This PR does not include:

  1. Full AIA extension support for the H extension.

Edit:
20260316: Add G-stage MMU support

@inochisa inochisa force-pushed the ext-h branch 2 times, most recently from e377e14 to 6fba459 Compare March 2, 2026 04:47
@looongbin looongbin force-pushed the ext-h branch 2 times, most recently from d980e58 to 831055e Compare March 4, 2026 04:02
@looongbin looongbin force-pushed the ext-h branch 2 times, most recently from a51f9e8 to acf2678 Compare March 5, 2026 07:17
@inochisa inochisa marked this pull request as draft March 9, 2026 02:57
@inochisa inochisa changed the title Add initital H extension support (Without G-stage MMU/Full AIA support) Add initital H extension support (Without Full AIA support) Mar 16, 2026
@inochisa inochisa force-pushed the ext-h branch 2 times, most recently from 26722f4 to 263a799 Compare March 19, 2026 09:32
@inochisa inochisa marked this pull request as ready for review March 19, 2026 09:34
@inochisa
Contributor Author

inochisa commented Mar 19, 2026

The impact of this PR (1 core with PLIC)

Compile options: --build --update-repo no --cpu-type vexiiriscv --cpu-variant debian --with-coherent-dma --no-netlist-cache --with-ethernet --libc-mode full

The H extension is enabled by --with-isa h

LUTs

These are compiled for the XCKU5P.

| Ref | LUTs |
| --- | --- |
| HEAD | 17533 |
| H enabled | 19687 |
| H disabled | 18075 |
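
As a quick check of the LUT numbers above, here is a minimal Python sketch that turns them into relative overheads against HEAD. The helper name `overhead_pct` is illustrative only and not part of the PR.

```python
# LUT counts copied from the table above (XCKU5P build).
luts = {"HEAD": 17533, "H enabled": 19687, "H disabled": 18075}


def overhead_pct(value: int, baseline: int) -> float:
    """Relative increase of value over baseline, in percent."""
    return (value - baseline) / baseline * 100.0


for name in ("H enabled", "H disabled"):
    extra = luts[name] - luts["HEAD"]
    pct = overhead_pct(luts[name], luts["HEAD"])
    print(f"{name}: +{extra} LUTs ({pct:.2f}% over HEAD)")
```

With these figures, enabling H adds roughly 12% LUTs over HEAD, and building this branch with H disabled adds roughly 3%.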

Performance

Coremark:

| Ref | Iterations/Sec |
| --- | --- |
| HEAD | 208.130984 |
| H disabled | 207.569363 |
| H enabled (HS) | 186.811134 |
| H enabled (VS) | 136.425648 |

It looks like this PR costs about 10% of CoreMark performance in HS mode when the H extension is enabled, and this mainly comes from a new pipeline stage in both the LSU and Fetch.

Edit: The performance impact on S mode is removed by inserting the stage-2 translation only when the extension is enabled.
Edit: Add VS mode CoreMark score.
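
To make the "about 10%" figure concrete, the following Python sketch computes how far each CoreMark score sits below HEAD. The scores are copied from the table above; the function name `slowdown_pct` is an illustrative choice.

```python
# CoreMark scores (iterations/sec) copied from the table above.
scores = {
    "HEAD": 208.130984,
    "H disabled": 207.569363,
    "H enabled (HS)": 186.811134,
    "H enabled (VS)": 136.425648,
}


def slowdown_pct(score: float, baseline: float) -> float:
    """How far score falls below baseline, in percent of the baseline."""
    return (baseline - score) / baseline * 100.0


for name, score in scores.items():
    if name != "HEAD":
        print(f"{name}: {slowdown_pct(score, scores['HEAD']):.2f}% below HEAD")
```

On these numbers the HS-mode run is about 10% below HEAD, the VS-mode run about 34% below, and the H-disabled build within about 0.3%.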

@inochisa
Contributor Author

Hi, @Dolu1990

I think this PR is ready for review now. At least it can boot Linux and run a simple bare KVM program (I am still trying to set up a rootfs so that QEMU can be used to boot Linux in KVM mode). It may still have some small bugs, but there should be no more big changes.

@Dolu1990
Member

Hi,

gg ^^

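For the performance, my guess is that in an ASIC it would be totally fine to chain both MMU translations in a single cycle. So FPGA degradation only.

Two things I can see:

* src/main/scala/vexiiriscv/Param.scala has the pipeline stages + 1 hard coded
* Would need to integrate regression tests for H support in https://github.com/SpinalHDL/VexiiRiscv/blob/dev/src/test/scala/vexiiriscv/tester/RegressionSingle.scala
* Would then need to specify how to generate random H generation parameters in https://github.com/SpinalHDL/VexiiRiscv/blob/dev/src/test/scala/vexiiriscv/tester/Regression.scala

That way we can run the regression tests for configs which have H support ^^
Ideally, this should include booting Linux in Linux in a simulation. Doing it in simulation would allow catching any spec mismatch against spike that is exercised by the whole process.
Also, this would allow making changes without too much fear of breaking things without noticing.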

@Dolu1990
Member

Note, CoreMark is quite sensitive to LSU latency, as it does quite a bit of linked-list pointer chasing.

@inochisa
Contributor Author

inochisa commented Mar 25, 2026

> Hi,
>
> gg ^^
>
> For the performance, my guess is that in an ASIC it would be totally fine to chain both MMU translations in a single cycle. So FPGA degradation only.
>
> Two things I can see:
>
> * src/main/scala/vexiiriscv/Param.scala has the pipeline stages + 1 hard coded

Yes, it is hard coded now. However, for a non-H build the new stage could be omitted, as the stpk in Fetch/LSU is damaged and not accessed at all in that case. Still, it is ugly to use something like if (withRvh) X + 1 else X for the parameters, so I guess we may need to define some template for it.
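
One way to avoid scattering `if (withRvh) X + 1 else X` across the parameters is to derive all stage indices in a single place. The Python sketch below only illustrates that idea; the names `LsuStages`, `lsu_stages`, `translate_at` and `ctrl_at` are hypothetical and do not correspond to the real fields in Param.scala.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LsuStages:
    """Hypothetical LSU stage indices; not VexiiRiscv's real parameter names."""
    translate_at: int  # stage where address translation starts
    ctrl_at: int       # stage where the translated address is consumed


def lsu_stages(base_translate_at: int, base_ctrl_at: int, with_rvh: bool) -> LsuStages:
    """Push the stages after translation back by one when the extra stage is built in."""
    extra = 1 if with_rvh else 0
    return LsuStages(translate_at=base_translate_at, ctrl_at=base_ctrl_at + extra)


print(lsu_stages(0, 1, with_rvh=False))
print(lsu_stages(0, 1, with_rvh=True))
```

The conditional then lives in one helper, and a build without H keeps its original stage indices.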

> * Would need to integrate regression tests for H support in https://github.com/SpinalHDL/VexiiRiscv/blob/dev/src/test/scala/vexiiriscv/tester/RegressionSingle.scala
> * Would then need to specify how to generate random H generation parameters in https://github.com/SpinalHDL/VexiiRiscv/blob/dev/src/test/scala/vexiiriscv/tester/Regression.scala
>
> That way we can run the regression tests for configs which have H support ^^

Yes, that's true. It is fine for me to add it.

> Ideally, this should include booting Linux in Linux in a simulation.

Linux could be a problem, not for a synthesized core, but for the simulation. In fact, I managed to boot a mainline OpenSBI in simulation, and it gave me a 1.1 GB trace log and a 5.1 GB FST wave. XD

And at least for now, my team is struggling to boot Linux with a rootfs, since my board has no SD card and I have to boot the kernel over Ethernet.

> Doing it in simulation would allow catching any spec mismatch against spike that is exercised by the whole process. Also, this would allow making changes without too much fear of breaking things without noticing.

My suggestion is adding a batch of unit tests like defermelowie/riscv-hext-asm-tests (which is what I use to test the implementation). That way we can test all the spec requirements with both log and wave. (By the way, I think the spike for VexiiRiscv is too old to support the H extension. XD)

Since lots of unit tests are required, I think a small framework and some common code are needed to invoke the unit tests automatically. For now I use TestBenchServer for this, but it is hard to capture the test state. Anyway, this is more of a thing for the future.

@inochisa
Contributor Author

And some good news: the kvm unit tests have passed.

BUILD_HEAD=86e53277
timeout -k 1s --foreground 90s /usr/bin/lkvm run --kernel /tmp/tmp.nxeSLOa1c6 --cpus 1 --nodefaults --network mode=none --loglevel=warning # --initrd /tmp/tmp.YMA4zI7XFd

##########################################################################
#    kvm-unit-tests
##########################################################################

PASS: sbi: base: Bad FID: SBI_ERR_NOT_SUPPORTED
SKIP: sbi: base: spec_version: missing SBI_SPEC_VERSION environment variable
SKIP: sbi: base: impl_id: missing SBI_IMPL_ID environment variable
SKIP: sbi: base: impl_version: missing SBI_IMPL_VERSION environment variable
PASS: sbi: base: probe_ext: check sbi.error and sbi.value
PASS: sbi: base: probe_ext: unavailable: check sbi.error and sbi.value
SKIP: sbi: base: mvendorid: missing MVENDORID environment variable
SKIP: sbi: base: marchid: missing MARCHID environment variable
SKIP: sbi: base: mimpid: missing MIMPID environment variable
PASS: sbi: time: Bad FID: SBI_ERR_NOT_SUPPORTED
PASS: sbi: time: set_timer: set timer
PASS: sbi: time: set_timer: timer interrupt received
PASS: sbi: time: set_timer: pending timer interrupt bit set in irq handler
PASS: sbi: time: set_timer: pending timer interrupt bit cleared by setting timer to -1
PASS: sbi: time: set_timer: timer delay honored
PASS: sbi: time: set_timer: timer interrupt received exactly once
PASS: sbi: time: set_timer: set timer for mask irq test
PASS: sbi: time: set_timer: timer interrupt received for mask irq test
PASS: sbi: time: set_timer: pending timer interrupt bit set in irq handler for mask irq test
PASS: sbi: time: set_timer: timer delay honored for mask irq test
PASS: sbi: time: set_timer: timer interrupt received exactly once for mask irq test
PASS: sbi: time: set_timer: timer immediately pending by setting timer to 0
PASS: sbi: time: set_timer: pending timer cleared while masked
PASS: sbi: ipi: Bad FID: SBI_ERR_NOT_SUPPORTED
SKIP: sbi: ipi: At least 2 cpus required
PASS: sbi: hsm: Bad FID: SBI_ERR_NOT_SUPPORTED
PASS: sbi: hsm: hart_get_status: status of current hart is started
SKIP: sbi: hsm: no other cpus to run the remaining hsm tests on
PASS: sbi: dbcn: Bad FID: SBI_ERR_NOT_SUPPORTED
DBCN_WRITE_TEST_STRING
PASS: sbi: dbcn: write: write success (error=0)
INFO: sbi: dbcn: write: 1 sbi calls made
DBCN_WRITE_TEST_STRING
PASS: sbi: dbcn: write: page boundary: write success (error=0)
INFO: sbi: dbcn: write: page boundary: 1 sbi calls made
SKIP: sbi: dbcn: write: high boundary: Memory above 4G required
SKIP: sbi: dbcn: write: high page: Memory above 4G required
SKIP: sbi: dbcn: write: invalid parameter: missing INVALID_ADDR environment variable
DBCN_WRITE_BYTE TEST BYTE: a
PASS: sbi: dbcn: write_byte: write success (error=0)
PASS: sbi: dbcn: write_byte: expected ret.value (0)
DBCN_WRITE_BYTE TEST WORD: a
PASS: sbi: dbcn: write_byte: write success (error=0)
PASS: sbi: dbcn: write_byte: expected ret.value (0)
PASS: sbi: susp: Bad FID: SBI_ERR_NOT_SUPPORTED
PASS: sbi: susp: funcid != 0 not supported
PASS: sbi: susp: basic: suspend and resume
PASS: sbi: susp: sleep_type: got expected sbi.error (-3)
PASS: sbi: susp: sleep_type upper bits: suspend and resume
SKIP: sbi: susp: bad addr: missing INVALID_ADDR environment variable
SKIP: sbi: susp: one cpu online: At least 2 cpus required
SKIP: sbi: sse: extension not available
PASS: sbi: fwft: Bad FID: SBI_ERR_NOT_SUPPORTED
PASS: sbi: fwft: base: get reserved feature 0x6: SBI_ERR_DENIED
PASS: sbi: fwft: base: set reserved feature 0x6: SBI_ERR_DENIED
PASS: sbi: fwft: base: get reserved feature 0x3fffffff: SBI_ERR_DENIED
PASS: sbi: fwft: base: set reserved feature 0x3fffffff: SBI_ERR_DENIED
PASS: sbi: fwft: base: get reserved feature 0x80000000: SBI_ERR_DENIED
PASS: sbi: fwft: base: set reserved feature 0x80000000: SBI_ERR_DENIED
PASS: sbi: fwft: base: get reserved feature 0xbfffffff: SBI_ERR_DENIED
PASS: sbi: fwft: base: set reserved feature 0xbfffffff: SBI_ERR_DENIED
PASS: sbi: fwft: misaligned_exc_deleg: Get misaligned deleg feature: SBI_SUCCESS
SKIP: sbi: fwft: misaligned_exc_deleg: missing MISALIGNED_EXC_DELEG_RESET environment variable
PASS: sbi: fwft: misaligned_exc_deleg: Set misaligned deleg feature invalid value 2: SBI_ERR_INVALID_PARAM
PASS: sbi: fwft: misaligned_exc_deleg: Set misaligned deleg feature invalid value 0xFFFFFFFF: SBI_ERR_INVALID_PARAM
PASS: sbi: fwft: misaligned_exc_deleg: Set misaligned deleg with invalid value > 32bits: SBI_ERR_INVALID_PARAM
PASS: sbi: fwft: misaligned_exc_deleg: Set misaligned deleg with invalid flag > 32bits: SBI_ERR_INVALID_PARAM
PASS: sbi: fwft: misaligned_exc_deleg: set to 0: SBI_SUCCESS
PASS: sbi: fwft: misaligned_exc_deleg: get 0 after set
PASS: sbi: fwft: misaligned_exc_deleg: set to 1: SBI_SUCCESS
PASS: sbi: fwft: misaligned_exc_deleg: get 1 after set
SKIP: sbi: fwft: misaligned_exc_deleg: Misaligned load exception does not trap in S-mode
PASS: sbi: fwft: misaligned_exc_deleg: Set misaligned deleg feature value 0 and lock: SBI_SUCCESS
PASS: sbi: fwft: misaligned_exc_deleg: locked: Set to 0 without lock flag: SBI_ERR_DENIED_LOCKED
PASS: sbi: fwft: misaligned_exc_deleg: locked: Set to 0 with lock flag: SBI_ERR_DENIED_LOCKED
PASS: sbi: fwft: misaligned_exc_deleg: locked: Set to 1 without lock flag: SBI_ERR_DENIED_LOCKED
PASS: sbi: fwft: misaligned_exc_deleg: locked: Set to 1 with lock flag: SBI_ERR_DENIED_LOCKED
PASS: sbi: fwft: misaligned_exc_deleg: locked: Get value 0
SKIP: sbi: fwft: pte_ad_hw_updating: not supported by platform
SKIP: sbi: fwft: dbtr: extension not available
SUMMARY: 74 tests, 18 skipped

EXIT: STATUS=1
PASS sbi (74 tests, 18 skipped)
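
For a long log like the one above, it can help to tally the result lines and compare them with the SUMMARY line. The following Python sketch is one way to do that; the `tally` helper is illustrative and not part of kvm-unit-tests, and the sample is a short excerpt of the log above.

```python
from collections import Counter


def tally(log_text: str) -> Counter:
    """Count result lines by their status prefix (PASS, SKIP or FAIL)."""
    counts = Counter()
    for line in log_text.splitlines():
        status, sep, _ = line.partition(": ")
        # Only lines of the form "STATUS: message" are test results.
        if sep and status in ("PASS", "SKIP", "FAIL"):
            counts[status] += 1
    return counts


sample = """\
PASS: sbi: base: Bad FID: SBI_ERR_NOT_SUPPORTED
SKIP: sbi: base: spec_version: missing SBI_SPEC_VERSION environment variable
PASS: sbi: time: set_timer: set timer
SKIP: sbi: ipi: At least 2 cpus required
SUMMARY: 74 tests, 18 skipped
PASS sbi (74 tests, 18 skipped)
"""

counts = tally(sample)
print(dict(counts))
```

Lines such as INFO, SUMMARY and the final "PASS sbi (...)" line are ignored because they do not match the "STATUS: message" shape.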

@inochisa
Contributor Author

inochisa commented Mar 27, 2026

Hi @Dolu1990, we now have Linux running in a KVM virtual machine.

Command

qemu-system-riscv64 \
  -nographic --enable-kvm -machine virt \
  -m 512M -smp cpus=1 -cpu host \
  -kernel /root/Image \
  -initrd /root/initramfs.cpio \
  -append "console=ttyS0,115200n8 earlycon ignore_loglevel"

Boot log

[    0.000000] Booting Linux on hartid 0
[    0.000000] Linux version 7.0.0-rc3-dirty (riscv64-linux-gnu-gcc (GCC) 15.1.0, GNU ld (GNU Binutils) 2.44) #48 SMP PREEMPT Wed Mar 25 10:53:17 CST 2026
[    0.000000] random: crng init done
[    0.000000] Machine model: riscv-virtio,qemu
[    0.000000] SBI specification v3.0 detected
[    0.000000] SBI implementation ID=0x3 Version=0x70000
[    0.000000] SBI TIME extension detected
[    0.000000] SBI IPI extension detected
[    0.000000] SBI RFENCE extension detected
[    0.000000] SBI SRST extension detected
[    0.000000] SBI DBCN extension detected
[    0.000000] SBI FWFT extension detected
[    0.000000] printk: debug: ignoring loglevel setting.
[    0.000000] efi: UEFI not found.
[    0.000000] earlycon: ns16550a0 at MMIO 0x0000000010000000 (options '')
[    0.000000] printk: legacy bootconsole [ns16550a0] enabled
[    0.000000] OF: reserved mem: Reserved memory: No reserved-memory node in the DT
[    0.000000] SBI HSM extension detected
[    0.000000] riscv: base ISA extensions acdfim
[    0.000000] riscv: ELF capabilities acdfim
[    0.000000] Ticket spinlock: enabled
[    0.000000] Zone ranges:
[    0.000000]   DMA32    [mem 0x0000000080000000-0x000000009fffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000080000000-0x000000009fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x000000009fffffff]
[    0.000000] percpu: Embedded 31 pages/cpu s88280 r8192 d30504 u126976
[    0.000000] pcpu-alloc: s88280 r8192 d30504 u126976 alloc=31*4096
[    0.000000] pcpu-alloc: [0] 0 
[    0.000000] Kernel command line: console=ttyS0,115200n8 earlycon ignore_loglevel
[    0.000000] printk: log buffer data + meta data: 131072 + 458752 = 589824 bytes
[    0.000000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
[    0.000000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 131072
[    0.000000] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] rcu:     RCU event tracing is enabled.
[    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=1.
[    0.000000]  Trampoline variant of Tasks RCU enabled.
[    0.000000]  Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000] RCU Tasks: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1 rcu_task_cpu_ids=1.
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] riscv-intc: 64 local interrupts mapped
[    0.000000] riscv: providing IPIs using SBI IPI extension
[    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.000000] clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x171024e7e0, max_idle_ns: 440795205315 ns
[    0.000041] sched_clock: 64 bits at 100MHz, resolution 10ns, wraps every 4398046511100ns
[    0.615826] Console: colour dummy device 80x25
[    0.931881] Calibrating delay loop (skipped), value calculated using timer frequency.. 200.00 BogoMIPS (lpj=400000)
[    1.676157] pid_max: default: 32768 minimum: 301
[    2.069036] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
[    2.573399] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
[    3.255621] VFS: Finished mounting rootfs on nullfs
[    4.047842] riscv: ELF compat mode unsupported
[    4.048055] ASID allocator disabled (0 bits)
[    4.688486] rcu: Hierarchical SRCU implementation.
[    5.020970] rcu:     Max phase no-delay instances is 1000.
[    5.467583] EFI services will not be available.
[    5.804331] smp: Bringing up secondary CPUs ...
[    6.125065] smp: Brought up 1 node, 1 CPU
[    6.421377] Memory: 424740K/524288K available (11777K kernel code, 5982K rwdata, 10240K rodata, 2439K init, 389K bss, 96548K reserved, 0K cma-reserved)
[    7.437406] devtmpfs: initialized
[    7.896456] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    8.567706] posixtimers hash table entries: 512 (order: 1, 8192 bytes, linear)
[    9.103648] futex hash table entries: 256 (16384 bytes on 1 NUMA nodes, total 16 KiB, linear).
[    9.852842] DMI not present or invalid.
[   10.205903] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[   10.657680] DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[   11.237146] audit: initializing netlink subsys (disabled)
[   11.659839] audit: type=2000 audit(0.748:1): state=initialized audit_enabled=0 res=1
[   12.273113] thermal_sys: Registered thermal governor 'step_wise'
[   12.280300] cpuidle: using governor menu
[   13.028902] SBI misaligned access exception delegation ok
[   13.888321] HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages
[   14.863779] HugeTLB: 28 KiB vmemmap can be freed for a 2.00 MiB page
[   16.061318] ACPI: Interpreter disabled.
[   16.629115] iommu: Default domain type: Translated
[   16.970005] iommu: DMA domain TLB invalidation policy: strict mode
[   18.312389] SCSI subsystem initialized
[   18.874083] libata version 3.00 loaded.
[   19.497546] usbcore: registered new interface driver usbfs
[   20.313994] usbcore: registered new interface driver hub
[   20.677283] usbcore: registered new device driver usb
[   21.961094] Advanced Linux Sound Architecture Driver Initialized.
[   23.169633] vgaarb: loaded
[   23.582227] clocksource: Switched to clocksource riscv_clocksource
[   24.957778] pnp: PnP ACPI: disabled
[   27.460399] NET: Registered PF_INET protocol family
[   29.124482] IP idents hash table entries: 8192 (order: 4, 65536 bytes, linear)
[   34.957448] tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 4096 bytes, linear)
[   35.636269] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[   35.953513] TCP established hash table entries: 4096 (order: 3, 32768 bytes, linear)
[   36.889176] TCP bind hash table entries: 4096 (order: 5, 131072 bytes, linear)
[   37.511411] TCP: Hash tables configured (established 4096 bind 4096)
[   38.048264] UDP hash table entries: 256 (order: 2, 16384 bytes, linear)
[   39.150022] UDP-Lite hash table entries: 256 (order: 2, 16384 bytes, linear)
[   40.409102] NET: Registered PF_UNIX/PF_LOCAL protocol family
[   41.368940] RPC: Registered named UNIX socket transport module.
[   42.232025] RPC: Registered udp transport module.
[   42.576651] RPC: Registered tcp transport module.
[   43.672180] RPC: Registered tcp-with-tls transport module.
[   44.064092] RPC: Registered tcp NFSv4.1 backchannel transport module.
[   45.559946] PCI: CLS 0 bytes, default 64
[   45.880960] kvm [1]: hypervisor extension not available
[   46.116329] Unpacking initramfs...
[   46.684292] workingset: timestamp_bits=46 max_order=17 bucket_order=0
[   47.649144] NFS: Registering the id_resolver key type
[   48.235915] Key type id_resolver registered
[   48.400784] Key type id_legacy registered
[   49.212755] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
[   50.144314] nfs4flexfilelayout_init: NFSv4 Flexfile Layout Driver Registering...
[   51.083896] 9p: Installing v9fs 9p2000 file system support
[   51.764965] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 244)
[   52.736528] io scheduler mq-deadline registered
[   52.921565] io scheduler kyber registered
[   53.803327] io scheduler bfq registered
[   54.743506] riscv-plic: plic@c000000: mapped 96 interrupts with 1 handlers for 1 contexts.
[   55.899943] pci-host-generic 30000000.pci: host bridge /soc/pci@30000000 ranges:
[   56.864249] pci-host-generic 30000000.pci:       IO 0x0003000000..0x000300ffff -> 0x0000000000
[   58.788218] pci-host-generic 30000000.pci:      MEM 0x0040000000..0x007fffffff -> 0x0040000000
[   59.132185] pci-host-generic 30000000.pci:      MEM 0x0400000000..0x07ffffffff -> 0x0400000000
[   60.843596] pci-host-generic 30000000.pci: Memory resource size exceeds max for 32 bits
[   61.841066] pci-host-generic 30000000.pci: ECAM at [mem 0x30000000-0x3fffffff] for [bus 00-ff]
[   62.984596] pci-host-generic 30000000.pci: PCI host bridge to bus 0000:00
[   63.840571] pci_bus 0000:00: root bus resource [bus 00-ff]
[   64.064693] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
[   65.291642] pci_bus 0000:00: root bus resource [mem 0x40000000-0x7fffffff]
[   65.568760] pci_bus 0000:00: root bus resource [mem 0x400000000-0x7ffffffff]
[   67.008938] pci 0000:00:00.0: [1b36:0008] type 00 class 0x060000 conventional PCI endpoint
[   68.252824] pci_bus 0000:00: resource 4 [io  0x0000-0xffff]
[   68.484502] pci_bus 0000:00: resource 5 [mem 0x40000000-0x7fffffff]
[   69.715622] pci_bus 0000:00: resource 6 [mem 0x400000000-0x7ffffffff]
[   77.168602] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[   78.220759] printk: legacy console [ttyS0] disabled
[   78.876294] 10000000.serial: ttyS0 at MMIO 0x10000000 (irq = 12, base_baud = 230400) is a 16550A
[   79.993566] printk: legacy console [ttyS0] enabled
[   79.993566] printk: legacy console [ttyS0] enabled
[   81.183521] printk: legacy bootconsole [ns16550a0] disabled
[   81.183521] printk: legacy bootconsole [ns16550a0] disabled
[   83.813724] loop: module loaded
[   84.685123] e1000e: Intel(R) PRO/1000 Network Driver
[   85.307480] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[   86.153988] usbcore: registered new interface driver uas
[   86.851950] usbcore: registered new interface driver usb-storage
[   87.603966] mousedev: PS/2 mouse device common for all mice
[   88.476329] goldfish_rtc 101000.rtc: registered as rtc0
[   89.156799] goldfish_rtc 101000.rtc: setting system clock to 2026-02-10T04:06:11 UTC (1770696371)
[   90.385302] sdhci: Secure Digital Host Controller Interface driver
[   91.155538] sdhci: Copyright(c) Pierre Ossman
[   91.716300] Synopsys Designware Multimedia Card Interface Driver
[   92.477482] sdhci-pltfm: SDHCI platform and OF driver helper
[   93.332083] usbcore: registered new interface driver usbhid
[   94.051494] usbhid: USB HID core driver
[   94.631517] NET: Registered PF_INET6 protocol family
[   95.433896] Segment Routing with IPv6
[   95.916121] In-situ OAM (IOAM) with IPv6
[   96.409049] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[   97.232274] NET: Registered PF_PACKET protocol family
[   97.853961] 9pnet: Installing 9P2000 support
[   98.408778] Key type dns_resolver registered
[  102.009619] Legacy PMU implementation is available
[  109.665507] clk: Disabling unused clocks
[  109.829305] PM: genpd: Disabling unused power domains
[  110.435620] ALSA device list:
[  110.553265]   No soundcards found.
[  114.201551] Freeing initrd memory: 38256K
[  114.659325] Freeing unused kernel image (initmem) memory: 2436K
[  114.857756] Run /init as init process
[  114.979664]   with arguments:
[  115.081629]     /init
[  115.156619]   with environment:
[  115.267657]     HOME=/
[  115.344899]     TERM=linux
Starting syslogd: OK
Running sysctl: OK
Populating /dev using udev: [  118.863881] udevd[72]: starting version 3.2.14
[  119.369982] udevd[73]: starting eudev-3.2.14
done
Starting crond: OK

test login: root
[root@test /]# coremark 0x0 0x0 0x66 0 7 1 2000
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 8127
Total time (secs): 8.127000
Iterations/Sec   : 135.351298
ERROR! Must execute for at least 10 secs for a valid result!
Iterations       : 1100
Compiler version : GCC14.2.0
Compiler flags   : -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64  -O2 -g0    -lrt
Memory location  : Please put data memory location here
                        (e.g. code in flash, data on heap etc)
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0x33ff
Errors detected
[root@test /]# coremark 0x3415 0x3415 0x66 0 7 1 2000
2K validation run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 8063
Total time (secs): 8.063000
Iterations/Sec   : 136.425648
ERROR! Must execute for at least 10 secs for a valid result!
Iterations       : 1100
Compiler version : GCC14.2.0
Compiler flags   : -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64  -O2 -g0    -lrt
Memory location  : Please put data memory location here
                        (e.g. code in flash, data on heap etc)
seedcrc          : 0x18f2
[0]crclist       : 0xe3c1
[0]crcmatrix     : 0x0747
[0]crcstate      : 0x8d84
[0]crcfinal      : 0x30f5
Errors detected
[root@test /]# cat /proc/cpuinfo 
processor       : 0
hart            : 0
isa             : rv64imafdc_zicntr_zicsr_zifencei_zihpm_zaamo_zalrsc_zca_zcd
mmu             : sv39
mvendorid       : 0x0
marchid         : 0x2e
mimpid          : 0x0
hart isa        : rv64imafdc_zicntr_zicsr_zifencei_zihpm_zaamo_zalrsc_zca_zcd
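
Both CoreMark runs above report "Must execute for at least 10 secs for a valid result". As a rough guide, the Python sketch below recomputes the reported rate from the second run (1100 iterations in 8.063 s) and estimates how many iterations a 10 s run would need at that rate. The helper names are illustrative.

```python
import math


def iterations_per_sec(iterations: int, total_secs: float) -> float:
    """CoreMark score as reported in the log: iterations divided by run time."""
    return iterations / total_secs


def min_iterations(rate: float, min_secs: float = 10.0) -> int:
    """Smallest iteration count that runs for at least min_secs at the given rate."""
    return math.ceil(rate * min_secs)


rate = iterations_per_sec(1100, 8.063)
print(f"{rate:.6f} iterations/sec")
print(min_iterations(rate), "iterations for a run of at least 10 s")
```

This assumes the rate stays constant as the iteration count grows, which is only an approximation.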

@inochisa
Contributor Author

I have added a simple doc for this implementation at SpinalHDL/VexiiRiscv-RTD#14

@Dolu1990
Member

Dolu1990 commented Apr 2, 2026

Hi @inochisa,

> Linux could be a problem, not for a synthesized core, but for the simulation. In fact, I managed to boot a mainline OpenSBI in simulation, and it gave me a 1.1 GB trace log and a 5.1 GB FST wave. XD

Right ^^
The solution to that is to additionally use the --dual-sim option.
This will start 2 identical simulations without wave, with a delay of about 1 Mcycle between them.
When the simulation that is ahead crashes, this will enable the wave on the second simulation for that last 1 Mcycle.
That way you have the wave where it is interesting. In other words, the first simulation is used as a trigger for the second simulation to start capturing the wave.

Otherwise, in general, do not use --with-wave at all (nor other similar options).
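
The lead/follower idea described above can be modeled in a few lines. This is a toy Python sketch, not VexiiRiscv's implementation: the `dual_sim` function, the `fails_at` callback and the cycle numbers are all hypothetical, and it assumes both simulations are deterministic so the follower reaches the same failure as the lead.

```python
from typing import Callable, List, Optional, Tuple


def dual_sim(
    fails_at: Callable[[int], bool], delay: int, max_cycles: int
) -> Optional[Tuple[int, List[int]]]:
    """Toy model of a lead simulation triggering wave capture on a delayed follower.

    fails_at(cycle) stands in for one deterministic simulation step and returns
    True on the cycle where the design misbehaves. Returns the failing cycle and
    the cycles for which the follower records a waveform, or None if no failure.
    """
    for lead_cycle in range(max_cycles):
        if fails_at(lead_cycle):
            # The follower lags `delay` cycles behind and has recorded nothing yet.
            follower_cycle = max(0, lead_cycle - delay)
            # Enable wave capture on the follower and let it run up to the failure.
            recorded = list(range(follower_cycle, lead_cycle + 1))
            return lead_cycle, recorded
    return None


result = dual_sim(lambda cycle: cycle == 5000, delay=1000, max_cycles=10000)
if result is not None:
    failing_cycle, recorded = result
    print(f"failure at cycle {failing_cycle}, wave covers {recorded[0]}..{recorded[-1]}")
```

The point of the scheme is that the waveform only covers the window before the failure, instead of the whole run.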

> And at least for now, my team is struggling to boot Linux with a rootfs, since my board has no SD card and I have to boot the kernel over Ethernet.

How big are the kernel binary and rootfs?
(For buildroot, I generally boot using JTAG; when it is boosted, it goes at about 800 Kbits/s, so if the images aren't tooooo big, that is OK, but it is probably too slow for your images.)

> My suggestion is adding a batch of unit tests like defermelowie/riscv-hext-asm-tests (which is what I use to test the implementation).

Why not :)

> (By the way, I think the spike for VexiiRiscv is too old to support the H extension. XD)

Did you try merging upstream?
I made a few changes in the fork, hopefully not toooo many.

> Since lots of unit tests are required, I think a small framework and some common code are needed to invoke the unit tests automatically

So far, in vexii, RegressionSingle does that, but it isn't a very nice code base XD

> And some good news: the kvm unit tests have passed.

Nice ^^

> we now have Linux running in a KVM virtual machine

So, to be sure I understand:

* You have an FPGA board running VexiiRiscv + Linux, in which you run a KVM virtual machine running Linux (?!?)

> I have added a simple doc for this implementation at SpinalHDL/VexiiRiscv-RTD#14

Thanks :D

@inochisa
Contributor Author

inochisa commented Apr 2, 2026

Hi @Dolu1990,

> Hi @inochisa,
>
> > Linux could be a problem, not for a synthesized core, but for the simulation. In fact, I managed to boot a mainline OpenSBI in simulation, and it gave me a 1.1 GB trace log and a 5.1 GB FST wave. XD
>
> Right ^^ The solution to that is to additionally use the --dual-sim option. This will start 2 identical simulations without wave, with a delay of about 1 Mcycle between them. When the simulation that is ahead crashes, this will enable the wave on the second simulation for that last 1 Mcycle. That way you have the wave where it is interesting. In other words, the first simulation is used as a trigger for the second simulation to start capturing the wave.
>
> Otherwise, in general, do not use --with-wave at all (nor other similar options).

That's good news. And following the test on a synthesized core: if there is already a way to boot OpenSBI and Linux without the hypervisor, the same way can also boot Linux with the H extension, just set KVM to y. And I guess you already has a for it?

> > And at least for now, my team is struggling to boot Linux with a rootfs, since my board has no SD card and I have to boot the kernel over Ethernet.
>
> How big are the kernel binary and rootfs? (For buildroot, I generally boot using JTAG; when it is boosted, it goes at about 800 Kbits/s, so if the images aren't tooooo big, that is OK, but it is probably too slow for your images.)

Oh, we have booted on another board with an SD card. And we have a pretty big Image and rootfs, because everything is built from upstream (Linux is built from the defconfig and the initramfs is built from a full-featured buildroot rootfs, about 45 MiB uncompressed).
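
To put the two figures from this exchange side by side, here is a small Python estimate of how long a 45 MiB image would take over an 800 Kbits/s JTAG link. It assumes 1 Kbit = 1000 bits, ignores protocol overhead, and does not include the kernel Image, whose size is not given in this thread. The function name is illustrative.

```python
def transfer_seconds(size_mib: float, link_kbit_per_s: float) -> float:
    """Ideal transfer time for size_mib mebibytes over a link of link_kbit_per_s kilobits/s."""
    bits = size_mib * 1024 * 1024 * 8
    return bits / (link_kbit_per_s * 1000)


secs = transfer_seconds(45, 800)
print(f"{secs:.0f} s (about {secs / 60:.1f} min)")
```

Under these assumptions the rootfs alone takes close to 8 minutes per load.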

> > My suggestion is adding a batch of unit tests like defermelowie/riscv-hext-asm-tests (which is what I use to test the implementation).
>
> Why not :)

Good, I will try to add it. But it will be a long fight XD.

> > (By the way, I think the spike for VexiiRiscv is too old to support the H extension. XD)
>
> Did you try merging upstream? I made a few changes in the fork, hopefully not toooo many.

The last time I tried to upgrade, it went badly: there were too many compile errors when upgrading spike to master. Spike has not had a new release for a long time, which makes it hard for me to figure out which changes break the API....

> > Since lots of unit tests are required, I think a small framework and some common code are needed to invoke the unit tests automatically
>
> So far, in vexii, RegressionSingle does that, but it isn't a very nice code base XD

I will take a look. Thanks.

> > And some good news: the kvm unit tests have passed.
>
> Nice ^^
>
> > we now have Linux running in a KVM virtual machine
>
> So, to be sure I understand:
>
> * You have an FPGA board running VexiiRiscv + Linux, in which you run a KVM virtual machine running Linux (?!?)

Right. And all the software (Linux, OpenSBI, rootfs) uses upstream versions without any changes.

> > I have added a simple doc for this implementation at SpinalHDL/VexiiRiscv-RTD#14
>
> Thanks :D

@Dolu1990
Member

Dolu1990 commented Apr 2, 2026

> It can also boot Linux with the H extension, just set KVM to y.

Ahhh nice ^^
So far, I was using https://github.com/SpinalHDL/NaxSoftware/tree/ae0c08016d0581b9ac79a4feb166e7a532369bd4/buildroot
to build the images, but it is quite outdated.

> And I guess you already has a for it?

has a (?) for it ? (I don't understand :D)

> The last time I tried to upgrade, it went badly: there were too many compile errors when upgrading spike to master. Spike has not had a new release for a long time, which makes it hard for me to figure out which changes break the API....

If needed, let me know; there may be some changes I made that can be reverted, in particular the ones related to the MMU.

> Right. And all the software (Linux, OpenSBI, rootfs) uses upstream versions without any changes.

GG :D :D

@inochisa
Contributor Author

inochisa commented Apr 2, 2026

It can also boot the linux with H extension, just set KVM is y.

Ahhh nice ^^ So far, i was using https://github.com/SpinalHDL/NaxSoftware/tree/ae0c08016d0581b9ac79a4feb166e7a532369bd4/buildroot to build the images, but it is quite outdated.

I just use the upstream buildroot. And it seems like no change is needed XD.

And I guess you already has a for it?

has a (?) for it ? (i don't understand :D)

has a working Linux with the way of simulation you have mentioned (without hypervisor enabled).

The last time I tried to upgrade, it went badly: too many compile errors when upgrading Spike to master. Spike has not had a new release for a long time, which makes it hard for me to figure out which change broke the API....

In any case, let me know; there may be some changes i made which can be reverted, in particular the ones related to the MMU.

Great, I will take a look at it.

Right. And all the software (Linux, OpenSBI, rootfs) uses upstream versions without any changes.

GG :D :D

inochisa and others added 19 commits May 5, 2026 18:33
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Longbin Li <looong.bin@gmail.com>
@inochisa
Contributor Author

inochisa commented May 6, 2026

A small update for hypervisor Zicbom support.

@inochisa
Contributor Author

inochisa commented May 6, 2026

@Dolu1990 Bad news: I failed to port the latest riscv-arch-test to VexiiRiscv. The test now requires a 16550 UART. XD
It just needs the IO commands ported; I am working on it.
Bad news again: there is no hypervisor support in sail riscv, but I have successfully ported to the latest riscv-arch-test.

And I guess it could be better to use a real CLINT or PLIC for the TestBench?

inochisa added 3 commits May 6, 2026 16:27
…lugin

Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
inochisa added 2 commits May 8, 2026 16:48
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
@Dolu1990
Member

Dolu1990 commented May 9, 2026

Hi,

And I guess it could be better to use a real CLINT or PLIC for the TestBench?

Ahhh the idea was to keep the testbench as light as possible and to emulate everything around.
Maybe your proposal could be done by implementing a new plugin which would intercept the IO bus and create the CLINT/PLIC directly inside the testbench?

Or else, putting in place a little simulation SoC.

@inochisa
Contributor Author

inochisa commented May 9, 2026

Hi,

And I guess it could be better to use a real CLINT or PLIC for the TestBench?

Ahhh the idea was to keep the testbench as light as possible and to emulate everything around.

I agree. This is necessary to make the simulation fast.

Maybe your proposal could be done by implementing a new plugin which would intercept the IO bus and create the CLINT/PLIC directly inside the testbench?

Or else, putting in place a little simulation SoC.

Yeah, at least for now I think nothing is needed. I misunderstood the requirements of the "riscv-arch-test"; in fact it does not require a real PLIC/CLINT. Anyway, I am kind of upset that it has no hypervisor support. (I was told at the last RISC-V sig-hypervisor meeting that someone is working on this, but I am sure it will take a long time to finish XD)

4 participants