Conversation
📝 Walkthrough

This PR fundamentally restructures the VM and vCPU abstraction layers. It removes execution responsibilities from the generic …

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant CLI as CLI
    participant Vmm
    participant Virt as Virt (Hypervisor)
    participant Vm as Vm Instance
    participant VcpuMgr as VcpuManager
    participant Vcpu
    participant Handler as DeviceVmExitHandler

    CLI->>Vmm: new(Box::new(AppleHypervisor))
    activate Vmm
    CLI->>Vmm: create_vm_from_config(VmConfig)
    activate Vmm
    Vmm->>Virt: create_vm()
    activate Virt
    Virt->>Vm: new() + hv_vm_create
    activate Vm
    Vm-->>Virt: Arc<dyn Vm>
    deactivate Vm
    Virt-->>Vmm: Arc<dyn Vm>
    deactivate Virt
    Vmm->>Vmm: allocate & map RAM<br/>create IRQ chip<br/>init devices
    Vmm->>VcpuMgr: new(vm_instance)
    activate VcpuMgr
    VcpuMgr-->>Vmm: VcpuManager
    deactivate VcpuMgr
    loop for each vCPU
        Vmm->>VcpuMgr: create_vcpu(id, handler, psci)
        activate VcpuMgr
        VcpuMgr->>Vm: create_vcpu(id, handler, psci)
        activate Vm
        Vm->>Vcpu: new(id, handler, psci)
        activate Vcpu
        Vcpu-->>Vm: HvpVcpu (no hv_vcpu_create yet)
        deactivate Vcpu
        Vm-->>VcpuMgr: Arc<Mutex<dyn Vcpu>>
        deactivate Vm
        VcpuMgr-->>Vmm: ()
        deactivate VcpuMgr
    end
    Vmm-->>CLI: create_vm_from_config() done
    deactivate Vmm
    CLI->>Vmm: run(bootloader)
    activate Vmm
    Vmm->>Vmm: bootloader.load(...)
    Vmm->>VcpuMgr: start_vcpu(0, start_pc, x0)
    activate VcpuMgr
    Note over VcpuMgr: Spawn thread per vCPU
    VcpuMgr->>Vcpu: post_init_within_thread()<br/>(calls hv_vcpu_create)
    activate Vcpu
    Vcpu-->>VcpuMgr: ()
    deactivate Vcpu
    loop VM execution
        VcpuMgr->>Vcpu: run()
        activate Vcpu
        Vcpu->>Handler: vm_exit_handler()
        Handler-->>Vcpu: Arc<dyn DeviceVmExitHandler>
        Vcpu-->>VcpuMgr: VmExitReason
        deactivate Vcpu
        VcpuMgr->>VcpuMgr: handle_vm_exit result
    end
    VcpuMgr-->>Vmm: ()
    deactivate VcpuMgr
    Vmm-->>CLI: run() done
    deactivate Vmm
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 17
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm-core/src/virt/kvm/vcpu/arch/x86_64.rs (1)
33-37: ⚠️ Potential issue | 🟡 Minor — Guest debugging enabled unconditionally may impact performance.

`KVM_GUESTDBG_SINGLESTEP` is enabled by default in `init_arch_vcpu`. This causes the guest to exit after every instruction, which significantly degrades performance. Consider making this configurable or removing it for production use.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm-core/src/virt/kvm/vcpu/arch/x86_64.rs` around lines 33 - 37, init_arch_vcpu currently enables guest debugging with KVM_GUESTDBG_SINGLESTEP unconditionally by calling set_guest_debug with kvm_guest_debug, which forces single‑step exits and hurts performance; update init_arch_vcpu to make single‑step optional (e.g., add a config flag or feature) and only include KVM_GUESTDBG_SINGLESTEP in the control field when that flag is set (leave KVM_GUESTDBG_ENABLE as needed), or remove the single‑step bit for production paths so set_guest_debug / kvm_guest_debug are invoked without KVM_GUESTDBG_SINGLESTEP unless explicitly requested.
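The flag-gated control word the prompt describes can be sketched as follows. This is a minimal illustration, not the project's code: the constant values mirror Linux's `kvm.h`, and `guest_debug_control` plus the `single_step` flag are hypothetical names; real code would place the result in `kvm_bindings::kvm_guest_debug.control` and pass it to `vcpu_fd.set_guest_debug`.

```rust
// Values from Linux's kvm.h; real code would import them from kvm_bindings.
const KVM_GUESTDBG_ENABLE: u32 = 0x0000_0001;
const KVM_GUESTDBG_SINGLESTEP: u32 = 0x0000_0002;

/// Build the guest-debug control word, including single-step only on request.
fn guest_debug_control(single_step: bool) -> u32 {
    let mut control = KVM_GUESTDBG_ENABLE;
    if single_step {
        // Only pay the every-instruction-exit cost when explicitly asked for.
        control |= KVM_GUESTDBG_SINGLESTEP;
    }
    control
}

fn main() {
    assert_eq!(guest_debug_control(false), 0x1);
    assert_eq!(guest_debug_control(true), 0x3);
}
```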
🧹 Nitpick comments (8)
crates/vm-core/src/lib.rs (1)
1-1: Re-enable a warnings gate to avoid silent regressions.

Commenting out `#![deny(warnings)]` weakens CI/compiler feedback and can let quality issues accumulate. Prefer keeping it enabled and adding targeted `#[allow(...)]` where intentionally needed.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/vm-core/src/lib.rs` at line 1, Uncomment the crate-level attribute that enforces warnings as errors by restoring #![deny(warnings)] at the top of the crate (undo the commented-out line "// #![deny(warnings)]") so the compiler treats warnings as failures; if specific warnings are noisy, add targeted #[allow(...)] annotations on the offending items rather than disabling the deny for the whole crate.

crates/vm-machine/src/lib.rs (1)
1-1: Please keep a warnings gate at crate level.

Disabling `#![deny(warnings)]` reduces compile-time enforcement and can hide regressions; prefer explicit targeted allows instead.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/vm-machine/src/lib.rs` at line 1, The crate-level warnings gate was removed (the commented-out #![deny(warnings)]), re-enable a crate-level deny by restoring the attribute at the top of lib.rs (uncomment or add #![deny(warnings)]) and, if specific warnings are noisy, replace with targeted allows (e.g., #![allow(...)] for specific lint names) rather than disabling the whole deny; look for the crate root attribute location in lib.rs to apply this change.

crates/vm-machine/src/vmm.rs (1)
103-103: Use `expect()` instead of `unwrap()` for better diagnostics.

♻️ Proposed fix

```diff
- vcpu_manager.lock().unwrap().create_vcpu(
+ vcpu_manager.lock().expect("vcpu_manager mutex poisoned").create_vcpu(
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/vm-machine/src/vmm.rs` at line 103, The call vcpu_manager.lock().unwrap().create_vcpu should use expect() instead of unwrap() to provide a clear panic message; replace the unwrap() on the MutexGuard acquisition (vcpu_manager.lock().unwrap()) with vcpu_manager.lock().expect("failed to lock vcpu_manager mutex when creating vCPU") so that the failure context is included when calling create_vcpu; ensure the expect message references vcpu_manager and create_vcpu to aid diagnostics.

crates/vm-core/src/virt/hvp/vcpu.rs (1)
187-196: Consider documenting the thread-affinity requirement.

The method name `post_init_within_thread` suggests it must be called from the vCPU's dedicated thread, but this contract isn't enforced. Add documentation clarifying this requirement.

📝 Proposed documentation

```diff
+    /// Initializes the vCPU within its dedicated thread.
+    ///
+    /// # Safety Contract
+    /// This method MUST be called from the thread that will execute this vCPU.
+    /// All subsequent calls to `run()` and register accessors MUST occur from
+    /// the same thread. Violating this invariant causes undefined behavior.
     fn post_init_within_thread(&mut self) -> Result<(), VcpuError> {
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/vm-core/src/virt/hvp/vcpu.rs` around lines 187 - 196, Add a doc comment to post_init_within_thread in vcpu.rs stating it must be called from the vCPU's dedicated thread (thread-affinity requirement), explaining that hv_vcpu_create expects to run on that thread and that calling it from any other thread is undefined/unsafe and may cause errors; reference the function name post_init_within_thread and the fields it sets (self.handler and self.exit) so readers know the contract and side effects.

crates/vm-core/src/virt/kvm/vcpu/arch/aarch64.rs (1)
31-32: Replace `assert_eq!` with proper error handling.

Using `assert_eq!` for length validation will panic in release builds if the KVM ioctl returns an unexpected length. Consider returning an error instead.

♻️ Proposed fix

```diff
 fn get_core_reg(&mut self, reg: CoreRegister) -> Result<u64, VcpuError> {
     let mut bytes = [0; 8];
     let len = self.vcpu_fd.get_one_reg(reg.to_kvm_reg(), &mut bytes)?;
-    assert_eq!(len, 8);
+    if len != 8 {
+        return Err(VcpuError::InvalidRegisterSize { expected: 8, actual: len });
+    }
     let value = u64::from_le_bytes(bytes);
     Ok(value)
 }
```

Also applies to: 41-42

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/vm-core/src/virt/kvm/vcpu/arch/aarch64.rs` around lines 31 - 32, Replace the panic caused by assert_eq!(len, 8) with proper error handling: after calling self.vcpu_fd.get_one_reg(reg.to_kvm_reg(), &mut bytes) check if len != 8 and return a descriptive error (propagate a suitable crate error or io::Error with context like "unexpected register size from KVM for reg {reg}") instead of asserting; do the same for the other occurrence (the second get_one_reg check around the later len variable) so both places return an Err with context rather than panicking.

crates/vm-core/src/arch/aarch64/vcpu.rs (2)
28-32: Use named constants for PState bit manipulation.

Magic constants `0x03C0`, `0xf`, and `0x0005` are hard to understand. Consider defining named constants for DAIF mask and EL1h mode.

♻️ Proposed refactor

```diff
+const PSTATE_DAIF_MASK: u64 = 0x03C0; // D, A, I, F bits
+const PSTATE_MODE_MASK: u64 = 0x0F;   // Mode bits [3:0]
+const PSTATE_EL1H: u64 = 0x05;        // EL1 with SP_EL1
+
 pub fn setup_cpu(...) -> Result<(), VcpuError> {
     // ...
     {
         let mut pstate = vcpu.get_core_reg(CoreRegister::PState)?;
-        pstate |= 0x03C0; // DAIF
-        pstate &= !0xf;   // Clear low 4 bits
-        pstate |= 0x0005; // El1h
+        pstate |= PSTATE_DAIF_MASK;
+        pstate &= !PSTATE_MODE_MASK;
+        pstate |= PSTATE_EL1H;
         vcpu.set_core_reg(CoreRegister::PState, pstate)?;
     }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/vm-core/src/arch/aarch64/vcpu.rs` around lines 28 - 32, Replace the magic numeric masks used when adjusting the CPU PState with named constants to improve readability: introduce constants like DAIF_MASK (0x03C0), LOW4_MASK (0xF) or LOW4_CLEAR (~0xF), and EL1H_MODE (0x0005) and then use them in the sequence that reads and updates pstate (the variable pstate, CoreRegister::PState, and the calls vcpu.get_core_reg / vcpu.set_core_reg) so the code becomes pstate |= DAIF_MASK; pstate &= !LOW4_MASK; pstate |= EL1H_MODE; keeping the same logic but replacing the literals with descriptive constant names.
35-37: Dead code blocks with `if false { todo!() }` should be removed or converted to comments.

Multiple `if false { ... }` blocks contain `todo!()` placeholders that will never execute. This is confusing dead code. Consider either:

- Removing them entirely
- Converting to `// TODO:` comments
- Using feature flags if the functionality is planned

♻️ Example refactor for one block

```diff
-    // more, non secure el1
-    if false {
-        todo!()
-    }
+    // TODO: Configure non-secure EL1 if needed
```

Also applies to: 52-56, 58-63, 75-77

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/vm-core/src/arch/aarch64/vcpu.rs` around lines 35 - 37, Remove the unreachable "if false { todo!() }" blocks in vcpu.rs (or convert them into clear comments or gated feature-flagged code); specifically locate each occurrence (the blocks at the shown locations) and either delete the dead branch or replace it with a `// TODO: ...` comment or a cfg/gate for planned functionality so no `todo!()` is left inside an always-false conditional; ensure any intended future work is documented or controlled by feature flags and that no unreachable `todo!()` calls remain in functions/impls within this module.

crates/vm-core/src/vcpu/vcpu_manager.rs (1)
53-53: Use `expect()` or handle poisoned mutex gracefully.

If another vCPU thread panics while holding a mutex (e.g., due to an unhandled error), this `unwrap()` will panic and crash the entire VM. Consider using `expect()` with a descriptive message or handling the `PoisonError` explicitly.

♻️ Proposed fix

```diff
- let mut vcpu = vcpu.lock().unwrap();
+ let mut vcpu = vcpu.lock().expect("vCPU mutex poisoned - another vCPU thread panicked");
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm-core/src/vcpu/vcpu_manager.rs` at line 53, The code currently does vcpu.lock().unwrap() which will panic the whole VM if the mutex is poisoned; replace the unwrap with either vcpu.lock().expect("failed to lock vcpu mutex: mutex poisoned") or explicitly handle PoisonError by matching vcpu.lock() -> Ok(mut vcpu) => { ... } and Err(poison) => { let mut vcpu = poison.into_inner(); /* log warning about poisoned mutex */ ... } so the vCPU mutex is recovered gracefully and a descriptive message is logged; update the occurrence of vcpu.lock().unwrap() in vcpu_manager.rs accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm-cli/src/main.rs`:
- Around line 3-9: The code calls the bail! macro but never brings it into
scope; either add an import for the macro (e.g., use anyhow::bail;) near the
other imports or qualify calls as anyhow::bail!(...) where used (e.g., in the
function that currently invokes bail!). Update the top-level imports in main.rs
(alongside anyhow::Result) or change the bail! invocation to anyhow::bail! to
resolve the compilation error.
- Around line 55-58: The match arm for Accel::Kvm currently contains a todo!()
which will panic at runtime; replace that reachable panic with a proper,
non-panicking fallback: either return a structured error (e.g., propagate an Err
with a clear message like "KVM backend not implemented" from the function that
matches on Accel::Kvm) or map the variant to a safe fallback backend if
supported; locate the Accel::Kvm match arm in main.rs and change the todo!() to
return/propagate the error (using the function's Result/error type) or
remove/exclude the KVM option until implementation is added.
In `@crates/vm-core/src/arch/aarch64/firmware/psci/psci_0_2.rs`:
- Around line 53-58: The CPU_ON64 handler currently casts the guest MPIDR from
get_smc_arg1() directly to a vCPU index and calls
self.vcpu_manager.lock().unwrap().start_vcpu(...).unwrap(), which mis-maps MPIDs
in non-trivial topologies and panics on errors; change it to (1) resolve the
guest MPIDR to the correct zero-based vCPU index via your VM/CPUID mapping
routine (e.g., call the existing MPIDR→vcpu lookup helper or add one) instead of
casting target_cpu directly, (2) avoid unwraps: handle the lock() Result and the
start_vcpu Result and translate failures into the appropriate PSCI status codes
(e.g., INVALID_PARAMETERS, ALREADY_ON, DENIED, INTERNAL_FAILURE) returned by the
PSCI handler, and (3) ensure the call site uses the resolved vcpu index when
invoking VcpuManager::start_vcpu rather than target_cpu as usize so
non-sequential MPIDR topologies work correctly.
In `@crates/vm-core/src/arch/aarch64/vcpu.rs`:
- Line 17: The MPIDR_EL1 value is being set by directly casting cpu_id to u64
which violates the ARM bitfield layout; update the code around the
vcpu.set_sys_reg(SysRegister::MpidrEl1, ...) call to build MPIDR_EL1 with the
RES1 bit (bit 31) set and proper affinity fields (Aff3–Aff0 in bits
[39:32],[23:16],[15:8],[7:0]) while keeping reserved bits zero; for
single-cluster use (set bit 31) and place cpu_id into Aff0, or for multi-cluster
split cpu_id into Aff0–Aff3 and OR them into the proper bit positions before
passing the constructed value to vcpu.set_sys_reg(SysRegister::MpidrEl1, ...).
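The bit layout described above can be sketched as follows. This is a minimal single-cluster illustration, not the project's code: the helper name `mpidr_el1` is hypothetical, and the result would be what gets passed to `vcpu.set_sys_reg(SysRegister::MpidrEl1, ...)`.

```rust
// MPIDR_EL1 layout (ARMv8): Aff0 [7:0], Aff1 [15:8], Aff2 [23:16],
// RES1 at bit 31, Aff3 [39:32]; all other bits reserved-zero.
const MPIDR_RES1: u64 = 1 << 31;

/// Build MPIDR_EL1 for a single-cluster topology: cpu_id goes into Aff0,
/// Aff1..Aff3 stay zero. A multi-cluster scheme would instead split cpu_id
/// across the affinity fields.
fn mpidr_el1(cpu_id: u64) -> u64 {
    assert!(cpu_id <= 0xff, "Aff0 only holds 8 bits");
    MPIDR_RES1 | (cpu_id & 0xff)
}

fn main() {
    assert_eq!(mpidr_el1(0), 0x8000_0000); // RES1 set even for vCPU 0
    assert_eq!(mpidr_el1(3), 0x8000_0003); // cpu_id lands in Aff0
}
```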
In `@crates/vm-core/src/arch/aarch64/vcpu/reg/esr_el2.rs`:
- Around line 38-42: The ec() method eagerly allocates the error string because
ok_or(...) calls format! on every invocation; change it to cache the raw value
(call self.ec_raw() once into a local like raw) and use ok_or_else(...) so the
GuestError string is only constructed on the Err path (e.g.,
Ec::from_repr(raw).ok_or_else(|| VcpuError::GuestError(format!("unknown ec:
0x{:x}", raw)))).
In `@crates/vm-core/src/vcpu/vcpu_manager.rs`:
- Around line 45-50: The start_vcpu function currently indexes the vcpus Vec by
vcpu_id which assumes contiguous ordering; change the lookup to be id-based:
either replace the vcpus Vec with a HashMap<usize, Arc<Mutex<dyn Vcpu>>> and
lookup by key, or if keeping Vec, iterate/search for a vCPU whose own id matches
(e.g., call a getter like vcpu.id() or compare a stored id field) and return
VcpuError::VcpuNotCreated(vcpu_id) when not found; update all codepaths that
construct or access vcpus (creation code and other getters) to use the new map
or the validated lookup to ensure correct vCPU selection in start_vcpu.
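The map-based lookup suggested above can be sketched like this. The types are hypothetical stand-ins for the real `Vcpu` trait object and error enum; the point is keying by vCPU id rather than trusting `Vec` position.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

#[derive(Debug, PartialEq)]
enum VcpuError {
    VcpuNotCreated(usize),
}

#[derive(Debug)]
struct Vcpu {
    id: usize,
}

struct VcpuManager {
    // Keyed by vCPU id, so non-contiguous ids (sparse MPIDR topologies)
    // still resolve correctly, unlike indexing a Vec by position.
    vcpus: HashMap<usize, Arc<Mutex<Vcpu>>>,
}

impl VcpuManager {
    fn get_vcpu(&self, vcpu_id: usize) -> Result<Arc<Mutex<Vcpu>>, VcpuError> {
        self.vcpus
            .get(&vcpu_id)
            .cloned()
            .ok_or(VcpuError::VcpuNotCreated(vcpu_id))
    }
}

fn main() {
    let mut vcpus = HashMap::new();
    vcpus.insert(0, Arc::new(Mutex::new(Vcpu { id: 0 })));
    vcpus.insert(4, Arc::new(Mutex::new(Vcpu { id: 4 }))); // non-contiguous id
    let mgr = VcpuManager { vcpus };
    assert_eq!(mgr.get_vcpu(4).unwrap().lock().unwrap().id, 4);
    assert_eq!(mgr.get_vcpu(1).unwrap_err(), VcpuError::VcpuNotCreated(1));
}
```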
- Around line 52-80: The vCPU thread spawned in the closure that calls
post_init_within_thread(), setup_cpu(), vcpu.run(), and handle_vm_exit() returns
Result<(), VcpuError> but the JoinHandle is immediately dropped so errors are
lost; modify VcpuManager to capture and store the JoinHandle (or send
Result<VcpuError> back via a channel) so the main thread can observe
failures—locate the thread::spawn(...) block around the closure, return or push
the JoinHandle into the manager's handle collection (or create an error channel
and send Err(e) before terminating the thread) and ensure callers can join/check
the handle or receive the propagated error.
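The channel-based error propagation suggested above can be sketched as follows. `VcpuError` and `spawn_vcpu` are hypothetical stand-ins; the essential pattern is keeping the `JoinHandle` and sending the loop's `Result` back before the thread exits.

```rust
use std::sync::mpsc;
use std::thread;

#[derive(Debug, PartialEq)]
enum VcpuError {
    GuestError(String),
}

/// Spawn a vCPU loop, returning both the JoinHandle (so it can be joined)
/// and a receiver over which the thread reports its final Result.
fn spawn_vcpu(fail: bool) -> (thread::JoinHandle<()>, mpsc::Receiver<Result<(), VcpuError>>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        // Stand-in for post_init_within_thread() + the run()/handle_vm_exit loop.
        let result = if fail {
            Err(VcpuError::GuestError("run failed".into()))
        } else {
            Ok(())
        };
        let _ = tx.send(result); // report the outcome instead of dropping it
    });
    (handle, rx)
}

fn main() {
    let (handle, rx) = spawn_vcpu(true);
    let outcome = rx.recv().unwrap();
    handle.join().unwrap(); // handle is stored, so the main thread can join
    assert_eq!(outcome, Err(VcpuError::GuestError("run failed".into())));
}
```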
In `@crates/vm-core/src/virt/hvp.rs`:
- Around line 195-201: The vm_config handle from hv_vm_config_create is never
released on error; wrap it in an RAII guard or ensure os_release(vm_config) is
called on every error path inside create_vm. Specifically, create a small guard
type (e.g., HvVmConfigGuard) that stores the raw vm_config, calls os_release in
Drop, and exposes into_raw() or disarm() so you can transfer ownership before
calling hv_vm_create; then replace the direct
hv_vm_config_create/hv_vm_config_set_el2_enabled/hv_vm_create sequence in
create_vm to use the guard and disarm it only after hv_vm_create succeeds,
ensuring hv_vm_config_set_el2_enabled failures or hv_vm_create failures trigger
automatic os_release via the guard.
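The guard shape described above can be sketched like this. It is a toy model: the `usize` handle and the release counter stand in for the raw `hv_vm_config_t` and the real `os_release` call, which only exist against Hypervisor.framework.

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};

struct HvVmConfigGuard {
    handle: Option<usize>,      // stand-in for the raw hv_vm_config_t
    releases: Arc<AtomicUsize>, // counts "os_release" calls for the demo
}

impl HvVmConfigGuard {
    /// Transfer ownership of the handle (e.g., after hv_vm_create succeeds),
    /// so Drop skips the release.
    fn disarm(mut self) -> usize {
        self.handle.take().unwrap()
    }
}

impl Drop for HvVmConfigGuard {
    fn drop(&mut self) {
        if let Some(_h) = self.handle.take() {
            // Real code would call os_release(_h) here; this runs on every
            // early-return/error path automatically.
            self.releases.fetch_add(1, Ordering::SeqCst);
        }
    }
}

fn main() {
    let releases = Arc::new(AtomicUsize::new(0));
    {
        // Error path: guard dropped without disarm -> release fires.
        let _guard = HvVmConfigGuard { handle: Some(0xA), releases: Arc::clone(&releases) };
    }
    assert_eq!(releases.load(Ordering::SeqCst), 1);

    // Success path: ownership transferred, no release in Drop.
    let guard = HvVmConfigGuard { handle: Some(0xB), releases: Arc::clone(&releases) };
    assert_eq!(guard.disarm(), 0xB);
    assert_eq!(releases.load(Ordering::SeqCst), 1); // no double release
}
```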
In `@crates/vm-core/src/virt/hvp/vcpu.rs`:
- Around line 124-131: The current try_get_exit_info function unsafely
dereferences self.exit without verifying thread-affinity; add a thread-affinity
check (e.g., call or implement an is_on_vcpu_thread/ensure_on_vcpu_thread helper
that compares the current thread id to the vCPU's recorded thread id) at the top
of try_get_exit_info and return a suitable VcpuError (e.g.,
VcpuError::WrongThread or a new variant) if the check fails, and only then
perform the unsafe dereference of
self.exit.as_ref().ok_or(VcpuError::VcpuNotCreated(self.vcpu_id))? so that the
pointer is only accessed on the vCPU thread.
- Line 100: HvpVcpu is currently marked unsafe impl Send which is unsound
because its exit raw pointer (the exit field returned by hv_vcpu_create) is
thread-local and must only be used on the creating thread; remove the unsafe
impl Send for HvpVcpu (or replace it with !Send) and ensure the type system
prevents cross-thread use, and/or add runtime thread-affinity checks in HvpVcpu
(store the creating thread id in HvpVcpu and assert in try_get_exit_info() and
run()) so any attempt to move or call run() from a different thread (or via
VcpuManager mutex acquired on another thread) is rejected with a clear error.
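The runtime thread-affinity check both comments above ask for can be sketched as follows. `VcpuError::WrongThread` and `ensure_on_vcpu_thread` are hypothetical names; real code would call the check at the top of `try_get_exit_info()` and `run()` before touching the thread-local exit pointer.

```rust
use std::thread::{self, ThreadId};

#[derive(Debug, PartialEq)]
enum VcpuError {
    WrongThread,
}

struct HvpVcpu {
    owner: ThreadId, // the thread that called hv_vcpu_create
}

impl HvpVcpu {
    fn new() -> Self {
        Self { owner: thread::current().id() }
    }

    /// Reject access from any thread other than the creating one, turning
    /// would-be undefined behavior into a clear error.
    fn ensure_on_vcpu_thread(&self) -> Result<(), VcpuError> {
        if thread::current().id() == self.owner {
            Ok(())
        } else {
            Err(VcpuError::WrongThread)
        }
    }
}

fn main() {
    let vcpu = HvpVcpu::new();
    assert_eq!(vcpu.ensure_on_vcpu_thread(), Ok(()));
    // Cross-thread access is detected and refused.
    thread::scope(|s| {
        s.spawn(|| assert_eq!(vcpu.ensure_on_vcpu_thread(), Err(VcpuError::WrongThread)));
    });
}
```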
In `@crates/vm-core/src/virt/kvm.rs`:
- Around line 28-29: KvmVirt::create_vm currently panics via todo!(), breaking
any caller that selects the KVM backend; replace the unconditional panic by
either constructing and returning the proper Vm implementation (wire the new Vm
object through the KvmVirt factory) or, if not yet implemented, return a
concrete Error variant (e.g., an Error::NotImplemented or a backend-specific
error) instead of panicking so callers can handle it; update the create_vm
signature's Result path to return Ok(Arc<dyn Vm>) when wiring the real VM or
Err(Error::...) when returning the typed error, referencing KvmVirt::create_vm
and the project’s Error enum to choose the correct variant.
In `@crates/vm-core/src/virt/kvm/vcpu/arch/aarch64.rs`:
- Around line 25-27: The aarch64 KVM vCPU currently contains unimplemented
todo!()s; either fully implement these APIs or mark KVM/aarch64 unsupported. To
implement: add fields to KvmVcpu for psci_handler: Arc<dyn Psci> and
device_vm_exit_handler: Arc<dyn DeviceVmExitHandler>, initialize them in
init_arch_vcpu(), and implement get_psci_handler(), vm_exit_handler(),
get_sys_reg(), set_sys_reg(), post_init_within_thread(), and run() to use those
fields and the existing vcpu_fd/vcpu_id. Alternatively, if you prefer to block
usage until implemented, replace the todo!()s with clear runtime errors (or
Result-based returns) indicating KVM aarch64 is unsupported and update any
feature gating/docs accordingly so enabling the feature fails fast with a clear
message.
In `@crates/vm-core/src/virt/kvm/vcpu/arch/x86_64.rs`:
- Around line 75-86: The KVM x86_64 backend currently panics because KvmVcpu's
Vcpu trait methods vm_exit_handler, post_init_within_thread, and run are left as
todo!(); implement these methods on the KvmVcpu type mirroring the aarch64
backend behavior: return the appropriate Arc<dyn DeviceVmExitHandler> from
vm_exit_handler, perform per-thread initialization and return Result<(),
VcpuError> in post_init_within_thread, and execute the vCPU loop returning
Result<VmExitReason, VcpuError> in run; use the same error handling, state
setup, and device exit dispatch patterns found in the aarch64 implementation
(match the method names vm_exit_handler, post_init_within_thread, run and the
KvmVcpu struct) so runtime panics are eliminated.
In `@crates/vm-machine/src/error.rs`:
- Around line 6-10: Update the error display strings on the VmAlreadyExists and
VmNotExists variants so they are clear and grammatical: change "Vm already
exists" to "VM already exists" and "Vm not exists" to "VM does not exist". Edit
the #[error(...)] attributes for the VmAlreadyExists and VmNotExists enum
variants in error.rs to use the new messages and keep the variant names
unchanged.
In `@crates/vm-machine/src/vm.rs`:
- Around line 55-56: The Vm::boot implementation currently panics on non-aarch64
builds because of the #[cfg(not(target_arch = "aarch64"))] todo!() branch;
replace that panic with a proper error return so the shared VMM API is usable on
other architectures. Update the non-aarch64 branch in Vm::boot (the block
guarded by #[cfg(not(target_arch = "aarch64"))]) to return an appropriate Err
(e.g., an UnsupportedArchitecture or BootError variant) instead of calling
todo!(), and ensure the function signature and callers handle that Result error
path consistently.
In `@crates/vm-machine/src/vmm.rs`:
- Around line 102-109: The loop holds the vcpu_manager mutex while calling
vcpu_manager.lock().unwrap().create_vcpu(...), which can deadlock because the
PSCI CpuOn64 handler (in psci_0_2.rs) also locks vcpu_manager from a vCPU
thread; refactor so the mutex is not held while vCPU threads may start: change
VcpuManager::create_vcpu into two steps (e.g., prepare_vcpu that performs
internal state mutation while holding the lock, then release the lock and call
start_vcpu/spawn_vcpu which does thread creation without holding vcpu_manager),
and update this loop to call prepare_vcpu under the lock and then start_vcpu
after dropping the lock; reference vcpu_manager, create_vcpu, and the PSCI
CpuOn64 handler in psci_0_2.rs when making the change.
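The two-phase split suggested above can be sketched with lock scoping alone. `VcpuManager` and `prepare_then_start` are toy stand-ins; the essential point is that the guard is dropped before any thread that might re-lock the manager is spawned.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct VcpuManager {
    prepared: Vec<usize>,
}

fn prepare_then_start(mgr: &Arc<Mutex<VcpuManager>>, id: usize) -> thread::JoinHandle<usize> {
    {
        let mut guard = mgr.lock().unwrap();
        guard.prepared.push(id); // "prepare": state mutation under the lock
    } // guard dropped here, before any vCPU thread exists

    let mgr = Arc::clone(mgr);
    thread::spawn(move || {
        // "start": the vCPU thread (like the PSCI CPU_ON handler) can now
        // lock the manager without deadlocking against its creator.
        mgr.lock().unwrap().prepared.len() + id
    })
}

fn main() {
    let mgr = Arc::new(Mutex::new(VcpuManager { prepared: vec![] }));
    let handle = prepare_then_start(&mgr, 7);
    assert_eq!(handle.join().unwrap(), 8); // 1 prepared entry + id 7
}
```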
- Line 133: The empty busy-wait loop after boot() in vmm.rs should be replaced
with a blocking wait so the main thread doesn't spin at 100% CPU; remove loop {}
and instead park the thread or block on a synchronization primitive (e.g.,
thread::park(), Condvar wait, or recv() on a shutdown channel) and wire that to
the existing vCPU lifecycle/shutdown path (for example use the
VcpuManager::start_vcpu shutdown signal or expose a shutdown Receiver that the
main thread can block on), ensuring the main thread unblocks when a
shutdown/stop event occurs.
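The blocking wait described above can be sketched with a channel. `wait_for_shutdown` and the wiring are hypothetical, not the project's API; the point is that `recv()` parks the main thread instead of spinning in `loop {}`.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Block until a shutdown event arrives; recv() sleeps, burning no CPU.
fn wait_for_shutdown(rx: mpsc::Receiver<()>) -> &'static str {
    rx.recv().expect("shutdown sender dropped");
    "vm stopped"
}

fn main() {
    let (shutdown_tx, shutdown_rx) = mpsc::channel::<()>();
    // Stand-in for a vCPU thread signalling shutdown on VM exit.
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(10));
        let _ = shutdown_tx.send(());
    });
    println!("{}", wait_for_shutdown(shutdown_rx));
}
```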
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5dc57c40-70d4-4f2f-951e-7539787c831f
📒 Files selected for processing (27)
- crates/vm-cli/src/main.rs
- crates/vm-core/src/arch.rs
- crates/vm-core/src/arch/aarch64/firmware/psci.rs
- crates/vm-core/src/arch/aarch64/firmware/psci/psci_0_2.rs
- crates/vm-core/src/arch/aarch64/vcpu.rs
- crates/vm-core/src/arch/aarch64/vcpu/reg/esr_el2.rs
- crates/vm-core/src/arch/aarch64/vm_exit.rs
- crates/vm-core/src/arch/vcpu.rs
- crates/vm-core/src/arch/x86_64/vcpu.rs
- crates/vm-core/src/lib.rs
- crates/vm-core/src/vcpu.rs
- crates/vm-core/src/vcpu/error.rs
- crates/vm-core/src/vcpu/vcpu.rs
- crates/vm-core/src/vcpu/vcpu_manager.rs
- crates/vm-core/src/virt.rs
- crates/vm-core/src/virt/hvp.rs
- crates/vm-core/src/virt/hvp/vcpu.rs
- crates/vm-core/src/virt/kvm.rs
- crates/vm-core/src/virt/kvm/vcpu/arch/aarch64.rs
- crates/vm-core/src/virt/kvm/vcpu/arch/x86_64.rs
- crates/vm-machine/src/device.rs
- crates/vm-machine/src/error.rs
- crates/vm-machine/src/lib.rs
- crates/vm-machine/src/vm.rs
- crates/vm-machine/src/vm/config.rs
- crates/vm-machine/src/vm_builder.rs
- crates/vm-machine/src/vmm.rs
💤 Files with no reviewable changes (3)
- crates/vm-core/src/arch.rs
- crates/vm-core/src/arch/vcpu.rs
- crates/vm-machine/src/vm_builder.rs
```rust
use clap::Parser;
use tracing::debug;
use tracing_subscriber::EnvFilter;
use vm_bootloader::boot_loader::BootLoaderBuilder;
use vm_core::virt::Virt;
use vm_machine::vm_builder::VmBuilder;
use vm_machine::vm::config::VmConfig;
use vm_machine::vmm::Vmm;
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Verify that `bail!` is used without either `use anyhow::bail;`
# or a qualified `anyhow::bail!` call.
set -e
rg -n '\bbail!\s*\(' crates/vm-cli/src/main.rs
rg -n '^\s*use anyhow::bail;|anyhow::bail!\s*\(' crates/vm-cli/src/main.rs || true
```

Repository: junyu0312/rust-vm
Length of output: 115
🏁 Script executed:

```shell
cat -n crates/vm-cli/src/main.rs | head -80
```

Repository: junyu0312/rust-vm
Length of output: 2461
Qualify bail! or import it here.
Line 64 calls bail!, but this macro is not brought into scope. The file imports anyhow::Result (line 40) but does not import the bail! macro, causing a compilation error.
Minimal fix

```diff
- bail!("hvp only supports aarch64");
+ anyhow::bail!("hvp only supports aarch64");
```

Also applies to: 60-72
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm-cli/src/main.rs` around lines 3 - 9, The code calls the bail! macro
but never brings it into scope; either add an import for the macro (e.g., use
anyhow::bail;) near the other imports or qualify calls as anyhow::bail!(...)
where used (e.g., in the function that currently invokes bail!). Update the
top-level imports in main.rs (alongside anyhow::Result) or change the bail!
invocation to anyhow::bail! to resolve the compilation error.
```rust
#[cfg(feature = "kvm")]
Accel::Kvm => {
    #[cfg(target_arch = "aarch64")]
    {
        use vm_bootloader::boot_loader::arch::aarch64::AArch64BootLoader;
        use vm_core::arch::aarch64::AArch64;
        use vm_core::virt::kvm::KvmVirt;

        build_and_run_vm::<KvmVirt<AArch64>, AArch64BootLoader>(args)?;
    }

    #[cfg(target_arch = "x86_64")]
    {
        use vm_bootloader::boot_loader::arch::x86_64::X86_64BootLoader;
        use vm_core::arch::x86_64::X86_64;
        use vm_core::virt::kvm::KvmVirt;

        build_and_run_vm::<KvmVirt<X86_64>, X86_64BootLoader>(args)?;
    }
    todo!()
}
```
Don't leave `Accel::Kvm` as a reachable panic path.

Line 57 aborts the CLI whenever the user selects KVM. If this backend is intentionally deferred, return a structured error or hide the option until the implementation lands.

Safer fallback

```diff
 #[cfg(feature = "kvm")]
 Accel::Kvm => {
-    todo!()
+    anyhow::bail!("KVM support is not implemented in this refactor yet");
 }
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
#[cfg(feature = "kvm")]
Accel::Kvm => {
    anyhow::bail!("KVM support is not implemented in this refactor yet");
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm-cli/src/main.rs` around lines 55 - 58, The match arm for Accel::Kvm
currently contains a todo!() which will panic at runtime; replace that reachable
panic with a proper, non-panicking fallback: either return a structured error
(e.g., propagate an Err with a clear message like "KVM backend not implemented"
from the function that matches on Accel::Kvm) or map the variant to a safe
fallback backend if supported; locate the Accel::Kvm match arm in main.rs and
change the todo!() to return/propagate the error (using the function's
Result/error type) or remove/exclude the KVM option until implementation is
added.
```rust
println!("try lock vcpu_manager");
self.vcpu_manager
    .lock()
    .unwrap()
    .start_vcpu(target_cpu as usize, entry_point_address, context_id)
    .unwrap();
```
CPU_ON64 needs an MPIDR→vCPU mapping and real error handling.

The value from `get_smc_arg1()` is the guest MPIDR, but `VcpuManager::start_vcpu` indexes a zero-based vCPU vector. Casting it directly on Line 57 will mis-target non-trivial topologies, and the `unwrap()` turns invalid/already-on requests into a host crash instead of a PSCI status code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm-core/src/arch/aarch64/firmware/psci/psci_0_2.rs` around lines 53 -
58, The CPU_ON64 handler currently casts the guest MPIDR from get_smc_arg1()
directly to a vCPU index and calls
self.vcpu_manager.lock().unwrap().start_vcpu(...).unwrap(), which mis-maps MPIDs
in non-trivial topologies and panics on errors; change it to (1) resolve the
guest MPIDR to the correct zero-based vCPU index via your VM/CPUID mapping
routine (e.g., call the existing MPIDR→vcpu lookup helper or add one) instead of
casting target_cpu directly, (2) avoid unwraps: handle the lock() Result and the
start_vcpu Result and translate failures into the appropriate PSCI status codes
(e.g., INVALID_PARAMETERS, ALREADY_ON, DENIED, INTERNAL_FAILURE) returned by the
PSCI handler, and (3) ensure the call site uses the resolved vcpu index when
invoking VcpuManager::start_vcpu rather than target_cpu as usize so
non-sequential MPIDR topologies work correctly.
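The resolve-then-translate shape described above can be sketched as below. All names here (`VcpuManager`, `mpidr_to_index`, `cpu_on`) are illustrative stand-ins rather than this crate's actual API; the status codes are the standard PSCI v0.2 return values.

```rust
use std::collections::HashMap;

// PSCI v0.2 return codes (from the PSCI specification).
const PSCI_SUCCESS: i64 = 0;
const PSCI_INVALID_PARAMETERS: i64 = -2;
const PSCI_ALREADY_ON: i64 = -4;

struct VcpuManager {
    // Maps guest MPIDR affinity values to zero-based vCPU indices.
    mpidr_to_index: HashMap<u64, usize>,
    online: Vec<bool>,
}

impl VcpuManager {
    fn cpu_on(&mut self, target_mpidr: u64, _entry: u64, _ctx: u64) -> i64 {
        // Resolve the MPIDR via the topology map instead of casting it.
        let Some(&idx) = self.mpidr_to_index.get(&target_mpidr) else {
            return PSCI_INVALID_PARAMETERS;
        };
        if self.online[idx] {
            return PSCI_ALREADY_ON;
        }
        self.online[idx] = true; // real code would start the vCPU thread here
        PSCI_SUCCESS
    }
}

fn main() {
    let mut mgr = VcpuManager {
        mpidr_to_index: HashMap::from([(0x0, 0), (0x100, 1)]), // Aff1=1 → vCPU 1
        online: vec![true, false],
    };
    assert_eq!(mgr.cpu_on(0x100, 0x8000_0000, 0), PSCI_SUCCESS);
    assert_eq!(mgr.cpu_on(0x100, 0x8000_0000, 0), PSCI_ALREADY_ON);
    assert_eq!(mgr.cpu_on(0x42, 0, 0), PSCI_INVALID_PARAMETERS);
}
```

Note the handler returns a status code for every failure mode instead of unwrapping, so a misbehaving guest cannot crash the host.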
```rust
    cpu_id: usize,
    vcpu: &mut dyn AArch64Vcpu,
) -> Result<(), VcpuError> {
    vcpu.set_sys_reg(SysRegister::MpidrEl1, cpu_id as u64)?;
```
🧩 Analysis chain
🌐 Web query:
What is the correct format for MPIDR_EL1 register on ARM64 for multi-core VMs?
💡 Result:
The MPIDR_EL1 register on ARM64 has the same format whether in physical hardware or virtualized multi-core VMs. When EL2 (hypervisor) is enabled, EL1 reads of MPIDR_EL1 return the value from VMPIDR_EL2, which the hypervisor configures to present virtual affinity levels to the guest VM. The bitfield layout is identical to the physical MPIDR_EL1.

64-bit register layout (MSB to LSB):

- Bits [63:40]: RES0 (Reserved)
- Bits [39:32]: Aff3 (Affinity level 3)
- Bit [31]: RES1 (Reserved)
- Bit [30]: U (Uniprocessor indicator: 1 if uniprocessor system, 0 if multiprocessor)
- Bits [29:25]: RES0 (Reserved)
- Bit [24]: MT (Multi-Thread indicator: 1 if lowest affinity level uses multithreading, 0 if independent PEs)
- Bits [23:16]: Aff2 (Affinity level 2)
- Bits [15:8]: Aff1 (Affinity level 1)
- Bits [7:0]: Aff0 (Affinity level 0)

For multi-core VMs, the hypervisor sets unique {Aff3,Aff2,Aff1,Aff0} combinations across vCPUs (typically incrementing Aff0 for each vCPU, with higher Aff levels for clusters/nodes as needed). U=0 and MT=0 for standard multi-core setups. All Aff fields must be unique system-wide within the VM. Access: RO (read-only) from EL1/EL0 when trapped by hypervisor.
Citations:
- 1: https://developer.arm.com/documentation/ddi0601/2025-12/AArch64-Registers/MPIDR-EL1--Multiprocessor-Affinity-Register
- 2: https://developer.arm.com/documentation/ddi0601/2021-06/AArch64-Registers/MPIDR-EL1--Multiprocessor-Affinity-Register
- 3: https://developer.arm.com/documentation/ddi0487/maa/-Part-D-The-AArch64-System-Level-Architecture/-Chapter-D24-AArch64-System-Register-Descriptions/-D24-2-General-system-control-registers/-D24-2-138-MPIDR-EL1--Multiprocessor-Affinity-Register
- 4: https://df.lth.se/~getz/ARM/SysReg/AArch64-vmpidr_el2.html
- 5: http://df.lth.se/~getz/ARM/SysReg/AArch64-mpidr_el1.html
- 6: https://www.mavara-co.com
🏁 Script executed:

```bash
# First, let's examine the file and get context around line 17
cd crates/vm-core/src/arch/aarch64 && cat -n vcpu.rs | head -50
```

🏁 Script executed:

```bash
# Let's search for set_sys_reg implementation to understand how it handles values
rg "fn set_sys_reg" -A 10
```

🏁 Script executed:

```bash
# Search for MPIDR_EL1 usage across the codebase
rg "MPIDR|MpidrEl1" -i
```

🏁 Script executed:

```bash
# Look for any affinity or topology-related code
rg "affinity|Aff[0-3]|topology" -i
```

🏁 Script executed:

```bash
# Check if there's any helper to format MPIDR_EL1
rg "mpidr|MPIDR" -i -B 3 -A 3
```

🏁 Script executed:

```bash
# Look at the reg.rs file to see how registers are defined/formatted
head -100 crates/vm-core/src/arch/aarch64/vcpu/reg.rs
```
Construct MPIDR_EL1 with the proper ARM bitfield format.

Directly casting `cpu_id` to `u64` violates the ARM MPIDR_EL1 specification. The register requires a specific bitfield layout:

- Bit [31] must be set to 1 (RES1), which the current implementation fails to do
- Bits [39:32], [23:16], [15:8], [7:0] must encode affinity levels (Aff3–Aff0)
- Bits [63:40], [29:25] must be 0 (RES0)

The current code only works for `cpu_id < 256` by accident. Construct the value with proper affinity encoding, for example `(1u64 << 31) | (cpu_id as u64)` for single-cluster configurations, or with Aff3–Aff0 populated according to your topology for multi-cluster setups.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm-core/src/arch/aarch64/vcpu.rs` at line 17, The MPIDR_EL1 value is
being set by directly casting cpu_id to u64 which violates the ARM bitfield
layout; update the code around the vcpu.set_sys_reg(SysRegister::MpidrEl1, ...)
call to build MPIDR_EL1 with the RES1 bit (bit 31) set and proper affinity
fields (Aff3–Aff0 in bits [39:32],[23:16],[15:8],[7:0]) while keeping reserved
bits zero; for single-cluster use (set bit 31) and place cpu_id into Aff0, or
for multi-cluster split cpu_id into Aff0–Aff3 and OR them into the proper bit
positions before passing the constructed value to
vcpu.set_sys_reg(SysRegister::MpidrEl1, ...).
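As a concrete sketch of the single-cluster encoding described above (the helper name `mpidr_el1_for` is hypothetical, not part of this crate):

```rust
// Build an MPIDR_EL1 value for a flat single-cluster topology:
// bit 31 is RES1, Aff0 holds the vCPU number, all other bits stay zero.
fn mpidr_el1_for(cpu_id: u64) -> u64 {
    // Aff0 is only 8 bits wide; larger ids need Aff1..Aff3 split out.
    assert!(cpu_id < 256, "single-cluster sketch: cpu_id must fit in Aff0");
    (1u64 << 31) | cpu_id
}

fn main() {
    assert_eq!(mpidr_el1_for(0), 0x8000_0000);
    assert_eq!(mpidr_el1_for(3), 0x8000_0003);
}
```

A multi-cluster variant would additionally shift cluster numbers into Aff1/Aff2 (bits [15:8] and [23:16]) before OR-ing.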
```rust
pub fn ec(&self) -> Result<Ec, VcpuError> {
    Ec::from_repr(self.ec_raw()).ok_or(VcpuError::GuestError(format!(
        "unknown ec: 0x{:x}",
        self.ec_raw()
    )))
```
Avoid eager error allocation in the `ec()` hot path.

`ok_or(...)` evaluates `format!` on every call, including success cases. Use `ok_or_else(...)` and cache `ec_raw` once to avoid unnecessary allocations during frequent exit decoding.

💡 Proposed fix

```diff
 pub fn ec(&self) -> Result<Ec, VcpuError> {
-    Ec::from_repr(self.ec_raw()).ok_or(VcpuError::GuestError(format!(
-        "unknown ec: 0x{:x}",
-        self.ec_raw()
-    )))
+    let ec_raw = self.ec_raw();
+    Ec::from_repr(ec_raw).ok_or_else(|| {
+        VcpuError::GuestError(format!("unknown ec: 0x{ec_raw:x}"))
+    })
 }
```

📝 Committable suggestion
```rust
pub fn ec(&self) -> Result<Ec, VcpuError> {
    let ec_raw = self.ec_raw();
    Ec::from_repr(ec_raw).ok_or_else(|| {
        VcpuError::GuestError(format!("unknown ec: 0x{ec_raw:x}"))
    })
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm-core/src/arch/aarch64/vcpu/reg/esr_el2.rs` around lines 38 - 42,
The ec() method eagerly allocates the error string because ok_or(...) calls
format! on every invocation; change it to cache the raw value (call
self.ec_raw() once into a local like raw) and use ok_or_else(...) so the
GuestError string is only constructed on the Err path (e.g.,
Ec::from_repr(raw).ok_or_else(|| VcpuError::GuestError(format!("unknown ec:
0x{:x}", raw)))).
```rust
impl Vcpu for KvmVcpu {
    fn vm_exit_handler(&self) -> Arc<dyn DeviceVmExitHandler> {
        todo!()
    }

    fn post_init_within_thread(&mut self) -> Result<(), VcpuError> {
        todo!()
    }

    fn run(&mut self) -> Result<VmExitReason, VcpuError> {
        todo!()
        /*
        loop {
            let r = self.vcpu_fd.run();

            match r? {
                VcpuExit::IoOut(port, data) => pio.io_out(port, data)?,
                VcpuExit::IoIn(port, data) => pio.io_in(port, data)?,
                VcpuExit::MmioRead(_, _) => {
                    // Ignore
                }
                VcpuExit::MmioWrite(_, _) => {
                    // Ignore
                }
                VcpuExit::Unknown => todo!(),
                VcpuExit::Exception => todo!(),
                VcpuExit::Hypercall(_) => todo!(),
                VcpuExit::Debug(_) => {}
                VcpuExit::Hlt => {
                    // warn!("hlt");
                    // todo!()
                }
                VcpuExit::IrqWindowOpen => todo!(),
                VcpuExit::Shutdown => todo!(),
                VcpuExit::FailEntry(_, _) => todo!(),
                VcpuExit::Intr => todo!(),
                VcpuExit::SetTpr => todo!(),
                VcpuExit::TprAccess => todo!(),
                VcpuExit::S390Sieic => todo!(),
                VcpuExit::S390Reset => todo!(),
                VcpuExit::Dcr => todo!(),
                VcpuExit::Nmi => todo!(),
                VcpuExit::InternalError => {
                    let kvm_run = self.vcpu_fd.get_kvm_run();
                    unsafe {
                        error!(?kvm_run.__bindgen_anon_1.internal, "InternalError");
                    }
                    panic!();
                }
                VcpuExit::Osi => todo!(),
                VcpuExit::PaprHcall => todo!(),
                VcpuExit::S390Ucontrol => todo!(),
                VcpuExit::Watchdog => todo!(),
                VcpuExit::S390Tsch => todo!(),
                VcpuExit::Epr => todo!(),
                VcpuExit::SystemEvent(_, _) => todo!(),
                VcpuExit::S390Stsi => todo!(),
                VcpuExit::IoapicEoi(_) => todo!(),
                VcpuExit::Hyperv => todo!(),
                VcpuExit::X86Rdmsr(_) => todo!(),
                VcpuExit::X86Wrmsr(_) => todo!(),
                VcpuExit::MemoryFault { .. } => todo!(),
                VcpuExit::Unsupported(_) => todo!(),
            }
        }
        */
    }
```
KVM x86_64 backend is incomplete.

Like the aarch64 backend, this `Vcpu` trait implementation leaves all required methods as `todo!()` placeholders, so the backend will panic at runtime if used.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm-core/src/virt/kvm/vcpu/arch/x86_64.rs` around lines 75 - 86, The
KVM x86_64 backend currently panics because KvmVcpu's Vcpu trait methods
vm_exit_handler, post_init_within_thread, and run are left as todo!(); implement
these methods on the KvmVcpu type mirroring the aarch64 backend behavior: return
the appropriate Arc<dyn DeviceVmExitHandler> from vm_exit_handler, perform
per-thread initialization and return Result<(), VcpuError> in
post_init_within_thread, and execute the vCPU loop returning
Result<VmExitReason, VcpuError> in run; use the same error handling, state
setup, and device exit dispatch patterns found in the aarch64 implementation
(match the method names vm_exit_handler, post_init_within_thread, run and the
KvmVcpu struct) so runtime panics are eliminated.
```rust
#[error("Vm already exists")]
VmAlreadyExists,

#[error("Vm not exists")]
VmNotExists,
```
Polish VM state error messages for clarity.

The strings read awkwardly ("Vm not exists"). Prefer concise, grammatical messages like "VM already exists" / "VM does not exist".

✍️ Suggested wording update

```diff
-    #[error("Vm already exists")]
+    #[error("VM already exists")]
     VmAlreadyExists,

-    #[error("Vm not exists")]
+    #[error("VM does not exist")]
     VmNotExists,
```

📝 Committable suggestion
```rust
#[error("VM already exists")]
VmAlreadyExists,

#[error("VM does not exist")]
VmNotExists,
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm-machine/src/error.rs` around lines 6 - 10, Update the error display
strings on the VmAlreadyExists and VmNotExists variants so they are clear and
grammatical: change "Vm already exists" to "VM already exists" and "Vm not
exists" to "VM does not exist". Edit the #[error(...)] attributes for the
VmAlreadyExists and VmNotExists enum variants in error.rs to use the new
messages and keep the variant names unchanged.
```rust
#[cfg(not(target_arch = "aarch64"))]
todo!();
```
`Vm::boot` still hard-panics off aarch64.

Lines 55-56 make the shared VMM API unusable on every non-aarch64 build. Even after the frontend/backend wiring is restored, this path will still abort instead of returning an error or doing architecture-specific startup.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm-machine/src/vm.rs` around lines 55 - 56, The Vm::boot
implementation currently panics on non-aarch64 builds because of the
#[cfg(not(target_arch = "aarch64"))] todo!() branch; replace that panic with a
proper error return so the shared VMM API is usable on other architectures.
Update the non-aarch64 branch in Vm::boot (the block guarded by
#[cfg(not(target_arch = "aarch64"))]) to return an appropriate Err (e.g., an
UnsupportedArchitecture or BootError variant) instead of calling todo!(), and
ensure the function signature and callers handle that Result error path
consistently.
```rust
for vcpu_id in 0..vm_config.vcpus {
    vcpu_manager.lock().unwrap().create_vcpu(
        vcpu_id,
        device_manager.clone(),
        #[cfg(target_arch = "aarch64")]
        psci.clone(),
    )?;
}
```
Potential deadlock with the PSCI handler.

Based on the context snippet from `psci_0_2.rs` (lines 35-70), the PSCI `CpuOn64` handler acquires `vcpu_manager.lock()` from within a vCPU thread. If this loop held the lock while a vCPU thread was executing PSCI, a deadlock could occur.

The current code locks briefly per iteration, which mitigates this, but consider documenting the constraint or using a lock-free approach for vCPU creation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm-machine/src/vmm.rs` around lines 102 - 109, The loop holds the
vcpu_manager mutex while calling vcpu_manager.lock().unwrap().create_vcpu(...),
which can deadlock because the PSCI CpuOn64 handler (in psci_0_2.rs) also locks
vcpu_manager from a vCPU thread; refactor so the mutex is not held while vCPU
threads may start: change VcpuManager::create_vcpu into two steps (e.g.,
prepare_vcpu that performs internal state mutation while holding the lock, then
release the lock and call start_vcpu/spawn_vcpu which does thread creation
without holding vcpu_manager), and update this loop to call prepare_vcpu under
the lock and then start_vcpu after dropping the lock; reference vcpu_manager,
create_vcpu, and the PSCI CpuOn64 handler in psci_0_2.rs when making the change.
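The two-phase shape the prompt proposes can be sketched with standard-library primitives. The types here (`VcpuManager`, `prepare_vcpu`, `create_vcpus`) are simplified stand-ins for this crate's real API; the point is that the guard is dropped before any thread that may re-lock the manager is spawned.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct VcpuManager {
    prepared: Vec<usize>,
}

impl VcpuManager {
    // Phase 1: pure state mutation, safe to do under the lock.
    fn prepare_vcpu(&mut self, id: usize) {
        self.prepared.push(id);
    }
}

fn create_vcpus(mgr: Arc<Mutex<VcpuManager>>, count: usize) {
    for id in 0..count {
        // The temporary guard is dropped at the end of this statement,
        // so it is never held across thread creation.
        mgr.lock().unwrap().prepare_vcpu(id);

        // Phase 2: spawn without holding the lock; the new thread can
        // lock the manager (as a PSCI CPU_ON handler would) without deadlock.
        let mgr2 = Arc::clone(&mgr);
        thread::spawn(move || {
            let _n = mgr2.lock().unwrap().prepared.len();
        })
        .join()
        .unwrap();
    }
}

fn main() {
    let mgr = Arc::new(Mutex::new(VcpuManager { prepared: vec![] }));
    create_vcpus(Arc::clone(&mgr), 2);
    assert_eq!(mgr.lock().unwrap().prepared.len(), 2);
}
```

Splitting preparation from spawning also makes the lock-ordering constraint explicit rather than relying on the brevity of each `lock()` call.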
```rust
    .ok_or(Error::VmNotExists)?
    .boot(boot_loader)?;

loop {}
```
Busy-wait loop burns CPU unnecessarily.

The empty `loop {}` after `boot()` will spin at 100% CPU utilization. Since vCPU threads run independently (as shown in `VcpuManager::start_vcpu`), the main thread should park or wait on a synchronization primitive.

💡 Proposed fix

```diff
 pub fn run(&mut self, boot_loader: &dyn BootLoader) -> Result<()> {
     self.vm
         .as_mut()
         .ok_or(Error::VmNotExists)?
         .boot(boot_loader)?;
-    loop {}
+    // Park the main thread; vCPU threads handle execution
+    loop {
+        std::thread::park();
+    }
 }
```

Or better, wait on a shutdown signal/channel.

📝 Committable suggestion
```rust
// Park the main thread; vCPU threads handle execution
loop {
    std::thread::park();
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm-machine/src/vmm.rs` at line 133, The empty busy-wait loop after
boot() in vmm.rs should be replaced with a blocking wait so the main thread
doesn't spin at 100% CPU; remove loop {} and instead park the thread or block on
a synchronization primitive (e.g., thread::park(), Condvar wait, or recv() on a
shutdown channel) and wire that to the existing vCPU lifecycle/shutdown path
(for example use the VcpuManager::start_vcpu shutdown signal or expose a
shutdown Receiver that the main thread can block on), ensuring the main thread
unblocks when a shutdown/stop event occurs.
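A minimal version of the shutdown-channel alternative mentioned above. The channel wiring here is illustrative; in the real VMM the sender would be handed to the vCPU threads (or a signal handler) rather than a throwaway thread.

```rust
use std::sync::mpsc;
use std::thread;

// The main thread blocks on recv() instead of spinning; any holder of the
// sender requests shutdown by sending a unit value.
fn wait_for_shutdown(rx: mpsc::Receiver<()>) {
    // Blocks without burning CPU until a shutdown event arrives (or all
    // senders are dropped, which we treat as fatal here).
    rx.recv().expect("all shutdown senders dropped");
}

fn main() {
    let (shutdown_tx, shutdown_rx) = mpsc::channel::<()>();
    thread::spawn(move || {
        // ... vCPU loop would run here; on SYSTEM_OFF or a fatal exit:
        shutdown_tx.send(()).unwrap();
    });
    wait_for_shutdown(shutdown_rx);
}
```

Unlike `thread::park()`, which can wake spuriously and needs a surrounding loop, `recv()` returns exactly when a shutdown event (or sender disconnect) occurs.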
Summary by CodeRabbit