@@ -703,83 +703,116 @@ int32_t SchedulerContext::handshake_all_cores(Runtime *runtime) {

LOG_INFO_V0("Handshaking with %d cores", cores_total_num_);

// Step 1: Write per-core payload addresses and send handshake signal.
// OUT_OF_ORDER_STORE_BARRIER() ensures task is globally visible before
// aicpu_ready=1, so AICore reads the correct payload pointer after waking up.
// Step 1: Write per-core payload addresses, then release all cores. The
// task pointers are written first and published with a single barrier, then
// aicpu_ready is raised for every core. One barrier (not one per core)
// suffices: the barrier guarantees every task store is globally visible
// before any aicpu_ready store, which is the only ordering AICore relies on
// (it reads task only after observing aicpu_ready==1).
for (int32_t i = 0; i < cores_total_num_; i++) {
all_handshakes[i].task = reinterpret_cast<uint64_t>(&payload_per_core_[i][0]);
OUT_OF_ORDER_STORE_BARRIER();
}
OUT_OF_ORDER_STORE_BARRIER();
for (int32_t i = 0; i < cores_total_num_; i++) {
all_handshakes[i].aicpu_ready = 1;
}
OUT_OF_ORDER_STORE_BARRIER();

// Get platform physical cores count for validation
uint32_t max_physical_cores_count = platform_get_physical_cores_count();

// Step 2: Wait for all cores to respond, collect core type and register addresses
// Step 2: collect responses from all cores. The 72 AICore cores wake and
// advance their handshake phases in parallel, so we sweep — poll every
// outstanding core per pass and service whichever are ready — rather than
// blocking on core i before looking at core i+1. A per-core blocking loop
// serializes the wakeups (Σ per-core latency); sweeping overlaps them
// (≈ max per-core latency + one drain of the GM-flag polls). The flags are
// GM reads (not the nGnRE MMIO reg window), so the polls are not forced
// serial the way RegId::COND polling is.
bool handshake_failed = false;
for (int32_t i = 0; i < cores_total_num_; i++) {
Handshake *hank = &all_handshakes[i];

while (hank->aicore_regs_ready == 0) {
SPIN_WAIT_HINT();
}

uint32_t physical_core_id = hank->physical_core_id;

if (physical_core_id >= max_physical_cores_count) {
LOG_ERROR(
"Core %d reported invalid physical_core_id=%u (platform max=%u)", i, physical_core_id,
max_physical_cores_count
);
handshake_failed = true;
continue;
}

uint64_t *regs = reinterpret_cast<uint64_t *>(regs_);
uint64_t reg_addr = regs[physical_core_id];

// Initialize AICore registers after discovery (first round)
platform_init_aicore_regs(reg_addr);
OUT_OF_ORDER_STORE_BARRIER();
hank->aicpu_regs_ready = 1;

OUT_OF_ORDER_STORE_BARRIER();

while (hank->aicore_done == 0) {
SPIN_WAIT_HINT();
}

CoreType type = hank->core_type;
uint64_t *regs = reinterpret_cast<uint64_t *>(regs_);
bool regs_phase_done[RUNTIME_MAX_WORKER] = {false};
uint64_t reg_addr_of[RUNTIME_MAX_WORKER] = {0};

// Sweep A: wait for aicore_regs_ready, init that core's regs, ack with
// aicpu_regs_ready=1. Servicing a ready core (regs init + ack) carries no
// cross-core dependency, so it is done in-pass while other cores are still
// waking.
for (int32_t remaining = cores_total_num_; remaining > 0;) {
for (int32_t i = 0; i < cores_total_num_; i++) {
if (regs_phase_done[i]) continue;
Handshake *hank = &all_handshakes[i];
if (hank->aicore_regs_ready == 0) {
SPIN_WAIT_HINT();
continue;
}

core_exec_states_[i].reg_addr = reg_addr;
core_exec_states_[i].cond_ptr = get_reg_ptr(reg_addr, RegId::COND);
uint32_t physical_core_id = hank->physical_core_id;
if (physical_core_id >= max_physical_cores_count) {
LOG_ERROR(
"Core %d reported invalid physical_core_id=%u (platform max=%u)", i, physical_core_id,
max_physical_cores_count
);
handshake_failed = true;
regs_phase_done[i] = true;
remaining--;
continue;
}

uint64_t reg_addr = regs[physical_core_id];
reg_addr_of[i] = reg_addr;
platform_init_aicore_regs(reg_addr);
OUT_OF_ORDER_STORE_BARRIER();
hank->aicpu_regs_ready = 1;
Comment on lines +762 to +766

high

Correctness Bug: core_exec_states_[i].reg_addr Not Populated on Early Exit

With the handshake split into Sweep A and Sweep B, core_exec_states_[i].reg_addr is only populated during Sweep B. If any core fails validation during Sweep A (e.g., it reports an invalid physical_core_id), handshake_failed is set to true and the function returns before Sweep B ever runs.

The failure path calls emergency_shutdown(), which relies on core_exec_states_[i].reg_addr != 0 to deinitialize the registers of already-initialized cores. Since Sweep A never writes that field, it is still 0 for the cores initialized in this call (assuming it was 0 on entry), so emergency_shutdown() does not deinitialize their registers and leaves them in an active/initialized state.

Populating core_exec_states_[i].reg_addr in Sweep A, alongside the register initialization, ensures that emergency_shutdown() can clean up correctly on failure.

Additionally, if a dependent kernel launch fails after another kernel (such as the AICore worker) has already been launched and is spinning in a handshake, recover the device or mark it unusable (e.g., by calling recover_device_or_mark_unusable) before returning. Otherwise the orphaned kernel spins until timeout and poisons the device context.

            uint64_t reg_addr = regs[physical_core_id];
            reg_addr_of[i] = reg_addr;
            core_exec_states_[i].reg_addr = reg_addr;
            platform_init_aicore_regs(reg_addr);
            OUT_OF_ORDER_STORE_BARRIER();
            hank->aicpu_regs_ready = 1;
References
  1. If a dependent kernel launch fails after another kernel has already been launched and is waiting, ensure that the device is recovered or marked unusable (e.g., by calling recover_device_or_mark_unusable) before returning to prevent the orphaned kernel from spinning and poisoning the device context.
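
The cleanup gap described in this comment can be reproduced in a small standalone model. The sketch below is not the runtime's code: `CoreState`, `MockPlatform`, `emergency_shutdown_model`, and `leaked_after_failure` are hypothetical stand-ins, and the only behavior taken from the review is that shutdown deinitializes a core only when its recorded `reg_addr` is non-zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-core state that shutdown consults.
struct CoreState {
    uint64_t reg_addr = 0;
};

// Hypothetical platform mock: tracks which register blocks are initialized.
struct MockPlatform {
    std::vector<uint64_t> live;
    void init_regs(uint64_t addr) { live.push_back(addr); }
    void deinit_regs(uint64_t addr) {
        live.erase(std::remove(live.begin(), live.end(), addr), live.end());
    }
};

// Models the cleanup rule from the review: only cores with a non-zero
// recorded reg_addr are deinitialized.
void emergency_shutdown_model(MockPlatform &p, const std::vector<CoreState> &states) {
    for (const CoreState &s : states) {
        if (s.reg_addr != 0) p.deinit_regs(s.reg_addr);
    }
}

// Runs a simplified Sweep A in which every core is already "ready" and the
// last core reports an invalid physical id. Returns how many register blocks
// are still initialized after the failure path has run.
std::size_t leaked_after_failure(bool record_in_sweep_a) {
    const std::vector<uint64_t> regs = {0x1000, 0x2000, 0x3000};  // table indexed by physical id
    const std::vector<uint32_t> reported_ids = {0, 1, 99};          // 99 is out of range
    MockPlatform platform;
    std::vector<CoreState> states(reported_ids.size());
    bool failed = false;

    for (std::size_t i = 0; i < reported_ids.size(); i++) {
        if (reported_ids[i] >= regs.size()) {
            failed = true;
            continue;
        }
        const uint64_t reg_addr = regs[reported_ids[i]];
        if (record_in_sweep_a) states[i].reg_addr = reg_addr;  // the suggested fix
        platform.init_regs(reg_addr);
    }
    if (failed) emergency_shutdown_model(platform, states);
    return platform.live.size();
}
```

Under these assumptions, calling `leaked_after_failure(false)` leaves both valid cores' register blocks initialized, while `leaked_after_failure(true)` leaves none.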

#if PTO2_PROFILING
// Record physical_core_id for PMU init later (CoreExecState has no room
// for this field under PTO2_PROFILING).
physical_core_ids_[i] = physical_core_id;
physical_core_ids_[i] = physical_core_id;
#endif
#if !PTO2_PROFILING
core_exec_states_[i].worker_id = i;
core_exec_states_[i].physical_core_id = physical_core_id;
core_exec_states_[i].core_type = type;
core_exec_states_[i].physical_core_id = physical_core_id;
#endif

if (type == CoreType::AIC) {
aic_worker_ids_[aic_count_++] = i;
LOG_INFO_V0("Core %d: AIC, physical_id=%u, reg_addr=0x%lx", i, physical_core_id, reg_addr);
} else {
aiv_worker_ids_[aiv_count_++] = i;
LOG_INFO_V0("Core %d: AIV, physical_id=%u, reg_addr=0x%lx", i, physical_core_id, reg_addr);
regs_phase_done[i] = true;
remaining--;
}
}
OUT_OF_ORDER_STORE_BARRIER();

if (handshake_failed) {
emergency_shutdown(runtime);
return -1;
Comment on lines +762 to 781

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify a2a3 emergency_shutdown can clean regs initialized during Sweep A.
rg -n -C4 'SchedulerContext::emergency_shutdown|platform_deinit_aicore_regs|core_exec_states_\[[^]]+\]\.reg_addr|reg_addr_of' \
  src/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/scheduler_cold_path.cpp

Repository: hw-native-sys/simpler

Length of output: 4635


🏁 Script executed:

#!/bin/bash
sed -n '720,820p' src/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/scheduler_cold_path.cpp
printf '\n----\n'
sed -n '954,980p' src/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/scheduler_cold_path.cpp

Repository: hw-native-sys/simpler

Length of output: 5525


🏁 Script executed:

rg -n 'core_exec_states_\[[^]]+\]\.reg_addr\s*=|reg_addr\s*=' src/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/scheduler_cold_path.cpp

Repository: hw-native-sys/simpler

Length of output: 554


Persist reg_addr before the failure return (src/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/scheduler_cold_path.cpp:762-798, 958-967).

emergency_shutdown() only deinitializes via core_exec_states_[i].reg_addr, which is populated in Sweep B; if Sweep A bails on an invalid physical_core_id, any regs already initialized there are skipped.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/scheduler_cold_path.cpp`
around lines 762 - 781, The failure path in the scheduler cold-path
initialization leaves some initialized register addresses untracked because
`reg_addr` is only stored in Sweep B, so `emergency_shutdown()` can miss regs
set up before an invalid `physical_core_id` triggers `handshake_failed`. Update
the `scheduler_cold_path.cpp` flow around the `reg_addr` initialization and
`emergency_shutdown()` handling so every successfully initialized `reg_addr` is
persisted immediately in `core_exec_states_[i].reg_addr` before any possible
early return, ensuring shutdown can deinitialize all regs consistently.

}

// Sweep B: wait for aicore_done, latch core type + register pointers. Same
// sweep so the second round-trip's wakeups also overlap.
bool done_phase_done[RUNTIME_MAX_WORKER] = {false};
for (int32_t remaining = cores_total_num_; remaining > 0;) {
for (int32_t i = 0; i < cores_total_num_; i++) {
if (done_phase_done[i]) continue;
Handshake *hank = &all_handshakes[i];
if (hank->aicore_done == 0) {
SPIN_WAIT_HINT();
continue;
}

CoreType type = hank->core_type;
uint64_t reg_addr = reg_addr_of[i];
core_exec_states_[i].reg_addr = reg_addr;
core_exec_states_[i].cond_ptr = get_reg_ptr(reg_addr, RegId::COND);
Comment on lines +796 to +799

medium

Redundant Assignment Cleanup

Since core_exec_states_[i].reg_addr is now populated during Sweep A to ensure correct cleanup in emergency_shutdown(), the redundant assignment in Sweep B can be removed.

            CoreType type = hank->core_type;
            uint64_t reg_addr = reg_addr_of[i];
            core_exec_states_[i].cond_ptr = get_reg_ptr(reg_addr, RegId::COND);

#if !PTO2_PROFILING
core_exec_states_[i].worker_id = i;
core_exec_states_[i].core_type = type;
#endif
if (type == CoreType::AIC) {
aic_worker_ids_[aic_count_++] = i;
LOG_INFO_V0("Core %d: AIC, reg_addr=0x%lx", i, reg_addr);
} else {
aiv_worker_ids_[aiv_count_++] = i;
LOG_INFO_V0("Core %d: AIV, reg_addr=0x%lx", i, reg_addr);
}
done_phase_done[i] = true;
remaining--;
}
}

LOG_INFO_V0("Core discovery complete: %d AIC, %d AIV", aic_count_, aiv_count_);
return 0;
}
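
The sweep pattern used by Sweep A and Sweep B can be illustrated in isolation. This is a minimal sketch, not the scheduler's code: `sweep_order` and the `ready_at` vector are hypothetical, with `ready_at[i]` standing in for the pass on which core i's handshake flag would be observed as non-zero. Each pass visits every outstanding core once and services whichever are ready, so a slow core never delays servicing of the others.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ready_at[i] is the (hypothetical) pass number on which core i's flag
// becomes non-zero; it replaces the real poll of a handshake field.
// Returns the order in which cores are serviced.
std::vector<int> sweep_order(const std::vector<int> &ready_at) {
    const std::size_t n = ready_at.size();
    std::vector<bool> done(n, false);
    std::vector<int> order;
    int pass = 0;
    for (std::size_t remaining = n; remaining > 0;) {
        pass++;
        for (std::size_t i = 0; i < n; i++) {
            if (done[i]) continue;
            if (ready_at[i] > pass) continue;       // not ready yet: move on, do not block
            order.push_back(static_cast<int>(i));   // "service" the ready core
            done[i] = true;
            remaining--;
        }
    }
    return order;
}
```

With `ready_at = {3, 1, 2}`, cores are serviced in readiness order (1, 2, 0) and the loop ends after three passes, the maximum readiness time. A blocking per-core loop would instead sit on core 0 first while cores 1 and 2 were already waiting.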
@@ -707,82 +707,116 @@ int32_t SchedulerContext::handshake_all_cores(Runtime *runtime) {

LOG_INFO_V0("Handshaking with %d cores", cores_total_num_);

// Step 1: Write per-core payload addresses and send handshake signal.
// OUT_OF_ORDER_STORE_BARRIER() ensures task is globally visible before
// aicpu_ready=1, so AICore reads the correct payload pointer after waking up.
// Step 1: Write per-core payload addresses, then release all cores. The
// task pointers are written first and published with a single barrier, then
// aicpu_ready is raised for every core. One barrier (not one per core)
// suffices: the barrier guarantees every task store is globally visible
// before any aicpu_ready store, which is the only ordering AICore relies on
// (it reads task only after observing aicpu_ready==1).
for (int32_t i = 0; i < cores_total_num_; i++) {
all_handshakes[i].task = reinterpret_cast<uint64_t>(&payload_per_core_[i][0]);
OUT_OF_ORDER_STORE_BARRIER();
}
OUT_OF_ORDER_STORE_BARRIER();
for (int32_t i = 0; i < cores_total_num_; i++) {
all_handshakes[i].aicpu_ready = 1;
}
OUT_OF_ORDER_STORE_BARRIER();

// Get platform physical cores count for validation
uint32_t max_physical_cores_count = platform_get_physical_cores_count();

// Step 2: Wait for all cores to respond, collect core type and register addresses
// Step 2: collect responses from all cores. The AICore cores wake and
// advance their handshake phases in parallel, so we sweep — poll every
// outstanding core per pass and service whichever are ready — rather than
// blocking on core i before looking at core i+1. A per-core blocking loop
// serializes the wakeups (Σ per-core latency); sweeping overlaps them
// (≈ max per-core latency + one drain of the GM-flag polls). The flags are
// GM reads (not the nGnRE MMIO reg window), so the polls are not forced
// serial the way RegId::COND polling is.
bool handshake_failed = false;
for (int32_t i = 0; i < cores_total_num_; i++) {
Handshake *hank = &all_handshakes[i];

while (hank->aicore_regs_ready == 0) {
SPIN_WAIT_HINT();
}

uint32_t physical_core_id = hank->physical_core_id;

if (physical_core_id >= max_physical_cores_count) {
LOG_ERROR(
"Core %d reported invalid physical_core_id=%u (platform max=%u)", i, physical_core_id,
max_physical_cores_count
);
handshake_failed = true;
continue;
}

uint64_t *regs = reinterpret_cast<uint64_t *>(regs_);
uint64_t reg_addr = regs[physical_core_id];

// Initialize AICore registers after discovery (first round)
platform_init_aicore_regs(reg_addr);
OUT_OF_ORDER_STORE_BARRIER();
hank->aicpu_regs_ready = 1;

OUT_OF_ORDER_STORE_BARRIER();

while (hank->aicore_done == 0) {
SPIN_WAIT_HINT();
}

CoreType type = hank->core_type;
uint64_t *regs = reinterpret_cast<uint64_t *>(regs_);
bool regs_phase_done[RUNTIME_MAX_WORKER] = {false};
uint64_t reg_addr_of[RUNTIME_MAX_WORKER] = {0};

// Sweep A: wait for aicore_regs_ready, init that core's regs, ack with
// aicpu_regs_ready=1. Servicing a ready core (regs init + ack) carries no
// cross-core dependency, so it is done in-pass while other cores are still
// waking.
for (int32_t remaining = cores_total_num_; remaining > 0;) {
for (int32_t i = 0; i < cores_total_num_; i++) {
if (regs_phase_done[i]) continue;
Handshake *hank = &all_handshakes[i];
if (hank->aicore_regs_ready == 0) {
SPIN_WAIT_HINT();
continue;
}

core_exec_states_[i].reg_addr = reg_addr;
core_exec_states_[i].cond_ptr = get_reg_ptr(reg_addr, RegId::COND);
uint32_t physical_core_id = hank->physical_core_id;
if (physical_core_id >= max_physical_cores_count) {
LOG_ERROR(
"Core %d reported invalid physical_core_id=%u (platform max=%u)", i, physical_core_id,
max_physical_cores_count
);
handshake_failed = true;
regs_phase_done[i] = true;
remaining--;
continue;
}

uint64_t reg_addr = regs[physical_core_id];
reg_addr_of[i] = reg_addr;
platform_init_aicore_regs(reg_addr);
OUT_OF_ORDER_STORE_BARRIER();
hank->aicpu_regs_ready = 1;
Comment on lines +766 to +770

high

Correctness Bug: core_exec_states_[i].reg_addr Not Populated on Early Exit

With the handshake split into Sweep A and Sweep B, core_exec_states_[i].reg_addr is only populated during Sweep B. If any core fails validation during Sweep A (e.g., it reports an invalid physical_core_id), handshake_failed is set to true and the function returns before Sweep B ever runs.

The failure path calls emergency_shutdown(), which relies on core_exec_states_[i].reg_addr != 0 to deinitialize the registers of already-initialized cores. Since Sweep A never writes that field, it is still 0 for the cores initialized in this call (assuming it was 0 on entry), so emergency_shutdown() does not deinitialize their registers and leaves them in an active/initialized state.

Populating core_exec_states_[i].reg_addr in Sweep A, alongside the register initialization, ensures that emergency_shutdown() can clean up correctly on failure.

Additionally, if a dependent kernel launch fails after another kernel (such as the AICore worker) has already been launched and is spinning in a handshake, recover the device or mark it unusable (e.g., by calling recover_device_or_mark_unusable) before returning. Otherwise the orphaned kernel spins until timeout and poisons the device context.

            uint64_t reg_addr = regs[physical_core_id];
            reg_addr_of[i] = reg_addr;
            core_exec_states_[i].reg_addr = reg_addr;
            platform_init_aicore_regs(reg_addr);
            OUT_OF_ORDER_STORE_BARRIER();
            hank->aicpu_regs_ready = 1;
References
  1. If a dependent kernel launch fails after another kernel has already been launched and is waiting, ensure that the device is recovered or marked unusable (e.g., by calling recover_device_or_mark_unusable) before returning to prevent the orphaned kernel from spinning and poisoning the device context.

#if PTO2_PROFILING
physical_core_ids_[i] = physical_core_id;
physical_core_ids_[i] = physical_core_id;
#endif

#if !PTO2_PROFILING
core_exec_states_[i].worker_id = i;
core_exec_states_[i].physical_core_id = physical_core_id;
core_exec_states_[i].core_type = type;
core_exec_states_[i].physical_core_id = physical_core_id;
#endif

if (type == CoreType::AIC) {
aic_worker_ids_[aic_count_++] = i;
LOG_INFO_V0("Core %d: AIC, physical_id=%u, reg_addr=0x%lx", i, physical_core_id, reg_addr);
} else {
aiv_worker_ids_[aiv_count_++] = i;
LOG_INFO_V0("Core %d: AIV, physical_id=%u, reg_addr=0x%lx", i, physical_core_id, reg_addr);
regs_phase_done[i] = true;
remaining--;
}
}
OUT_OF_ORDER_STORE_BARRIER();

if (handshake_failed) {
emergency_shutdown(runtime);
return -1;
Comment on lines +766 to 785

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Record initialized reg addresses before the failure path.

platform_init_aicore_regs(reg_addr) can run in Sweep A, but core_exec_states_[i].reg_addr is only assigned in Sweep B. If any later core reports an invalid physical ID, Line 784 calls emergency_shutdown, whose Line 970 deinit check skips these newly initialized regs or may use stale reg addrs from a previous run.

Proposed fix
     bool regs_phase_done[RUNTIME_MAX_WORKER] = {false};
     uint64_t reg_addr_of[RUNTIME_MAX_WORKER] = {0};
+    for (int32_t i = 0; i < cores_total_num_; i++) {
+        core_exec_states_[i].reg_addr = 0;
+    }
 
@@
             uint64_t reg_addr = regs[physical_core_id];
             reg_addr_of[i] = reg_addr;
             platform_init_aicore_regs(reg_addr);
+            core_exec_states_[i].reg_addr = reg_addr;
             OUT_OF_ORDER_STORE_BARRIER();
             hank->aicpu_regs_ready = 1;
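
The reset loop in the proposed fix targets the stale-address case mentioned above. The sketch below models it under stated assumptions: `CoreState` and `stale_deinits` are hypothetical, the per-core state is assumed to persist across two handshake calls, and in the second call every core is assumed to fail validation before any register block is initialized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-core state that shutdown consults.
struct CoreState {
    uint64_t reg_addr = 0;
};

// Counts how many deinit calls the failure path would issue for addresses
// that were NOT initialized during the current handshake call.
std::size_t stale_deinits(bool reset_before_sweep) {
    // State left behind by an assumed earlier, successful run.
    std::vector<CoreState> states(2);
    states[0].reg_addr = 0xA000;
    states[1].reg_addr = 0xB000;

    // The proposed fix: clear the recorded addresses before Sweep A starts.
    if (reset_before_sweep) {
        for (CoreState &s : states) s.reg_addr = 0;
    }

    // Current run: every core reports an invalid id, so nothing is initialized.
    const std::vector<uint64_t> initialized_this_run;

    std::size_t stale = 0;
    for (const CoreState &s : states) {
        if (s.reg_addr == 0) continue;  // shutdown skips cores with no recorded address
        bool fresh = false;
        for (uint64_t a : initialized_this_run) fresh = fresh || (a == s.reg_addr);
        if (!fresh) stale++;
    }
    return stale;
}
```

Without the reset, both leftover addresses would be handed to the deinit path even though this run never initialized them; with the reset, none are.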
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
uint64_t reg_addr = regs[physical_core_id];
reg_addr_of[i] = reg_addr;
platform_init_aicore_regs(reg_addr);
OUT_OF_ORDER_STORE_BARRIER();
hank->aicpu_regs_ready = 1;
#if PTO2_PROFILING
physical_core_ids_[i] = physical_core_id;
physical_core_ids_[i] = physical_core_id;
#endif
#if !PTO2_PROFILING
core_exec_states_[i].worker_id = i;
core_exec_states_[i].physical_core_id = physical_core_id;
core_exec_states_[i].core_type = type;
core_exec_states_[i].physical_core_id = physical_core_id;
#endif
if (type == CoreType::AIC) {
aic_worker_ids_[aic_count_++] = i;
LOG_INFO_V0("Core %d: AIC, physical_id=%u, reg_addr=0x%lx", i, physical_core_id, reg_addr);
} else {
aiv_worker_ids_[aiv_count_++] = i;
LOG_INFO_V0("Core %d: AIV, physical_id=%u, reg_addr=0x%lx", i, physical_core_id, reg_addr);
regs_phase_done[i] = true;
remaining--;
}
}
OUT_OF_ORDER_STORE_BARRIER();
if (handshake_failed) {
emergency_shutdown(runtime);
return -1;
bool regs_phase_done[RUNTIME_MAX_WORKER] = {false};
uint64_t reg_addr_of[RUNTIME_MAX_WORKER] = {0};
for (int32_t i = 0; i < cores_total_num_; i++) {
core_exec_states_[i].reg_addr = 0;
}
uint64_t reg_addr = regs[physical_core_id];
reg_addr_of[i] = reg_addr;
platform_init_aicore_regs(reg_addr);
core_exec_states_[i].reg_addr = reg_addr;
OUT_OF_ORDER_STORE_BARRIER();
hank->aicpu_regs_ready = 1;
#if PTO2_PROFILING
physical_core_ids_[i] = physical_core_id;
#endif
#if !PTO2_PROFILING
core_exec_states_[i].physical_core_id = physical_core_id;
#endif
regs_phase_done[i] = true;
remaining--;
}
}
OUT_OF_ORDER_STORE_BARRIER();
if (handshake_failed) {
emergency_shutdown(runtime);
return -1;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/a5/runtime/tensormap_and_ringbuffer/runtime/scheduler/scheduler_cold_path.cpp`
around lines 766 - 785, The reg address for each initialized core is being
recorded too late, so the failure path in `scheduler_cold_path.cpp` can miss or
reuse stale state after `platform_init_aicore_regs(reg_addr)` has already run.
Move the assignment of the initialized reg address into the same Sweep A path
where `platform_init_aicore_regs` and `hank->aicpu_regs_ready` are set, using
`core_exec_states_[i].reg_addr` so `emergency_shutdown` and the later deinit
logic can see the correct value. Keep the update paired with the existing core
state writes in the loop that handles `physical_core_id`, `reg_addr_of`, and
`regs_phase_done`.

}

// Sweep B: wait for aicore_done, latch core type + register pointers. Same
// sweep so the second round-trip's wakeups also overlap.
bool done_phase_done[RUNTIME_MAX_WORKER] = {false};
for (int32_t remaining = cores_total_num_; remaining > 0;) {
for (int32_t i = 0; i < cores_total_num_; i++) {
if (done_phase_done[i]) continue;
Handshake *hank = &all_handshakes[i];
if (hank->aicore_done == 0) {
SPIN_WAIT_HINT();
continue;
}

CoreType type = hank->core_type;
uint64_t reg_addr = reg_addr_of[i];
core_exec_states_[i].reg_addr = reg_addr;
core_exec_states_[i].cond_ptr = get_reg_ptr(reg_addr, RegId::COND);
Comment on lines +800 to +803

medium

Redundant Assignment Cleanup

Since core_exec_states_[i].reg_addr is now populated during Sweep A to ensure correct cleanup in emergency_shutdown(), the redundant assignment in Sweep B can be removed.

            CoreType type = hank->core_type;
            uint64_t reg_addr = reg_addr_of[i];
            core_exec_states_[i].cond_ptr = get_reg_ptr(reg_addr, RegId::COND);

#if !PTO2_PROFILING
core_exec_states_[i].worker_id = i;
core_exec_states_[i].core_type = type;
#endif
if (type == CoreType::AIC) {
aic_worker_ids_[aic_count_++] = i;
LOG_INFO_V0("Core %d: AIC, reg_addr=0x%lx", i, reg_addr);
} else {
aiv_worker_ids_[aiv_count_++] = i;
LOG_INFO_V0("Core %d: AIV, reg_addr=0x%lx", i, reg_addr);
}
done_phase_done[i] = true;
remaining--;
}
}

LOG_INFO_V0("Core discovery complete: %d AIC, %d AIV", aic_count_, aiv_count_);
return 0;
}