feat: surface execution observations in task run + add axis observations CLI#139
Conversation
…ons CLI Post-execution measurement infrastructure already records peak RAM, wall time, and success status to state.json via RecordObservation. This change makes that data visible to operators. Changes: - Add PeakRAMMB, PeakVRAMMB, WallTimeMS to GuardedExecutionResult - Populate fields in runLocal/runRemote success and failure paths - Mirror fields in RunResponse for API serialization - Add formatObservationSummary helper; print after task run - Add axis observations command with list and inspect subcommands - Read from local state.json; show table, JSON, or detailed view - Add 10 unit tests for the observations CLI surface Quality gates: lint, test-race, coverage (72.5%), build all pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a new observations command to the axis CLI, enabling users to list and inspect execution performance metrics such as wall time, RAM, and VRAM usage. It also updates the task execution logic to capture and display these metrics upon completion. Key feedback includes optimizing map lookups for efficiency, addressing non-deterministic behavior in prefix-based lookups caused by Go's map iteration order, and refining table formatting and code readability.
| for k, obs := range st.Observations { | ||
| if k == key { | ||
| obsCopy := obs | ||
| found = &obsCopy | ||
| break | ||
| } | ||
| } |
| for k, obs := range st.Observations { | ||
| if strings.HasPrefix(k, key) { | ||
| obsCopy := obs | ||
| found = &obsCopy | ||
| break | ||
| } | ||
| } |
There was a problem hiding this comment.
Iterating over a map to find a prefix match is non-deterministic in Go because map iteration order is randomized. If multiple keys share the same prefix, the command will return a different observation each time it is run. Consider collecting all matches and returning an error if more than one is found, or sorting the keys before matching to ensure determinism.
| if !state.ObservationIsFresh(*found, time.Now().UTC()) { | ||
| isStale = " (stale)" | ||
| } | ||
| fmt.Fprintf(out, "Fresh: %v%s\n", state.ObservationIsFresh(*found, time.Now().UTC()), isStale) |
There was a problem hiding this comment.
The function state.ObservationIsFresh is called twice with the same arguments. You can reuse the result of the first call to improve readability and slightly reduce overhead.
isFresh := state.ObservationIsFresh(*found, time.Now().UTC())
isStale := ""
if !isFresh {
isStale = " (stale)"
}
fmt.Fprintf(out, "Fresh: %v%s\n", isFresh, isStale)| ui.WhiteColor.Fprintf(&b, " %-15s %-12s %-12s %-12s %10s %10s %8s %8s\n", | ||
| "NODE", "WORKLOAD", "BACKEND", "TOOL", "WALL MS", "PEAK RAM", "PEAK VRAM", "SAMPLES") |
There was a problem hiding this comment.
The header PEAK VRAM is 9 characters long, but the format specifier %8s is used. This will cause the column to be at least 9 characters wide anyway, potentially misaligning it with the data rows if they strictly follow the 8-character width. Additionally, for clarity, it would be beneficial to include units (MB) in the PEAK RAM header or values, similar to how VRAM is handled. Furthermore, per repository guidelines, if this list of observations can become excessively long, it should be truncated with an indication of the remaining count.
References
- When displaying a list of items in a user-facing message (e.g., warnings, logs), truncate the list if it can become excessively long and indicate that it has been truncated, for example by showing a count of the remaining items.
Post-execution measurement infrastructure already records peak RAM, wall time, and success status to state.json via RecordObservation. This change makes that data visible to operators.
Changes
Quality Gates