docs(runbook): EventedPLEG behavior and updated failure signatures (PGM-201)#28

Merged: pgmac merged 1 commit into main from paulymac/pgm-201-investigate-kinedqlite-latency-spikes-that-trigger-cni-add on May 23, 2026


pgmac commented on May 23, 2026

Summary

  • Documents how EventedPLEG (active on all pvek8s nodes, k8s 1.35) eliminates the Generic PLEG deadlock risk from orphaned shims and kine write contention
  • Adds a new EventedPLEG Behavior section to kubelet-silent-stall.md with architecture explanation, failure signatures, residual risk table, and monitoring commands
  • Updates References to include PGM-201

Key finding

Over 7 days with EventedPLEG active there were zero actual PLEG stalls, despite 8,000+ dqlite "database is locked" events. EventedPLEG's push-based CRI event stream and its ListPodSandboxes resync, which makes no per-pod ContainerStatus calls, break the deadlock chain at the critical point.
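The difference between the two models can be illustrated with a toy sketch. This is not kubelet code: `ToyRuntime`, `generic_relist`, and `evented_resync` are hypothetical names, and the hung shim is modelled as a `TimeoutError` rather than a call that never returns. It only assumes what the finding above states, namely that the resync path relies on a bulk listing and does not issue per-pod status calls.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRuntime:
    """Hypothetical stand-in for a CRI runtime; hung_pods models orphaned shims."""
    pods: list
    hung_pods: set = field(default_factory=set)

    def list_pod_sandboxes(self):
        # Bulk listing; in this toy model it never touches an individual shim.
        return list(self.pods)

    def container_status(self, pod):
        # Per-pod call; a hung shim is modelled as a timeout instead of a real hang.
        if pod in self.hung_pods:
            raise TimeoutError(f"status call for {pod} never returned")
        return "running"


def generic_relist(runtime):
    """Polling model: one status call per pod, so one hung pod stalls the whole pass."""
    inspected = []
    for pod in runtime.list_pod_sandboxes():
        runtime.container_status(pod)
        inspected.append(pod)
    return inspected


def evented_resync(runtime, events):
    """Push model: take changes from the event stream, resync from the bulk listing only."""
    changed = [event["pod"] for event in events]
    known = runtime.list_pod_sandboxes()
    return changed, known
```

With one pod marked as hung, `generic_relist` fails partway through the pass, while `evented_resync` completes because it never asks the hung pod for its status.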

Test plan

  • Validate workflow passes (MkDocs strict build)
  • kubelet-silent-stall.md renders correctly with new section and admonition
  • Anchor links resolve: #eventedpleg-behavior-k8s-135
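For the anchor check, a minimal sketch of how a heading becomes that anchor may help. It approximates the default slug function of Python-Markdown's `toc` extension, which MkDocs uses unless configured otherwise. The heading text `EventedPLEG Behavior (k8s 1.35)` is an assumption inferred from the anchor, and the function below is a re-implementation for illustration, not the library's own code.

```python
import re
import unicodedata


def slugify(heading, separator="-"):
    """Approximate the default heading-to-anchor rule of Python-Markdown's toc extension."""
    # Fold to ASCII, dropping characters that have no ASCII equivalent.
    value = unicodedata.normalize("NFKD", heading)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Drop everything except word characters, whitespace, and hyphens.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and separators into a single separator.
    return re.sub(r"[{}\s]+".format(re.escape(separator)), separator, value)


# Assumed heading text; the real heading in kubelet-silent-stall.md may differ.
anchor = "#" + slugify("EventedPLEG Behavior (k8s 1.35)")
```

Under that assumption, the parentheses and the dot in `1.35` are dropped, which is why the anchor ends in `k8s-135`.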

🤖 Generated with Claude Code

…atures (PGM-201)

Add EventedPLEG Behavior section to kubelet-silent-stall.md explaining how k8s 1.35's
push-based CRI event model eliminates the Generic PLEG deadlock risk from orphaned shims
and kine write contention, and documenting the new (rarer) failure signature to watch for.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pgmac merged commit 96c431f into main on May 23, 2026
1 check passed
pgmac deleted the paulymac/pgm-201-investigate-kinedqlite-latency-spikes-that-trigger-cni-add branch on May 23, 2026 at 10:37