I've tried to run two Ralph sessions overnight, and both were killed by the out-of-memory killer on my VPS.
Have you had this experience?
Ralph was killed after using 10 GB of memory; the OpenCode server was using about 2.4 GB. 10 GB for an app that's basically a loop around another app seems excessive.
I've only recently set up this VPS, so it's possible I'm doing something wrong. I'm considering a beefier machine with 32 GB next time, but that feels a bit unnecessary given that my Mac is also ARM and works fine on 16 GB. I also don't have a swap file on the VPS, which I'm going to add, though swap files don't really solve memory leaks.
I got Claude to look at both the server and the source code, and interrogated it a bit, questioning its reasoning. It thinks there might be a memory leak. At this point I'm planning to experiment with a plain bash loop, so I'll leave you with Claude's results below. I hope this isn't falling foul of your recent blog post, but I thought I'd share nonetheless.
Memory Leak Investigation: Ralph Agent Process
Incident Context
What Happened
The ralph agent process was killed by the Linux OOM (Out of Memory) killer after consuming excessive memory during a long-running session.
Key Facts:
System: Hetzner CAX33 (16 GB RAM, ARM64 architecture, no swap)
Pattern: Second OOM kill; the previous one on Jan 7 consumed 6 GB after an unknown runtime
Memory Growth Pattern
The ralph process started at normal baseline and grew from ~100 MB to 10.3 GB over approximately 2 hours. The massive discrepancy between physical (10.3 GB) and virtual (102 GB) memory suggests significant memory fragmentation and allocation churn.
System Context
Long-running agent session working overnight
Multiple other processes running concurrently:
OpenCode server instances (~2.4 GB combined)
Additional opencode instances (~1.2 GB each)
next-server (2.2 GB)
Agent operating in tmux session
No swap space available, so OOM killer acted immediately when RAM exhausted
Critical Evidence: The Leak is Client-Side
The evidence clearly points to ralph's client-side memory accumulation, not the OpenCode server:
10.3 GB in ralph process - Memory attributed directly to ralph's process space
OpenCode servers only 2.4 GB combined - If sessions were accumulating in the server, these processes would be much larger
102 GB virtual memory in ralph - Massive virtual allocation suggests ralph itself is doing heavy buffering/allocation with significant fragmentation
Root Cause Analysis
🔴 PRIMARY: SSE Event Stream Buffering with Full Tool Outputs
Location: src/loop.ts:200-268
Every SSE event from OpenCode contains the complete tool output data:
```ts
export type ToolStateCompleted = {
  status: "completed";
  input: { [key: string]: unknown };
  output: string; // ⚠️ FULL OUTPUT (can be megabytes per tool call)
  title: string;
  attachments?: Array<FilePart>; // ⚠️ FILE ATTACHMENTS (can be huge)
  metadata: { [key: string]: unknown };
  time: { start: number; end: number; compacted?: number };
};
```
The Problem:
Ralph subscribes to the global event stream:
```ts
const events = await client.event.subscribe();
```
Ralph consumes events but only extracts minimal data for display:
```ts
for await (const event of events.stream) {
  const part = event.properties.part;
  // Ralph only stores toolName and title for UI display
  callbacks.onEvent({
    icon: toolName,
    text: part.state.title || JSON.stringify(part.state.input), // ⚠️ NOT using part.state.output
  });
}
```
However: The full event objects remain in memory because:
The AsyncGenerator maintains references to yielded values
The stream is never explicitly closed
Each event contains the full tool output (file reads, grep results, web fetches)
Ralph subscribes to ALL events globally, not just for its session
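The retention mechanics above can be sketched in isolation. This is an illustrative toy, not ralph's actual code (the generator, its buffer, and `consumeSome` are invented names): an AsyncGenerator's `finally` block, and therefore its cleanup, runs only once the generator is closed, which `break` does implicitly and an explicit `.return()` guarantees.

```typescript
// Toy model (not ralph's code): an async generator that accumulates state
// for as long as it stays open, mirroring an SSE stream that buffers events.
async function* eventStream(): AsyncGenerator<string> {
  const buffer: string[] = []; // stands in for retained event objects
  try {
    for (let i = 0; ; i++) {
      buffer.push(`event-${i}`); // grows until the generator is closed
      yield buffer[buffer.length - 1];
    }
  } finally {
    buffer.length = 0; // runs only when the generator is closed
  }
}

async function consumeSome(limit: number): Promise<string[]> {
  const events = eventStream();
  const seen: string[] = [];
  for await (const e of events) {
    seen.push(e);
    if (seen.length >= limit) break; // break closes the iterator implicitly
  }
  await events.return(undefined); // explicit close is safe even after break
  return seen;
}
```

If the stream object comes from elsewhere (an SDK, for example) and is consumed without ever hitting `break` or an explicit close, the generator and everything it references stay reachable.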
Memory Math:
50 iterations over 2 hours
~20 tool calls per iteration = 1,000 tool calls total
Average tool output: 10 MB (large file reads, grep results, API responses)
Total: 10 GB ✅
The 102 GB virtual memory comes from JavaScript string immutability causing allocation churn and fragmentation.
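As a sanity check, the arithmetic behind that estimate (all figures are the report's assumptions, not measurements):

```typescript
// Back-of-envelope check of the report's memory math (assumed figures).
const iterations = 50;
const toolCallsPerIteration = 20;
const avgOutputMB = 10;

const totalToolCalls = iterations * toolCallsPerIteration; // 1,000
const totalMB = totalToolCalls * avgOutputMB; // 10,000 MB ≈ 10 GB
console.log(`${totalToolCalls} tool calls ≈ ${totalMB / 1000} GB retained`);
```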
🟡 SECONDARY: Sessions Never Deleted
Location: src/loop.ts:189-196
Each iteration creates a new OpenCode session but never cleans it up:
```ts
const sessionResult = await client.session.create();
const sessionId = sessionResult.data.id;
// ... iteration runs ...
// ❌ Session is never deleted
```
While the primary leak (10.3 GB) is in ralph's client-side code, this contributes to the 2.4 GB accumulated in OpenCode server processes. The SDK provides client.session.delete() but it's never called.
🟢 LOW PRIORITY: Other Potential Accumulation Points
Event Array Management
Location: src/state.ts:32-54, src/index.ts:409
Events are properly limited to MAX_EVENTS = 200 via trimEventsInPlace(), preventing unbounded growth. However:
Array is mutated in-place with splice() and push()
Can cause minor fragmentation over time
Each ToolEvent is a small object (~200 bytes)
Impact: Low - the 200 event limit prevents this from being significant.
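The bounded-buffer pattern described can be sketched as follows. MAX_EVENTS and trimEventsInPlace are names from the report; this body is an assumption, not the actual src/state.ts code.

```typescript
const MAX_EVENTS = 200;

// Drop the oldest entries in place so external references to the array stay valid.
function trimEventsInPlace<T>(events: T[]): void {
  const excess = events.length - MAX_EVENTS;
  if (excess > 0) {
    events.splice(0, excess);
  }
}
```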
Batch State Updater
Location: src/index.ts:46-116
The pendingUpdates array accumulates state-update closures but is properly cleared on flush (line 65). The closures capture LoopState, including the events array, which could cause temporary memory spikes if batches grow large, but the queue is flushed every 100 ms.
Impact: Low - proper cleanup occurs on flush.
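A stripped-down sketch of that batching pattern, with invented names, showing why emptying the queue on flush caps retention:

```typescript
type Update<S> = (state: S) => S;

// Queue state-update closures and apply them in one batch; emptying the
// queue on flush releases whatever the closures captured.
class BatchUpdater<S> {
  private pending: Update<S>[] = [];
  constructor(private state: S) {}

  queue(update: Update<S>): void {
    this.pending.push(update);
  }

  flush(): S {
    for (const update of this.pending) {
      this.state = update(this.state);
    }
    this.pending.length = 0; // closures become collectable here
    return this.state;
  }
}
```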
Recommended Fixes
Fix #1: Close SSE Stream Properly ⭐ CRITICAL
Location: src/loop.ts after line 268 (after breaking from the event loop)
Why: Explicitly closing the AsyncGenerator allows garbage collection of accumulated event objects.
Fix #2: Delete Sessions After Each Iteration
Location: src/loop.ts around line 300 (in a finally block or after the iteration completes)
Why: Prevents session accumulation in the OpenCode server (this contributes to the 2.4 GB of server-side growth).
Fix #3: Add Memory Monitoring
Location: src/index.ts after the loop starts (around line 378)
```ts
import { startMemoryLogging } from "./util/log.js";

// After runLoop() is called
startMemoryLogging(30000); // Log memory every 30 seconds
```
Why: Provides visibility into memory growth patterns for debugging and verification. The logMemory() function already exists in src/util/log.ts:95-104 but isn't being called.
Fix #4: Force Garbage Collection Between Iterations (Optional)
Location: src/loop.ts after session cleanup, before starting the next iteration
```ts
// After session deletion, before next iteration
if (global.gc) {
  global.gc();
  log("loop", "Forced GC after iteration", { iteration });
}
```
Run ralph with: bun --expose-gc src/index.ts
Why: Proactively releases memory between iterations rather than waiting for automatic GC.
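For reference, a startMemoryLogging helper along these lines could be built on process.memoryUsage(). This is a hypothetical sketch; the real function lives in src/util/log.ts and may differ.

```typescript
// Hypothetical sketch (the real helper is in src/util/log.ts): periodically
// log resident and heap memory so growth trends are visible in the logs.
function startMemoryLogging(intervalMs: number): () => void {
  const toMB = (bytes: number) => (bytes / 1024 / 1024).toFixed(1);
  const timer = setInterval(() => {
    const { rss, heapUsed, heapTotal } = process.memoryUsage();
    console.log(
      `[mem] rss=${toMB(rss)}MB heapUsed=${toMB(heapUsed)}MB heapTotal=${toMB(heapTotal)}MB`
    );
  }, intervalMs);
  return () => clearInterval(timer); // caller can stop logging on shutdown
}
```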
Verification Strategy
Confirm the Fix Works
Detailed Profiling (If Issues Persist)
Check if OpenCode supports event filtering by session
Investigate OpenCode's compaction feature (ToolStateCompleted has a time.compacted field)
Use Bun heap snapshots: take snapshots at the start, middle, and end of long runs to see exact allocation sites
Monitor the OpenCode server independently
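Bun exposes Bun.generateHeapSnapshot() for this kind of profiling; the wrapper below is an illustrative sketch (names and file layout are assumptions), guarded so it degrades to a no-op outside Bun.

```typescript
import { writeFileSync } from "node:fs";

// Sketch assuming Bun's Bun.generateHeapSnapshot(); returns the snapshot
// file path, or null when not running under Bun.
function snapshotHeap(label: string): string | null {
  const bun = (globalThis as any).Bun;
  if (typeof bun?.generateHeapSnapshot !== "function") return null;
  const path = `heap-${label}-${Date.now()}.json`;
  writeFileSync(path, JSON.stringify(bun.generateHeapSnapshot()));
  return path;
}
```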
Questions for Investigation
Does the OpenCode SDK's event subscription support filtering by sessionID to avoid receiving events from other sessions?
Can we subscribe to events per session rather than globally to avoid the accumulation?
Does OpenCode support automatic compaction of large tool outputs, and if so, how do we enable it?
Are there any known issues with AsyncGenerator memory retention in Bun's JavaScript engine?
Should ralph maintain its own cache of recent tool outputs, or rely entirely on fetching from OpenCode on-demand?
Impact Assessment
Severity: High - Causes process termination after ~2 hours in production environments
Affected Users: Anyone running ralph for extended periods (multi-hour sessions)
Workaround: Manually restart ralph every 1-2 hours, or increase system RAM (not sustainable)
Risk of Regression: Medium - Fix #1 (stream closure) is a simple addition, but need to verify AsyncGenerator cleanup behavior
Timeline
Additional Context
System Specifications
Ralph Configuration
opencode/claude-opus-4-5
plan.md