diff --git a/lldb/agent-harness/HARNESS.md b/lldb/agent-harness/HARNESS.md index ee36a697a..fdd21a539 100644 --- a/lldb/agent-harness/HARNESS.md +++ b/lldb/agent-harness/HARNESS.md @@ -2,13 +2,16 @@ ## Overview -This harness wraps the **LLDB Python API** into a Click-based CLI tool: -`cli-anything-lldb`. +This harness wraps the **LLDB Python API** into a Click-based CLI tool and +debug adapter: +- `cli-anything-lldb` for JSON CLI / REPL workflows +- `cli-anything-lldb-dap` for stdio Debug Adapter Protocol clients It provides stateful debugging workflows for agent and script usage, with: - direct `import lldb` integration - structured dict outputs for JSON mode - interactive REPL with persistent debug session +- a formal single-session DAP server for AI/editor debugging ## Architecture @@ -20,6 +23,7 @@ agent-harness/ └── cli_anything/ └── lldb/ ├── lldb_cli.py + ├── dap.py ├── core/ │ ├── session.py │ ├── breakpoints.py @@ -38,12 +42,13 @@ agent-harness/ - `--json`: machine-readable output - `--debug`: include traceback in errors +- `--session-file`: explicit persistent CLI session state path - `--version`: show package version ## Command Groups - `target`: create/show target -- `process`: launch/attach/continue/detach/info +- `process`: launch/attach/continue/interrupt/detach/info - `breakpoint`: set/list/delete/enable/disable - `thread`: list/select/backtrace/info - `frame`: select/info/locals @@ -51,8 +56,39 @@ agent-harness/ - `expr`: evaluate expression - `memory`: read/find - `core`: load core dump +- `dap`: run stdio DAP server +- `session`: info/close persistent CLI session - `repl`: interactive mode (default) +## Debug Adapter Protocol + +`cli-anything-lldb-dap` is a stdio DAP server. It owns one in-process +`LLDBSession` and does not use the persistent CLI daemon. Stdout must contain +only DAP `Content-Length` frames; diagnostics go to stderr or `--log-file`. 
+ +Supported v1 requests: +- lifecycle: `initialize`, `launch`, `attach`, `configurationDone`, `disconnect` +- breakpoints: `setBreakpoints`, `setFunctionBreakpoints` +- inspection: `threads`, `stackTrace`, `scopes`, `variables`, `setVariable`, `evaluate`, `source`, `loadedSources`, `readMemory`, `modules`, `exceptionInfo`, `disassemble` +- execution: `continue`, `pause`, `next`, `stepIn`, `stepOut` + +DAP uses protocol-native pending breakpoint semantics: unresolved breakpoints +return `verified: false`, and later resolution is reported with breakpoint +events. +Variable references are adapter-local and reset on resume. This keeps stopped +frame state honest for AI agents and avoids reusing stale LLDB `SBValue` +objects after execution continues. + +Long-running GUI targets can provide DAP stop-rule profiles with +`cli-anything-lldb-dap --profile PATH`, `cli-anything-lldb dap --profile PATH`, +or launch/attach arguments such as `stopRuleProfile` and inline `stopRules`. +Rules match structured stop context (`reason`, `module`, `function`, `regex`) +and either classify the stop or auto-continue it. Stopped events expose +`body.cliAnythingStop.origin` so clients can distinguish manual pauses, +debugger-internal traps, and ordinary debuggee stops. Profiles are loaded by the +current adapter process only; running DAP sessions must restart and re-attach or +re-launch before new code/profile contents take effect. + ## Patterns 1. **Lazy import of LLDB**: @@ -61,10 +97,20 @@ agent-harness/ `LLDBSession` owns debugger/target/process lifecycle. 3. **Dict-first API**: Core methods return JSON-serializable dict/list structures. -4. **Dual output mode**: +4. **Honest breakpoint state**: + Breakpoint payloads include `resolved` and `location_details`; unresolved CLI + breakpoints fail unless `--allow-pending` is explicit. +5. **Dual output mode**: `_output()` chooses JSON or human-friendly formatting. -5. **Boundary errors**: +6. 
**Boundary errors**: Command layer converts exceptions into structured error payloads. +7. **Secure persistent daemon**: + CLI session auth state is written under a per-user directory with restrictive + permissions and RPC dispatch uses an explicit method allowlist. +8. **Structured stop classification**: + DAP stop handling uses profile-driven rules instead of ad hoc substring + checks, while preserving `autoContinueInternalBreakpoints` as a compatibility + shortcut for common NVIDIA/Windows internal traps. ## Dependency Model diff --git a/lldb/agent-harness/LLDB.md b/lldb/agent-harness/LLDB.md index 41c3dcc71..1ffd6b329 100644 --- a/lldb/agent-harness/LLDB.md +++ b/lldb/agent-harness/LLDB.md @@ -29,6 +29,7 @@ This is implemented in `utils/lldb_backend.py`. All core operations return plain dictionaries: - process info (`pid`, `state`, `num_threads`) +- stop info (`reason`, `description`, `module`, `function`, `frame`) - frame info (`function`, `file`, `line`, `address`) - breakpoints (`id`, `locations`, `condition`) - expression result (`type`, `value`, `summary`, `error`) diff --git a/lldb/agent-harness/cli_anything/lldb/README.md b/lldb/agent-harness/cli_anything/lldb/README.md index 2c93d8894..e97fbe06d 100644 --- a/lldb/agent-harness/cli_anything/lldb/README.md +++ b/lldb/agent-harness/cli_anything/lldb/README.md @@ -2,6 +2,11 @@ Command-line interface for LLDB debugger using LLDB Python API. 
+The package exposes two agent-facing entry points: + +- `cli-anything-lldb`: JSON CLI / REPL workflows with a persistent session daemon +- `cli-anything-lldb-dap`: stdio Debug Adapter Protocol server for editor-style and AI debug clients + ## Installation ```bash @@ -42,11 +47,18 @@ cli-anything-lldb --json target create --exe /path/to/executable # Launch process cli-anything-lldb --json process launch --arg foo --arg bar +# Stop at process entry before user code +cli-anything-lldb --json process launch --stop-at-entry + # Set breakpoint by function cli-anything-lldb --json breakpoint set --function main +# Pending breakpoints are explicit +cli-anything-lldb --json breakpoint set --function PluginEntry --allow-pending + # Continue and inspect cli-anything-lldb --json process continue +cli-anything-lldb --json process interrupt cli-anything-lldb --json thread backtrace cli-anything-lldb --json frame locals @@ -63,7 +75,96 @@ cli-anything-lldb Non-REPL commands share a persistent LLDB session automatically, so commands such as `target create`, `breakpoint set`, `process launch`, and follow-up inspection commands can run as separate CLI invocations against the same live -debugger state. +debugger state. The default session state file lives in a per-user application +directory, not the global temp directory. Use `--session-file` or +`CLI_ANYTHING_LLDB_SESSION_FILE` when an agent needs an explicit session path, +and run `session close` when finished. + +By default, `breakpoint set` fails if LLDB creates a pending breakpoint with no +resolved locations. Use `--allow-pending` only when the target or symbols are +expected to load later. Breakpoint payloads include `resolved` and +`location_details` so agents can tell whether a stop is actually reachable. 
+ +## Debug Adapter Protocol + +Run the formal stdio DAP server with: + +```bash +cli-anything-lldb-dap +cli-anything-lldb-dap --profile /path/to/stop-rules.json +``` + +or through the CLI convenience command: + +```bash +cli-anything-lldb dap +cli-anything-lldb dap --profile /path/to/stop-rules.json +``` + +The DAP server owns one in-process `LLDBSession` and writes only DAP frames to +stdout. Debuggee stdout/stderr is suppressed during DAP launches so protocol +messages are not corrupted. + +Supported requests include: + +- `initialize`, `launch`, `attach`, `configurationDone`, `disconnect` +- `setBreakpoints`, `setFunctionBreakpoints` +- `threads`, `stackTrace`, `scopes`, `variables`, `setVariable`, `evaluate` +- `continue`, `pause`, `next`, `stepIn`, `stepOut` +- `source`, `loadedSources`, `readMemory`, `modules`, `exceptionInfo`, `disassemble` + +DAP launch-time unresolved breakpoints are returned as `verified: false` and +updated with breakpoint events after launch if LLDB resolves them. +Variables support expandable child references for structs/classes/arrays, and +`setVariable` can update stopped-frame locals or child values when LLDB allows +the assignment. + +For long-running GUI targets, DAP `continue` responds before the blocking LLDB +`SBProcess.Continue()` call completes, then waits on a background thread for the +next stop. DAP `pause` uses `SBProcess.SendAsyncInterrupt()` so the adapter stays +responsive while the debuggee is running. If `setBreakpoints` or +`setFunctionBreakpoints` arrives during an active continue, the adapter first +requests an async interrupt, waits for the continue thread to observe a stopped +state, and only then mutates LLDB breakpoints. If the process does not stop in +time, the request fails clearly instead of hanging the DAP loop. 
+ +`launch` and `attach` accept non-standard stop-rule controls for noisy GUI +debuggees: + +- `autoContinueInternalBreakpoints`: compatibility boolean that enables built-in + rules for NVIDIA `__jit_debug_register_code` / `jit-debug-register` and + Windows `Exception 0x80000003` at ``ntdll.dll`DbgBreakPoint``. +- `stopRules`: inline structured rules with optional `name`, `action` + (`stop` or `continue`), `origin`, `reason`, `module`, `function`, and `regex`. + Each rule must include at least one matcher, so a profile cannot accidentally + classify every stop. +- `stopRuleProfile` / `stopProfile` / `profile`: external JSON profile path + loaded for that launch/attach request. + +The DAP process also accepts `--profile` to load a base profile at adapter +startup. Profiles are JSON objects such as: + +```json +{ + "autoContinueInternalBreakpoints": true, + "stopRules": [ + { + "name": "c4d-nvidia-jit", + "action": "continue", + "origin": "internalTrap", + "module": "nvgpucomp64.dll", + "function": "__jit_debug_register_code" + } + ] +} +``` + +Every DAP `stopped` event includes `body.cliAnythingStop` with +`origin` (`manualPause`, `internalTrap`, or `debuggee`), LLDB stop reason, +module/function/frame metadata, and the matched rule when applicable. Running +`cli-anything-lldb-dap` processes do not hot-load code or profile changes; +restart the adapter and re-attach/re-launch the target for new rules to take +effect. The persistent session daemon now speaks a localhost JSON socket protocol and stores its session token in an owner-scoped state file. `memory find` scans in @@ -72,7 +173,7 @@ stores its session token in an owner-scoped state file. 
`memory find` scans in ## Command Groups - `target`: `create`, `info` -- `process`: `launch`, `attach`, `continue`, `detach`, `info` +- `process`: `launch`, `attach`, `continue`, `interrupt`, `detach`, `info` - `breakpoint`: `set`, `list`, `delete`, `enable`, `disable` - `thread`: `list`, `select`, `backtrace`, `info` - `frame`: `select`, `info`, `locals` @@ -81,6 +182,7 @@ stores its session token in an owner-scoped state file. `memory find` scans in - `memory`: `read`, `find` - `core`: `load` - `session`: `info`, `close` +- `dap` - `repl` ## JSON Output @@ -97,6 +199,7 @@ cli-anything-lldb --json process info cd lldb/agent-harness pytest cli_anything/lldb/tests/test_core.py -v pytest cli_anything/lldb/tests/test_full_e2e.py -v +pytest cli_anything/lldb/tests -q ``` E2E tests require: diff --git a/lldb/agent-harness/cli_anything/lldb/__init__.py b/lldb/agent-harness/cli_anything/lldb/__init__.py index ebd6bf23f..b2e426b50 100644 --- a/lldb/agent-harness/cli_anything/lldb/__init__.py +++ b/lldb/agent-harness/cli_anything/lldb/__init__.py @@ -1,6 +1,6 @@ """LLDB CLI harness - command-line interface for LLDB debugger.""" -__version__ = "0.1.0" +__version__ = "1.0.0" # The ``lldb`` module is loaded lazily by backend/core code when needed. # Install LLDB and ensure its Python bindings are discoverable. 
diff --git a/lldb/agent-harness/cli_anything/lldb/core/breakpoints.py b/lldb/agent-harness/cli_anything/lldb/core/breakpoints.py index 935938470..b4aba468a 100644 --- a/lldb/agent-harness/cli_anything/lldb/core/breakpoints.py +++ b/lldb/agent-harness/cli_anything/lldb/core/breakpoints.py @@ -7,8 +7,21 @@ from typing import Any, Dict, Optional -def set_breakpoint(session, file: Optional[str] = None, line: Optional[int] = None, function: Optional[str] = None, condition: Optional[str] = None) -> Dict[str, Any]: - return session.breakpoint_set(file=file, line=line, function=function, condition=condition) +def set_breakpoint( + session, + file: Optional[str] = None, + line: Optional[int] = None, + function: Optional[str] = None, + condition: Optional[str] = None, + allow_pending: bool = False, +) -> Dict[str, Any]: + return session.breakpoint_set( + file=file, + line=line, + function=function, + condition=condition, + allow_pending=allow_pending, + ) def list_breakpoints(session) -> Dict[str, Any]: diff --git a/lldb/agent-harness/cli_anything/lldb/core/session.py b/lldb/agent-harness/cli_anything/lldb/core/session.py index e6c304f82..f0619353c 100644 --- a/lldb/agent-harness/cli_anything/lldb/core/session.py +++ b/lldb/agent-harness/cli_anything/lldb/core/session.py @@ -90,9 +90,23 @@ def launch( args: Optional[List[str]] = None, env: Optional[List[str]] = None, working_dir: Optional[str] = None, + stop_at_entry: bool = False, + suppress_stdio: bool = False, ) -> Dict[str, Any]: self._require_target() - self.process = self.target.LaunchSimple(args, env, working_dir or os.getcwd()) + error = self._lldb.SBError() + launch_info = self._lldb.SBLaunchInfo(args or []) + launch_info.SetWorkingDirectory(working_dir or os.getcwd()) + if env: + launch_info.SetEnvironmentEntries(env, True) + if stop_at_entry: + launch_info.SetLaunchFlags(self._lldb.eLaunchFlagStopAtEntry) + if suppress_stdio: + launch_info.AddSuppressFileAction(1, False, True) + 
launch_info.AddSuppressFileAction(2, False, True) + self.process = self.target.Launch(launch_info, error) + if not error.Success(): + raise RuntimeError(f"Launch failed: {error}") if not self.process or not self.process.IsValid(): raise RuntimeError("Launch failed") self._process_origin = "launched" @@ -113,6 +127,7 @@ def breakpoint_set( line: Optional[int] = None, function: Optional[str] = None, condition: Optional[str] = None, + allow_pending: bool = False, ) -> Dict[str, Any]: self._require_target() if function: @@ -125,26 +140,22 @@ def breakpoint_set( raise RuntimeError("Failed to create breakpoint") if condition: bp.SetCondition(condition) - return { - "id": bp.GetID(), - "locations": bp.GetNumLocations(), - "condition": condition, - } + details = self._breakpoint_payload(bp) + if not details["resolved"] and not allow_pending: + bp_id = bp.GetID() + self.target.BreakpointDelete(bp_id) + raise RuntimeError( + "Breakpoint is unresolved. Pass allow_pending=True or use " + "the CLI --allow-pending flag if a pending breakpoint is intended." 
+ ) + return details def breakpoint_list(self) -> Dict[str, Any]: self._require_target() bps = [] for i in range(self.target.GetNumBreakpoints()): bp = self.target.GetBreakpointAtIndex(i) - bps.append( - { - "id": bp.GetID(), - "hits": bp.GetHitCount(), - "locations": bp.GetNumLocations(), - "enabled": bp.IsEnabled(), - "condition": bp.GetCondition() or None, - } - ) + bps.append(self._breakpoint_payload(bp)) return {"breakpoints": bps} def breakpoint_delete(self, bp_id: int) -> Dict[str, Any]: @@ -176,12 +187,25 @@ def step_out(self) -> Dict[str, Any]: def continue_exec(self) -> Dict[str, Any]: self._require_process() - self.process.Continue() - return self.process_info() + error = self.process.Continue() + if error is not None and not error.Success(): + raise RuntimeError(f"Continue failed: {error}") + return self._process_info() - def process_info(self) -> Dict[str, Any]: + def interrupt(self) -> Dict[str, Any]: + self._require_process() + error = self.process.Stop() + if error is not None and not error.Success(): + raise RuntimeError(f"Interrupt failed: {error}") return self._process_info() + def interrupt_async(self) -> Dict[str, Any]: + self._require_process() + error = self.process.SendAsyncInterrupt() + if error is not None and not error.Success(): + raise RuntimeError(f"Async interrupt failed: {error}") + return {"status": "interrupt_requested"} + def backtrace(self, limit: int = 50) -> Dict[str, Any]: thread = self._current_thread() frames = [] @@ -197,7 +221,7 @@ def backtrace(self, limit: int = 50) -> Dict[str, Any]: "address": hex(f.GetPC()), } ) - return {"thread_id": thread.GetThreadID(), "frames": frames} + return {"thread_id": thread.GetThreadID(), "frames": frames, "total_frames": thread.GetNumFrames()} def locals(self) -> Dict[str, Any]: frame = self._current_frame() @@ -210,10 +234,40 @@ def locals(self) -> Dict[str, Any]: "type": v.GetTypeName(), "value": v.GetValue(), "summary": v.GetSummary(), + "num_children": v.GetNumChildren(), } ) return 
{"variables": result} + def local_values(self): + """Return raw SBValue locals for in-process adapters such as DAP.""" + frame = self._current_frame() + variables = frame.GetVariables(True, True, False, True) + return [variables.GetValueAtIndex(i) for i in range(variables.GetSize())] + + def set_local_variable(self, thread_id: int, frame_index: int, name: str, value: str): + self.thread_select(thread_id) + self.frame_select(frame_index) + frame = self._current_frame() + variable = frame.FindVariable(name) + if not variable or not variable.IsValid(): + raise RuntimeError(f"Variable not found: {name}") + self._set_value(variable, value) + return variable + + def set_child_value(self, parent, name: str, value: str): + child = parent.GetChildMemberWithName(name) + if not child or not child.IsValid(): + for index in range(parent.GetNumChildren()): + candidate = parent.GetChildAtIndex(index) + if candidate.GetName() == name: + child = candidate + break + if not child or not child.IsValid(): + raise RuntimeError(f"Child variable not found: {name}") + self._set_value(child, value) + return child + def evaluate(self, expr: str) -> Dict[str, Any]: frame = self._current_frame() val = frame.EvaluateExpression(expr) @@ -339,6 +393,63 @@ def find_memory( "max_scan_size": max_scan_size, } + def disassemble(self, address: int, count: int = 8) -> Dict[str, Any]: + self._require_target() + sb_address = self.target.ResolveLoadAddress(address) + if not sb_address or not sb_address.IsValid(): + raise RuntimeError(f"Could not resolve address: {hex(address)}") + instructions = self.target.ReadInstructions(sb_address, max(1, count)) + result = [] + for i in range(instructions.GetSize()): + inst = instructions.GetInstructionAtIndex(i) + stream = self._lldb.SBStream() + inst.GetDescription(stream) + inst_address = inst.GetAddress().GetLoadAddress(self.target) + result.append( + { + "address": hex(inst_address), + "instruction": stream.GetData().strip(), + } + ) + return {"instructions": 
result} + + def loaded_sources(self) -> Dict[str, Any]: + self._require_target() + seen = set() + sources = [] + for module_index in range(self.target.GetNumModules()): + module = self.target.GetModuleAtIndex(module_index) + for unit_index in range(module.GetNumCompileUnits()): + unit = module.GetCompileUnitAtIndex(unit_index) + file_spec = unit.GetFileSpec() + path = self._filespec_path(file_spec) + if not path or path in seen: + continue + seen.add(path) + sources.append({"name": os.path.basename(path), "path": path}) + return {"sources": sources} + + def modules(self) -> Dict[str, Any]: + self._require_target() + modules = [] + for index in range(self.target.GetNumModules()): + module = self.target.GetModuleAtIndex(index) + file_spec = module.GetFileSpec() + path = self._filespec_path(file_spec) + header_addr = module.GetObjectFileHeaderAddress() + load_addr = header_addr.GetLoadAddress(self.target) if header_addr and header_addr.IsValid() else None + modules.append( + { + "id": index + 1, + "name": os.path.basename(path) if path else str(file_spec), + "path": path, + "symbol_status": "loaded" if module.GetNumCompileUnits() > 0 else "unknown", + "address": hex(load_addr) if load_addr and load_addr != self._lldb.LLDB_INVALID_ADDRESS else None, + "version": module.GetVersion(), + } + ) + return {"modules": modules} + def load_core(self, core_path: str) -> Dict[str, Any]: self._require_target() self.process = self.target.LoadCore(core_path) @@ -376,6 +487,9 @@ def session_status(self) -> Dict[str, Any]: "process_origin": self._process_origin if has_process else None, } + def process_info(self) -> Dict[str, Any]: + return self._process_info() + def _require_target(self): if self.target is None or not self.target.IsValid(): raise RuntimeError("No target. 
Create target first.") @@ -409,10 +523,15 @@ def _current_frame(self): def _process_info(self) -> Dict[str, Any]: self._require_process() state = self.process.GetState() + selected = self.process.GetSelectedThread() + selected_thread_id = selected.GetThreadID() if selected and selected.IsValid() else None return { "pid": self.process.GetProcessID(), "state": self._STATE_NAMES.get(state, str(state)), "num_threads": self.process.GetNumThreads(), + "selected_thread_id": selected_thread_id, + "stop": self._stop_info(selected) if selected_thread_id is not None else None, + "exit_status": self.process.GetExitStatus(), } def _frame_info(self) -> Dict[str, Any]: @@ -424,3 +543,120 @@ def _frame_info(self) -> Dict[str, Any]: "line": line_entry.GetLine() if line_entry.IsValid() else None, "address": hex(f.GetPC()), } + + def _breakpoint_payload(self, bp) -> Dict[str, Any]: + locations = self._breakpoint_locations(bp) + return { + "id": bp.GetID(), + "hits": bp.GetHitCount(), + "locations": len(locations), + "resolved": len(locations) > 0, + "location_details": locations, + "enabled": bp.IsEnabled(), + "condition": bp.GetCondition() or None, + } + + def _breakpoint_locations(self, bp) -> List[Dict[str, Any]]: + result = [] + for i in range(bp.GetNumLocations()): + loc = bp.GetLocationAtIndex(i) + address = loc.GetAddress() + line_entry = address.GetLineEntry() + load_addr = address.GetLoadAddress(self.target) + function = address.GetFunction() + result.append( + { + "id": loc.GetID(), + "address": hex(load_addr) if load_addr != self._lldb.LLDB_INVALID_ADDRESS else None, + "file": str(line_entry.GetFileSpec()) if line_entry.IsValid() else None, + "line": line_entry.GetLine() if line_entry.IsValid() else None, + "column": line_entry.GetColumn() if line_entry.IsValid() else None, + "function": function.GetName() if function and function.IsValid() else None, + "enabled": loc.IsEnabled(), + "hit_count": loc.GetHitCount(), + } + ) + return result + + def _stop_info(self, thread) -> 
Dict[str, Any]: + if thread is None or not thread.IsValid(): + return {"reason": None, "description": None, "hit_breakpoint_ids": [], "frame": None} + + reason = thread.GetStopReason() + reason_name = self._stop_reason_name(reason) + frame = self._thread_frame_summary(thread) + return { + "reason": reason_name, + "description": self._thread_stop_description(thread), + "hit_breakpoint_ids": self._hit_breakpoint_ids(thread) if reason_name == "breakpoint" else [], + "frame": frame, + "module": frame.get("module") if frame else None, + "function": frame.get("function") if frame else None, + } + + def _stop_reason_name(self, reason: int) -> str | None: + lldb = self._lldb + mapping = { + getattr(lldb, "eStopReasonBreakpoint", object()): "breakpoint", + getattr(lldb, "eStopReasonWatchpoint", object()): "watchpoint", + getattr(lldb, "eStopReasonSignal", object()): "signal", + getattr(lldb, "eStopReasonException", object()): "exception", + getattr(lldb, "eStopReasonTrace", object()): "step", + getattr(lldb, "eStopReasonPlanComplete", object()): "step", + getattr(lldb, "eStopReasonExec", object()): "entry", + getattr(lldb, "eStopReasonThreadExiting", object()): "thread-exiting", + getattr(lldb, "eStopReasonNone", object()): None, + getattr(lldb, "eStopReasonInvalid", object()): None, + } + return mapping.get(reason, str(reason)) + + def _thread_stop_description(self, thread) -> str | None: + stream = self._lldb.SBStream() + thread.GetStatus(stream) + text = stream.GetData().strip() + return text or None + + def _thread_frame_summary(self, thread) -> Dict[str, Any] | None: + frame = thread.GetSelectedFrame() + if not frame or not frame.IsValid(): + if thread.GetNumFrames() <= 0: + return None + frame = thread.GetFrameAtIndex(0) + line_entry = frame.GetLineEntry() + module = frame.GetModule() + module_path = self._filespec_path(module.GetFileSpec()) if module and module.IsValid() else None + return { + "function": frame.GetFunctionName(), + "module": 
os.path.basename(module_path) if module_path else None, + "module_path": module_path, + "file": str(line_entry.GetFileSpec()) if line_entry.IsValid() else None, + "line": line_entry.GetLine() if line_entry.IsValid() else None, + "address": hex(frame.GetPC()), + } + + def _hit_breakpoint_ids(self, thread) -> List[int]: + ids = [] + data_count = thread.GetStopReasonDataCount() + for index in range(0, data_count, 2): + bp_id = thread.GetStopReasonDataAtIndex(index) + if bp_id: + ids.append(int(bp_id)) + return ids + + def _filespec_path(self, file_spec) -> str | None: + if not file_spec or not file_spec.IsValid(): + return None + directory = file_spec.GetDirectory() + filename = file_spec.GetFilename() + if directory and filename: + return os.path.normpath(os.path.join(directory, filename)) + if filename: + return os.path.normpath(filename) + text = str(file_spec) + return os.path.normpath(text) if text else None + + def _set_value(self, variable, value: str): + error = self._lldb.SBError() + ok = variable.SetValueFromCString(value, error) + if not ok or not error.Success(): + raise RuntimeError(f"Set variable failed: {error}") diff --git a/lldb/agent-harness/cli_anything/lldb/dap.py b/lldb/agent-harness/cli_anything/lldb/dap.py new file mode 100644 index 000000000..93e312682 --- /dev/null +++ b/lldb/agent-harness/cli_anything/lldb/dap.py @@ -0,0 +1,992 @@ +"""Minimal Debug Adapter Protocol server backed by LLDBSession.""" + +from __future__ import annotations + +import argparse +import base64 +import json +import os +import re +import shlex +import sys +import threading +from dataclasses import dataclass +from pathlib import Path +from typing import Any, BinaryIO, Callable + +from cli_anything.lldb.core.session import LLDBSession + + +class DAPProtocolError(RuntimeError): + """Raised when a DAP frame cannot be parsed.""" + + +def encode_message(payload: dict[str, Any]) -> bytes: + body = json.dumps(payload, separators=(",", ":"), default=str).encode("utf-8") + header 
= f"Content-Length: {len(body)}\r\n\r\n".encode("ascii") + return header + body + + +def read_message(stream: BinaryIO) -> dict[str, Any] | None: + content_length: int | None = None + saw_header = False + + while True: + line = stream.readline() + if line == b"": + return None if not saw_header else _raise_protocol_error("Unexpected EOF in DAP header") + saw_header = True + stripped = line.strip() + if not stripped: + break + name, sep, value = stripped.partition(b":") + if not sep: + raise DAPProtocolError(f"Malformed DAP header: {stripped!r}") + if name.lower() == b"content-length": + try: + content_length = int(value.strip()) + except ValueError as exc: + raise DAPProtocolError(f"Invalid Content-Length: {value!r}") from exc + + if content_length is None: + raise DAPProtocolError("Missing Content-Length header") + + body = stream.read(content_length) + if len(body) != content_length: + raise DAPProtocolError("Unexpected EOF in DAP body") + try: + payload = json.loads(body.decode("utf-8")) + except json.JSONDecodeError as exc: + raise DAPProtocolError(f"Invalid DAP JSON: {exc}") from exc + if not isinstance(payload, dict): + raise DAPProtocolError("DAP payload must be a JSON object") + return payload + + +def _raise_protocol_error(message: str): + raise DAPProtocolError(message) + + +@dataclass(frozen=True) +class StopRule: + """Structured rule used to classify or auto-continue debugger stops.""" + + name: str + action: str = "stop" + origin: str = "internalTrap" + reason: str | None = None + module: str | None = None + function: str | None = None + regex: str | None = None + source: str | None = None + + @classmethod + def from_mapping(cls, raw: dict[str, Any], *, source: str) -> "StopRule": + if not isinstance(raw, dict): + raise RuntimeError("stopRules entries must be objects") + name = str(raw.get("name") or raw.get("id") or "unnamed-stop-rule") + action = str(raw.get("action") or "stop") + if action not in {"stop", "continue"}: + raise 
RuntimeError(f"Unsupported stop rule action for {name}: {action}") + regex = raw.get("regex") + if regex is not None: + try: + re.compile(str(regex)) + except re.error as exc: + raise RuntimeError(f"Invalid stop rule regex for {name}: {exc}") from exc + if not any(raw.get(key) is not None for key in ("reason", "module", "function", "regex")): + raise RuntimeError(f"Stop rule {name} must include reason, module, function, or regex") + return cls( + name=name, + action=action, + origin=str(raw.get("origin") or "internalTrap"), + reason=str(raw["reason"]) if raw.get("reason") is not None else None, + module=str(raw["module"]) if raw.get("module") is not None else None, + function=str(raw["function"]) if raw.get("function") is not None else None, + regex=str(regex) if regex is not None else None, + source=source, + ) + + def matches(self, stop_context: dict[str, Any]) -> bool: + if self.reason and not _stop_field_matches(self.reason, [stop_context.get("reason"), stop_context.get("lldbReason")]): + return False + if self.module and not _stop_field_matches( + self.module, + [stop_context.get("module"), stop_context.get("modulePath")], + allow_basename=True, + ): + return False + if self.function and not _stop_field_matches( + self.function, + [stop_context.get("function")], + allow_symbol_suffix=True, + ): + return False + if self.regex and not re.search(self.regex, _stop_context_text(stop_context), re.IGNORECASE): + return False + return True + + def to_dap(self) -> dict[str, Any]: + return { + "name": self.name, + "action": self.action, + "origin": self.origin, + "source": self.source, + } + + +def _stop_field_matches( + expected: str, + values: list[Any], + *, + allow_basename: bool = False, + allow_symbol_suffix: bool = False, +) -> bool: + expected_norm = expected.casefold() + for value in values: + if value is None: + continue + text = str(value) + candidates = [text.casefold()] + if allow_basename: + candidates.append(Path(text).name.casefold()) + for candidate in 
candidates: + if candidate == expected_norm: + return True + if allow_symbol_suffix and ( + candidate.endswith(f"::{expected_norm}") or candidate.endswith(f"`{expected_norm}") + ): + return True + return False + + +def _stop_context_text(stop_context: dict[str, Any]) -> str: + fields = [ + stop_context.get("reason"), + stop_context.get("lldbReason"), + stop_context.get("description"), + stop_context.get("module"), + stop_context.get("modulePath"), + stop_context.get("function"), + ] + frame = stop_context.get("frame") + if isinstance(frame, dict): + fields.extend(frame.get(key) for key in ("module", "module_path", "function", "file", "address")) + return "\n".join(str(field) for field in fields if field) + + +class LLDBDebugAdapter: + """Single-session stdio DAP adapter for LLDB.""" + + def __init__( + self, + session_factory: Callable[[], LLDBSession] = LLDBSession, + log_file: str | None = None, + profile_file: str | None = None, + ): + self._session_factory = session_factory + self._session: LLDBSession | None = None + self._out: BinaryIO | None = None + self._seq = 1 + self._pending_launch: dict[str, Any] | None = None + self._pending_attach: dict[str, Any] | None = None + self._source_breakpoints: dict[str, list[int]] = {} + self._function_breakpoints: list[int] = [] + self._frame_refs: dict[int, tuple[int, int]] = {} + self._variable_refs: dict[int, dict[str, Any]] = {} + self._next_ref = 1 + self._log_file = Path(log_file).expanduser() if log_file else None + self._protocol_lock = threading.Lock() + self._lldb_api_lock = threading.RLock() + self._continue_state = threading.Condition() + self._continue_active = False + self._auto_continue_internal_breakpoints = False + self._base_auto_continue_internal_breakpoints = False + self._base_stop_rules: list[StopRule] = [] + self._active_stop_rules: list[StopRule] = [] + self._pause_requested = False + self._mutation_stop_timeout = 10.0 + if profile_file: + self._base_stop_rules, 
self._base_auto_continue_internal_breakpoints = self._load_stop_profile_file( + profile_file + ) + self._active_stop_rules = list(self._base_stop_rules) + + def run(self, instream: BinaryIO | None = None, outstream: BinaryIO | None = None) -> int: + instream = instream or sys.stdin.buffer + outstream = outstream or sys.stdout.buffer + self._out = outstream + try: + while True: + try: + message = read_message(instream) + except DAPProtocolError as exc: + self._log(f"DAP protocol error: {exc}") + return 1 + if message is None: + return 0 + self.handle_message(message) + finally: + self._cleanup_session() + + def handle_message(self, message: dict[str, Any]): + if message.get("type") != "request": + return + + request_seq = int(message.get("seq", 0)) + command = str(message.get("command") or "") + args = message.get("arguments") or {} + handler = getattr(self, f"_handle_{command}", None) + if handler is None: + self._send_response(request_seq, command, success=False, message=f"Unsupported request: {command}") + return + + try: + body, post_send = handler(args) + except Exception as exc: + self._log(f"{command} failed: {exc}") + self._send_response(request_seq, command, success=False, message=str(exc)) + return + + self._send_response(request_seq, command, body=body) + if post_send: + try: + post_send() + except Exception as exc: + self._log(f"{command} post-response failed: {exc}") + self._send_event( + "output", + {"category": "stderr", "output": f"{command} failed after response: {exc}\n"}, + ) + self._send_event("terminated") + + def _handle_initialize(self, _args: dict[str, Any]): + capabilities = { + "supportsConfigurationDoneRequest": True, + "supportsFunctionBreakpoints": True, + "supportsEvaluateForHovers": True, + "supportsDisassembleRequest": True, + "supportsLoadedSourcesRequest": True, + "supportsReadMemoryRequest": True, + "supportsSetVariable": True, + "supportsModulesRequest": True, + "supportsExceptionInfoRequest": True, + 
"supportsSteppingGranularity": False, + "supportTerminateDebuggee": True, + } + return capabilities, lambda: self._send_event("initialized") + + def _handle_launch(self, args: dict[str, Any]): + program = args.get("program") or args.get("executable") + if not program: + raise RuntimeError("launch requires 'program'") + self._ensure_session().target_create(str(program), arch=args.get("arch")) + self._pending_launch = { + "args": self._coerce_args(args.get("args")), + "env": self._coerce_env(args.get("env")), + "working_dir": args.get("cwd") or args.get("workingDirectory"), + "stop_at_entry": bool(args.get("stopOnEntry", False)), + "suppress_stdio": True, + } + self._configure_stop_rules(args) + self._pending_attach = None + return {}, None + + def _handle_attach(self, args: dict[str, Any]): + program = args.get("program") or args.get("executable") + if not program: + raise RuntimeError("attach requires 'program' or 'executable' so LLDB can create a target") + self._ensure_session().target_create(str(program), arch=args.get("arch")) + pid = args.get("pid", args.get("processId")) + name = args.get("name", args.get("processName")) + if pid is None and not name: + raise RuntimeError("attach requires pid/processId or name/processName") + self._pending_attach = { + "pid": int(pid) if pid is not None else None, + "name": str(name) if name else None, + "wait_for": bool(args.get("waitFor", False)), + } + self._configure_stop_rules(args) + self._pending_launch = None + return {}, None + + def _handle_configurationDone(self, _args: dict[str, Any]): + def post_send(): + default_reason = None + if self._pending_launch is not None: + launch_args = self._pending_launch + self._pending_launch = None + self._ensure_session().launch(**launch_args) + self._emit_breakpoint_updates() + default_reason = "entry" if launch_args.get("stop_at_entry") else "breakpoint" + elif self._pending_attach is not None: + attach_args = self._pending_attach + self._pending_attach = None + if 
attach_args["pid"] is not None: + self._ensure_session().attach_pid(attach_args["pid"]) + else: + self._ensure_session().attach_name(attach_args["name"], wait_for=attach_args["wait_for"]) + default_reason = "pause" + self._emit_execution_event(default_reason=default_reason) + + return {}, post_send + + def _handle_disconnect(self, args: dict[str, Any]): + terminate_debuggee = bool(args.get("terminateDebuggee", True)) + + def post_send(): + if self._session is not None: + if not terminate_debuggee and self._session.session_status().get("process_origin") == "launched": + try: + self._session.detach() + except Exception: + pass + self._session.destroy() + self._session = None + self._send_event("terminated") + + return {}, post_send + + def _handle_setBreakpoints(self, args: dict[str, Any]): + source = args.get("source") or {} + path = source.get("path") + if not path: + raise RuntimeError("setBreakpoints requires source.path") + + self._ensure_stopped_for_target_mutation("setBreakpoints") + session = self._ensure_session() + source_key = str(Path(path)) + with self._lldb_api_lock: + for bp_id in self._source_breakpoints.get(source_key, []): + try: + session.breakpoint_delete(bp_id) + except Exception: + pass + + dap_breakpoints = [] + created_ids = [] + for item in args.get("breakpoints") or []: + line = int(item.get("line")) + payload = session.breakpoint_set( + file=source_key, + line=line, + condition=item.get("condition"), + allow_pending=True, + ) + created_ids.append(payload["id"]) + dap_breakpoints.append(self._to_dap_breakpoint(payload, source_key, requested_line=line)) + + self._source_breakpoints[source_key] = created_ids + return {"breakpoints": dap_breakpoints}, None + + def _handle_setFunctionBreakpoints(self, args: dict[str, Any]): + self._ensure_stopped_for_target_mutation("setFunctionBreakpoints") + session = self._ensure_session() + with self._lldb_api_lock: + for bp_id in self._function_breakpoints: + try: + session.breakpoint_delete(bp_id) + except 
Exception: + pass + + self._function_breakpoints = [] + result = [] + for item in args.get("breakpoints") or []: + name = item.get("name") + if not name: + continue + payload = session.breakpoint_set( + function=str(name), + condition=item.get("condition"), + allow_pending=True, + ) + self._function_breakpoints.append(payload["id"]) + result.append(self._to_dap_breakpoint(payload)) + return {"breakpoints": result}, None + + def _handle_threads(self, _args: dict[str, Any]): + threads = [] + for item in self._ensure_session().threads().get("threads", []): + name = item.get("name") or f"Thread {item.get('id')}" + threads.append({"id": item["id"], "name": name}) + return {"threads": threads}, None + + def _handle_stackTrace(self, args: dict[str, Any]): + thread_id = int(args.get("threadId")) + start = int(args.get("startFrame", 0)) + levels = int(args.get("levels", 50) or 50) + session = self._ensure_session() + session.thread_select(thread_id) + backtrace = session.backtrace(limit=start + levels) + frames = [] + for frame in backtrace.get("frames", [])[start : start + levels]: + frame_id = self._alloc_frame_ref(thread_id, int(frame["index"])) + source = self._source_from_path(frame.get("file")) + frames.append( + { + "id": frame_id, + "name": frame.get("function") or "", + "source": source, + "line": frame.get("line") or 0, + "column": 0, + "instructionPointerReference": frame.get("address"), + } + ) + return {"stackFrames": frames, "totalFrames": backtrace.get("total_frames", len(frames))}, None + + def _handle_scopes(self, args: dict[str, Any]): + frame_id = int(args.get("frameId")) + if frame_id not in self._frame_refs: + raise RuntimeError(f"Unknown frameId: {frame_id}") + ref = self._alloc_variable_ref({"kind": "locals", "frame_ref": frame_id}) + return {"scopes": [{"name": "Locals", "variablesReference": ref, "expensive": False}]}, None + + def _handle_variables(self, args: dict[str, Any]): + ref = int(args.get("variablesReference")) + entry = 
self._variable_refs.get(ref) + if not entry: + return {"variables": []}, None + if entry["kind"] != "locals": + if entry["kind"] == "children": + return {"variables": self._dap_variables_from_values(self._child_values(entry["value"]))}, None + return {"variables": []}, None + + thread_id, frame_index = self._frame_refs[entry["frame_ref"]] + session = self._ensure_session() + session.thread_select(thread_id) + session.frame_select(frame_index) + return {"variables": self._dap_variables_from_values(session.local_values())}, None + + def _handle_setVariable(self, args: dict[str, Any]): + ref = int(args.get("variablesReference")) + name = str(args.get("name") or "") + value = str(args.get("value") or "") + entry = self._variable_refs.get(ref) + if not entry: + raise RuntimeError(f"Unknown variablesReference: {ref}") + + if entry["kind"] == "locals": + thread_id, frame_index = self._frame_refs[entry["frame_ref"]] + updated = self._ensure_session().set_local_variable(thread_id, frame_index, name, value) + elif entry["kind"] == "children": + updated = self._ensure_session().set_child_value(entry["value"], name, value) + else: + raise RuntimeError(f"Cannot set variable for reference kind: {entry['kind']}") + + return self._dap_variable_from_value(updated), None + + def _handle_evaluate(self, args: dict[str, Any]): + expression = args.get("expression") + if not expression: + raise RuntimeError("evaluate requires expression") + frame_id = args.get("frameId") + if frame_id is not None and int(frame_id) in self._frame_refs: + thread_id, frame_index = self._frame_refs[int(frame_id)] + self._ensure_session().thread_select(thread_id) + self._ensure_session().frame_select(frame_index) + payload = self._ensure_session().evaluate(str(expression)) + if payload.get("error"): + raise RuntimeError(payload["error"]) + result = payload.get("value") or payload.get("summary") or "" + return {"result": result, "type": payload.get("type"), "variablesReference": 0}, None + + def 
_handle_continue(self, _args: dict[str, Any]): + def post_send(): + self._reset_refs_for_resume() + self._send_continued_event() + self._start_continue_thread( + name="cli-anything-lldb-dap-continue", + default_reason="breakpoint", + ) + + return {"allThreadsContinued": True}, post_send + + def _handle_pause(self, _args: dict[str, Any]): + def post_send(): + self._pause_requested = True + self._request_async_interrupt() + if not self._is_continue_active(): + with self._lldb_api_lock: + self._emit_execution_event(default_reason="pause") + + return {}, post_send + + def _handle_next(self, _args: dict[str, Any]): + return {}, self._step_post_send(self._ensure_session().step_over) + + def _handle_stepIn(self, _args: dict[str, Any]): + return {}, self._step_post_send(self._ensure_session().step_into) + + def _handle_stepOut(self, _args: dict[str, Any]): + return {}, self._step_post_send(self._ensure_session().step_out) + + def _handle_source(self, args: dict[str, Any]): + source = args.get("source") or {} + path = source.get("path") + if not path: + raise RuntimeError("source request requires source.path") + text = Path(path).read_text(encoding="utf-8", errors="replace") + return {"content": text, "mimeType": "text/plain"}, None + + def _handle_loadedSources(self, _args: dict[str, Any]): + sources = self._ensure_session().loaded_sources().get("sources", []) + return {"sources": [self._source_from_path(item.get("path")) for item in sources if item.get("path")]}, None + + def _handle_modules(self, _args: dict[str, Any]): + modules = [] + for item in self._ensure_session().modules().get("modules", []): + modules.append( + { + "id": item["id"], + "name": item.get("name") or "", + "path": item.get("path"), + "isOptimized": False, + "isUserCode": True, + "symbolStatus": item.get("symbol_status"), + "addressRange": item.get("address"), + "version": item.get("version"), + } + ) + return {"modules": modules}, None + + def _handle_exceptionInfo(self, _args: dict[str, Any]): + 
info = self._ensure_session().process_info() + stop = info.get("stop") or {} + reason = stop.get("reason") or "unknown" + description = stop.get("description") or reason + return { + "exceptionId": reason, + "breakMode": "always", + "description": description, + "details": {"message": description}, + }, None + + def _handle_readMemory(self, args: dict[str, Any]): + address = self._parse_address(str(args.get("memoryReference") or "0")) + address += int(args.get("offset", 0) or 0) + count = int(args.get("count", 0) or 0) + if count <= 0: + raise RuntimeError("readMemory requires a positive count") + payload = self._ensure_session().read_memory(address, count) + data = bytes.fromhex(payload["hex"]) + return { + "address": hex(address), + "data": base64.b64encode(data).decode("ascii"), + }, None + + def _handle_disassemble(self, args: dict[str, Any]): + address = self._parse_address(str(args.get("memoryReference") or "0")) + address += int(args.get("instructionOffset", 0) or 0) + count = int(args.get("instructionCount", 8) or 8) + payload = self._ensure_session().disassemble(address, count=count) + instructions = [ + {"address": item["address"], "instruction": item["instruction"]} + for item in payload.get("instructions", []) + ] + return {"instructions": instructions}, None + + def _step_post_send(self, step_fn: Callable[[], dict[str, Any]]): + def post_send(): + self._reset_refs_for_resume() + self._send_continued_event() + with self._lldb_api_lock: + step_fn() + self._emit_execution_event(default_reason="step") + + return post_send + + def _start_continue_thread(self, *, name: str, default_reason: str): + with self._continue_state: + if self._continue_active: + self._log("continue requested while a continue operation is already active") + return + self._continue_active = True + threading.Thread( + target=self._continue_until_stop, + kwargs={"default_reason": default_reason}, + name=name, + daemon=True, + ).start() + + def _continue_until_stop(self, *, 
default_reason: str): + try: + self._ensure_session().continue_exec() + except Exception as exc: + self._log(f"continue failed: {exc}") + self._send_event("output", {"category": "stderr", "output": f"continue failed: {exc}\n"}) + self._send_event("terminated") + return + finally: + self._mark_continue_inactive() + + with self._lldb_api_lock: + self._emit_breakpoint_updates() + self._emit_execution_event(default_reason=default_reason) + + def _mark_continue_inactive(self): + with self._continue_state: + self._continue_active = False + self._continue_state.notify_all() + + def _is_continue_active(self) -> bool: + with self._continue_state: + return self._continue_active + + def _ensure_stopped_for_target_mutation(self, operation: str): + if not self._is_continue_active(): + return + self._log(f"{operation}: interrupting running debuggee before target mutation") + self._request_async_interrupt() + with self._continue_state: + stopped = self._continue_state.wait_for( + lambda: not self._continue_active, + timeout=self._mutation_stop_timeout, + ) + if not stopped: + raise RuntimeError( + f"Timed out waiting for debuggee to stop before {operation}. " + "Send a pause request and retry after the stopped event." 
+ ) + + def _request_async_interrupt(self): + session = self._ensure_session() + interrupt = getattr(session, "interrupt_async", None) + if interrupt is not None: + return interrupt() + return session.interrupt() + + def _configure_stop_rules(self, args: dict[str, Any]): + rules = list(self._base_stop_rules) + auto_continue = self._base_auto_continue_internal_breakpoints or bool( + args.get("autoContinueInternalBreakpoints", False) + ) + profile_path = args.get("stopRuleProfile") or args.get("stopProfile") or args.get("profile") + if profile_path: + profile_rules, profile_auto_continue = self._load_stop_profile_file(str(profile_path)) + rules.extend(profile_rules) + auto_continue = auto_continue or profile_auto_continue + inline_rules = args.get("stopRules") + if inline_rules: + rules.extend(self._coerce_stop_rules(inline_rules, source="dap-arguments")) + if auto_continue: + rules.extend(self._builtin_internal_stop_rules()) + self._auto_continue_internal_breakpoints = auto_continue + self._active_stop_rules = rules + + def _load_stop_profile_file(self, profile_file: str) -> tuple[list[StopRule], bool]: + profile_path = Path(profile_file).expanduser().resolve() + try: + payload = json.loads(profile_path.read_text(encoding="utf-8")) + except OSError as exc: + raise RuntimeError(f"Failed to read stop rule profile {profile_path}: {exc}") from exc + except json.JSONDecodeError as exc: + raise RuntimeError(f"Invalid stop rule profile JSON {profile_path}: {exc}") from exc + + auto_continue = False + if isinstance(payload, list): + rules_payload = payload + elif isinstance(payload, dict): + auto_continue = bool(payload.get("autoContinueInternalBreakpoints", False)) + rules_payload = payload.get("stopRules", []) + else: + raise RuntimeError("Stop rule profile must be a JSON object or array") + return self._coerce_stop_rules(rules_payload, source=str(profile_path)), auto_continue + + def _coerce_stop_rules(self, raw_rules: Any, *, source: str) -> list[StopRule]: + if not 
isinstance(raw_rules, list): + raise RuntimeError("stopRules must be a list") + return [StopRule.from_mapping(raw_rule, source=source) for raw_rule in raw_rules] + + def _builtin_internal_stop_rules(self) -> list[StopRule]: + return [ + StopRule( + name="nvidia-shader-jit-debug-register", + action="continue", + origin="internalTrap", + reason="breakpoint", + regex=r"(__jit_debug_register_code|jit-debug-register)", + source="builtin:autoContinueInternalBreakpoints", + ), + StopRule( + name="windows-debugger-startup-breakpoint", + action="continue", + origin="internalTrap", + regex=r"(Exception 0x80000003|ntdll\.dll`DbgBreakPoint|DbgBreakPoint)", + source="builtin:autoContinueInternalBreakpoints", + ), + ] + + def _emit_execution_event(self, default_reason: str | None = None): + info = self._ensure_session().process_info() + state = info.get("state") + if state in {"running", "launching", "stepping"}: + self._send_continued_event(info.get("selected_thread_id")) + return + if state == "exited": + self._send_event("exited", {"exitCode": info.get("exit_status", 0) or 0}) + self._send_event("terminated") + return + if state == "detached": + self._send_event("terminated") + return + + stop = info.get("stop") or {} + lldb_reason = stop.get("reason") + reason = "entry" if default_reason == "entry" else (lldb_reason or default_reason or "pause") + if reason in {"signal", "crashed"}: + reason = "exception" + stop_origin = "debuggee" + if self._pause_requested: + self._pause_requested = False + reason = "pause" + stop_origin = "manualPause" + + frame = stop.get("frame") if isinstance(stop.get("frame"), dict) else {} + stop_context = { + "reason": reason, + "lldbReason": lldb_reason, + "description": stop.get("description"), + "module": stop.get("module") or frame.get("module"), + "modulePath": frame.get("module_path"), + "function": stop.get("function") or frame.get("function"), + "frame": frame, + } + matched_rule = None if stop_origin == "manualPause" else 
self._match_stop_rule(stop_context) + if matched_rule is not None: + stop_origin = matched_rule.origin + + body = { + "reason": reason, + "threadId": info.get("selected_thread_id"), + "allThreadsStopped": True, + "cliAnythingStop": { + "origin": stop_origin, + "lldbReason": lldb_reason, + "module": stop_context["module"], + "modulePath": stop_context["modulePath"], + "function": stop_context["function"], + "description": stop_context["description"], + }, + } + if frame: + body["cliAnythingStop"]["frame"] = frame + if matched_rule is not None: + body["cliAnythingStop"]["matchedRule"] = matched_rule.to_dap() + hit_ids = stop.get("hit_breakpoint_ids") or [] + if hit_ids: + body["hitBreakpointIds"] = hit_ids + if stop.get("description"): + body["description"] = stop["description"] + body["text"] = stop["description"] + if matched_rule is not None and matched_rule.action == "continue": + self._send_event( + "output", + { + "category": "console", + "output": ( + f"auto-continued stop rule {matched_rule.name}: " + f"{self._summarize_stop(body)}\n" + ), + }, + ) + self._send_continued_event(info.get("selected_thread_id")) + self._start_continue_thread( + name="cli-anything-lldb-dap-auto-continue", + default_reason=default_reason or "breakpoint", + ) + return + self._send_event("stopped", body) + + def _match_stop_rule(self, stop_context: dict[str, Any]) -> StopRule | None: + for rule in self._active_stop_rules: + if rule.matches(stop_context): + return rule + return None + + def _summarize_stop(self, body: dict[str, Any]) -> str: + text = str(body.get("description") or body.get("text") or body.get("reason") or "unknown") + return text.splitlines()[0] if text else "unknown" + + def _send_continued_event(self, thread_id: int | None = None): + body: dict[str, Any] = {"allThreadsContinued": True} + if thread_id is not None: + body["threadId"] = thread_id + self._send_event("continued", body) + + def _cleanup_session(self): + if self._session is not None: + try: + 
self._session.destroy() + finally: + self._session = None + + def _emit_breakpoint_updates(self): + for bp in self._ensure_session().breakpoint_list().get("breakpoints", []): + self._send_event( + "breakpoint", + {"reason": "changed", "breakpoint": self._to_dap_breakpoint(bp)}, + ) + + def _to_dap_breakpoint( + self, + payload: dict[str, Any], + source_path: str | None = None, + requested_line: int | None = None, + ) -> dict[str, Any]: + details = payload.get("location_details") or [] + first = details[0] if details else {} + path = first.get("file") or source_path + line = first.get("line") or requested_line or 0 + dap_bp = { + "id": payload.get("id"), + "verified": bool(payload.get("resolved")), + "line": line, + } + if path: + dap_bp["source"] = self._source_from_path(path) + if first.get("address"): + dap_bp["instructionReference"] = first["address"] + if not dap_bp["verified"]: + dap_bp["message"] = "Breakpoint is pending and has no resolved LLDB locations yet." + return dap_bp + + def _source_from_path(self, path: str | None) -> dict[str, Any] | None: + if not path: + return None + source_path = str(path) + return {"name": Path(source_path).name, "path": source_path} + + def _ensure_session(self) -> LLDBSession: + if self._session is None: + self._session = self._session_factory() + return self._session + + def _alloc_frame_ref(self, thread_id: int, frame_index: int) -> int: + ref = self._next_ref + self._next_ref += 1 + self._frame_refs[ref] = (thread_id, frame_index) + return ref + + def _alloc_variable_ref(self, entry: dict[str, Any]) -> int: + ref = self._next_ref + self._next_ref += 1 + self._variable_refs[ref] = entry + return ref + + def _dap_variables_from_values(self, values) -> list[dict[str, Any]]: + return [self._dap_variable_from_value(value) for value in values if value and value.IsValid()] + + def _dap_variable_from_value(self, value) -> dict[str, Any]: + variables_ref = 0 + if value.GetNumChildren() > 0: + variables_ref = 
self._alloc_variable_ref({"kind": "children", "value": value}) + payload = { + "name": value.GetName() or "", + "value": self._value_display(value), + "type": value.GetTypeName(), + "variablesReference": variables_ref, + } + evaluate_name = self._value_expression_path(value) + if evaluate_name: + payload["evaluateName"] = evaluate_name + return payload + + def _child_values(self, value) -> list[Any]: + return [value.GetChildAtIndex(index) for index in range(value.GetNumChildren())] + + def _value_display(self, value) -> str: + raw = value.GetValue() + summary = value.GetSummary() + if raw and summary: + return f"{raw} {summary}" + return raw or summary or "" + + def _value_expression_path(self, value) -> str | None: + try: + stream = self._ensure_session()._lldb.SBStream() + value.GetExpressionPath(stream) + text = stream.GetData() + return text or value.GetName() + except Exception: + return value.GetName() + + def _reset_refs_for_resume(self): + self._frame_refs.clear() + self._variable_refs.clear() + + def _coerce_args(self, raw: Any) -> list[str] | None: + if raw is None: + return None + if isinstance(raw, str): + return shlex.split(raw, posix=os.name != "nt") + return [str(item) for item in raw] + + def _coerce_env(self, raw: Any) -> list[str] | None: + if raw is None: + return None + if isinstance(raw, dict): + return [f"{key}={value}" for key, value in raw.items()] + return [str(item) for item in raw] + + def _parse_address(self, value: str) -> int: + return int(value, 0) + + def _send_response( + self, + request_seq: int, + command: str, + body: dict[str, Any] | None = None, + success: bool = True, + message: str | None = None, + ): + with self._protocol_lock: + payload: dict[str, Any] = { + "seq": self._next_seq(), + "type": "response", + "request_seq": request_seq, + "success": success, + "command": command, + } + if body is not None: + payload["body"] = body + if message: + payload["message"] = message + self._write(payload) + + def _send_event(self, 
event: str, body: dict[str, Any] | None = None): + with self._protocol_lock: + payload: dict[str, Any] = {"seq": self._next_seq(), "type": "event", "event": event} + if body is not None: + payload["body"] = body + self._write(payload) + + def _next_seq(self) -> int: + seq = self._seq + self._seq += 1 + return seq + + def _write(self, payload: dict[str, Any]): + if self._out is None: + raise RuntimeError("DAP output stream is not initialized") + self._out.write(encode_message(payload)) + self._out.flush() + + def _log(self, message: str): + if self._log_file: + with self._log_file.open("a", encoding="utf-8") as handle: + handle.write(message + "\n") + else: + print(message, file=sys.stderr) + + +def main(argv: list[str] | None = None) -> int: + parser = argparse.ArgumentParser(description="Run cli-anything-lldb Debug Adapter Protocol server") + parser.add_argument("--log-file", default=None, help="Optional file for adapter diagnostics") + parser.add_argument("--profile", default=None, help="Optional stop-rule profile JSON loaded at adapter startup") + args = parser.parse_args(argv) + adapter = LLDBDebugAdapter(log_file=args.log_file, profile_file=args.profile) + return adapter.run() + + +if __name__ == "__main__": + raise SystemExit(main(sys.argv[1:])) diff --git a/lldb/agent-harness/cli_anything/lldb/lldb_cli.py b/lldb/agent-harness/cli_anything/lldb/lldb_cli.py index ae5b97b2d..962f792dc 100644 --- a/lldb/agent-harness/cli_anything/lldb/lldb_cli.py +++ b/lldb/agent-harness/cli_anything/lldb/lldb_cli.py @@ -188,14 +188,16 @@ def process_group(): @click.option("--arg", "args", multiple=True, help="Launch argument. 
Repeat for multiple.") @click.option("--env", "envs", multiple=True, help="Environment entry KEY=VALUE.") @click.option("--cwd", "working_dir", type=click.Path(exists=True), default=None) +@click.option("--stop-at-entry", is_flag=True, help="Stop at the process entry point before user code.") @click.pass_context -def process_launch(ctx, args, envs, working_dir): +def process_launch(ctx, args, envs, working_dir, stop_at_entry): """Launch process for current target.""" try: data = _require_target().launch( args=list(args) or None, env=list(envs) or None, working_dir=working_dir, + stop_at_entry=stop_at_entry, ) _output(ctx, data) except Exception as exc: @@ -233,6 +235,17 @@ def process_continue(ctx): _handle_exc(ctx, exc) +@process_group.command("interrupt") +@click.pass_context +def process_interrupt(ctx): + """Interrupt a running process.""" + try: + data = _require_process().interrupt() + _output(ctx, data) + except Exception as exc: + _handle_exc(ctx, exc) + + @process_group.command("detach") @click.pass_context def process_detach(ctx): @@ -270,8 +283,9 @@ def breakpoint_group(): @click.option("--line", type=int, default=None) @click.option("--function", type=str, default=None) @click.option("--condition", type=str, default=None) +@click.option("--allow-pending", is_flag=True, help="Allow unresolved pending breakpoints.") @click.pass_context -def breakpoint_set(ctx, file_path, line, function, condition): +def breakpoint_set(ctx, file_path, line, function, condition, allow_pending): """Set a breakpoint by file/line or function.""" try: data = _require_target().breakpoint_set( @@ -279,6 +293,7 @@ def breakpoint_set(ctx, file_path, line, function, condition): line=line, function=function, condition=condition, + allow_pending=allow_pending, ) _output(ctx, data) except Exception as exc: @@ -562,6 +577,26 @@ def core_load(ctx, core_path: str): _handle_exc(ctx, exc) +# =========================================================================== +# dap +# 
=========================================================================== + + +@cli.command("dap") +@click.option("--log-file", default=None, type=click.Path(dir_okay=False), help="Optional file for adapter diagnostics.") +@click.option("--profile", default=None, type=click.Path(exists=True, dir_okay=False), help="Stop-rule profile JSON.") +def dap_server(log_file: str | None, profile: str | None): + """Run a stdio Debug Adapter Protocol server.""" + from cli_anything.lldb.dap import main as dap_main + + args = [] + if log_file: + args.extend(["--log-file", log_file]) + if profile: + args.extend(["--profile", profile]) + dap_main(args) + + # =========================================================================== # session # =========================================================================== @@ -606,13 +641,13 @@ def repl(ctx): """Start interactive REPL session.""" from cli_anything.lldb.utils.repl_skin import ReplSkin - skin = ReplSkin("lldb", version="0.1.0") + skin = ReplSkin("lldb", version="1.0.0") skin.print_banner() pt_session = skin.create_prompt_session() repl_commands = { "target": "create|info", - "process": "launch|attach|continue|detach|info", + "process": "launch|attach|continue|interrupt|detach|info", "breakpoint": "set|list|delete|enable|disable", "thread": "list|select|backtrace|info", "frame": "select|info|locals", @@ -620,6 +655,7 @@ def repl(ctx): "expr": "", "memory": "read|find", "core": "load", + "dap": "Run Debug Adapter Protocol server", "session": "info|close", "help": "Show this help", "quit": "Exit REPL", diff --git a/lldb/agent-harness/cli_anything/lldb/skills/SKILL.md b/lldb/agent-harness/cli_anything/lldb/skills/SKILL.md index 08afc4727..563c28e39 100644 --- a/lldb/agent-harness/cli_anything/lldb/skills/SKILL.md +++ b/lldb/agent-harness/cli_anything/lldb/skills/SKILL.md @@ -1,7 +1,7 @@ --- name: cli-anything-lldb description: Stateful LLDB debugging via LLDB Python API -version: 0.1.0 +version: 1.0.0 command: 
cli-anything-lldb install: pip install cli-anything-lldb requires: @@ -28,6 +28,7 @@ Use this CLI to run structured LLDB debugging workflows with JSON output. - Read/find process memory - Load core dumps - Interactive REPL with persistent session state +- Formal stdio Debug Adapter Protocol server for AI/editor clients ## Quick Commands @@ -35,13 +36,70 @@ Use this CLI to run structured LLDB debugging workflows with JSON output. cli-anything-lldb --json target create --exe /path/to/exe cli-anything-lldb --json process launch --arg foo --arg bar cli-anything-lldb --json breakpoint set --function main +cli-anything-lldb --json breakpoint set --function PluginEntry --allow-pending cli-anything-lldb --json process continue +cli-anything-lldb --json process interrupt cli-anything-lldb --json thread backtrace --limit 20 cli-anything-lldb --json frame locals cli-anything-lldb --json expr "myVar" cli-anything-lldb --json memory read --address 0x1000 --size 64 +cli-anything-lldb --json session close ``` +## Debug Adapter Protocol + +Use the DAP entry point when an AI client needs a real debug adapter lifecycle +instead of shelling out separate CLI commands: + +```bash +cli-anything-lldb-dap +cli-anything-lldb-dap --profile /path/to/stop-rules.json +``` + +or: + +```bash +cli-anything-lldb dap +cli-anything-lldb dap --profile /path/to/stop-rules.json +``` + +The DAP server speaks stdio `Content-Length` frames and must have exclusive +stdout. Do not print logs to stdout around it. Supported requests include +`initialize`, `launch`, `attach`, `configurationDone`, `setBreakpoints`, +`setFunctionBreakpoints`, `threads`, `stackTrace`, `scopes`, `variables`, +`setVariable`, `evaluate`, `continue`, `pause`, `next`, `stepIn`, `stepOut`, +`source`, `loadedSources`, `readMemory`, `modules`, `exceptionInfo`, +`disassemble`, and `disconnect`. + +DAP variables can expose child references for structs/classes/arrays. 
Use
+`setVariable` only while stopped; LLDB may reject writes to optimized-out or
+read-only values.
+
+For long-running GUI debuggees, DAP `continue` is non-blocking from the client's
+point of view: the adapter sends the response and `continued` event first, then
+waits for LLDB on a background thread. DAP `pause` uses LLDB async interrupt.
+If an agent needs to change breakpoints while the debuggee is running, the
+adapter interrupts first and waits for a stopped state before mutating LLDB
+breakpoints; if the target does not stop in time, retry after an explicit
+`pause`/`stopped` cycle.
+
+For GUI apps that stop on debugger-internal startup or shader-JIT breakpoints,
+`launch` and `attach` accept the non-standard boolean argument
+`autoContinueInternalBreakpoints`. Enable it only when those internal stops are
+noise for the task; the adapter emits an `output` event before auto-continuing.
+For target-specific noise, prefer structured stop rules through inline
+`stopRules` or an external `stopRuleProfile`/`--profile` JSON file. Rules can
+match by `reason`, `module`, `function`, and/or `regex`, then either `stop` with
+clear `cliAnythingStop.origin` metadata or `continue` automatically. Use
+profiles for apps such as C4D so their NVIDIA shader-JIT/startup traps live
+outside the generic adapter.
+
+DAP `stopped` events include `body.cliAnythingStop.origin`: `manualPause` for a
+client pause request, `internalTrap` for a matched internal rule, and `debuggee`
+for ordinary program stops. Existing `cli-anything-lldb-dap` processes do not
+hot-load new code or profile contents; restart the adapter and re-attach or
+re-launch before expecting new rules to apply.
+
 ## Command Groups
 
 ### target
@@ -52,10 +110,11 @@ cli-anything-lldb --json target info
 
 ### process
 ```bash
-cli-anything-lldb --json process launch [--arg ARG ...] [--env KEY=VALUE ...] [--cwd DIR]
+cli-anything-lldb --json process launch [--arg ARG ...] [--env KEY=VALUE ...]
[--cwd DIR] [--stop-at-entry]
 cli-anything-lldb --json process attach --pid 1234
 cli-anything-lldb --json process attach --name myapp --wait-for
 cli-anything-lldb --json process continue
+cli-anything-lldb --json process interrupt
 cli-anything-lldb --json process detach
 cli-anything-lldb --json process info
 ```
@@ -64,6 +123,7 @@ cli-anything-lldb --json process info
 ```bash
 cli-anything-lldb --json breakpoint set --function main
 cli-anything-lldb --json breakpoint set --file main.c --line 42 --condition "i > 10"
+cli-anything-lldb --json breakpoint set --function LateLoadedSymbol --allow-pending
 cli-anything-lldb --json breakpoint list
 cli-anything-lldb --json breakpoint delete --id 1
 cli-anything-lldb --json breakpoint enable --id 1
@@ -94,10 +154,15 @@ cli-anything-lldb --json core load --path /path/to/core
 ## Agent Usage Notes
 
 - Prefer `--json` for all automated flows.
-- Non-REPL commands share state across separate invocations through the persistent session daemon until you run `session close` or the idle timeout expires.
-- Use REPL when you want an interactive long-running debugger session:
-  - run `cli-anything-lldb`
-  - execute multi-step commands in one session
+- Separate non-REPL invocations share a persistent session daemon by default.
+- Use `--session-file PATH` or `CLI_ANYTHING_LLDB_SESSION_FILE` to pin an explicit session for a task.
+- Run `cli-anything-lldb --json session close` when finished so attached processes detach and launched debuggees are cleaned up.
+- Use REPL when a human-like interactive shell is more convenient, not because persistence requires it.
+- Unresolved CLI breakpoints fail by default; pass `--allow-pending` only when a future module/symbol load is expected.
+- DAP unresolved breakpoints use protocol semantics: `verified: false` until resolved.
+- DAP `continue` is non-blocking for long-running GUI processes, and DAP `pause` uses async interrupt.
+- DAP breakpoint changes during an active continue first interrupt and wait for a stopped state before mutating LLDB.
+- Use DAP stop-rule profiles for app-specific internal traps; restart and re-attach/re-launch after profile changes.
 - `memory find` uses a chunked scan capped at 1 MiB per call.
 - Call `target create` before process or core commands.
 - Expect structured errors: `{"error": "...", "type": "..."}`
diff --git a/lldb/agent-harness/cli_anything/lldb/tests/TEST.md b/lldb/agent-harness/cli_anything/lldb/tests/TEST.md
index 4a633a300..230dcf7c9 100644
--- a/lldb/agent-harness/cli_anything/lldb/tests/TEST.md
+++ b/lldb/agent-harness/cli_anything/lldb/tests/TEST.md
@@ -2,8 +2,8 @@
 
 ## Test Inventory Plan
 
-- `test_core.py`: persistent session + lifecycle unit tests
-- `test_full_e2e.py`: persistent workflow / attach cleanup / optional core-load E2E tests
+- `test_core.py`: DAP framing, daemon security, persistent session, breakpoint semantics, pause/interrupt, and lifecycle unit tests
+- `test_full_e2e.py`: persistent CLI workflow, DAP workflow, attach cleanup, and optional core-load E2E tests
 
 ## Unit Test Plan
 
@@ -26,9 +26,32 @@
 ### `core/session.py`
 - Validate target/process guards and high-level wrappers with mocked LLDB objects
 - Validate breakpoint set/list/delete/enable operations
+- Validate unresolved breakpoints fail by default and explicit pending breakpoints report `resolved=false`
 - Validate step/continue/backtrace/locals/evaluate return schemas
 - Validate thread/frame select logic
 - Validate cleanup semantics for attached vs launched inferiors
+- Validate interrupt maps to `SBProcess.Stop()`
+
+### `utils/session_server.py`
+- Validate session state files are written with restrictive permissions where the platform supports them
+- Validate the persistent daemon rejects methods outside the explicit RPC allowlist
+
+### `dap.py`
+- Validate DAP `Content-Length` framing and malformed-frame errors
+- Validate initialize capabilities and `initialized`
event emission
+- Validate frame/variable references are cleared on resume
+- Validate EOF cleanup destroys the LLDB session
+- Validate running-state execution events emit `continued`, not a false `stopped`
+- Validate DAP `pause` calls the async interrupt path and emits a pause stop event
+- Validate DAP breakpoint mutation during an active continue requests async interrupt and waits for stopped state
+- Validate DAP breakpoint mutation reports a clear timeout when the target does not stop
+- Validate DAP auto-continues known internal JIT/startup breakpoint stops when explicitly enabled
+- Validate structured DAP stop rules can match by module/function/reason/regex and classify internal traps
+- Validate external DAP stop-rule profile files inject target-specific auto-continue rules
+- Validate DAP stopped events distinguish manual pauses, internal traps, and ordinary debuggee stops
+- Validate DAP transcript response/event ordering for initialize, launch, breakpoint setup, and configuration completion
+- Validate DAP `modules` and `exceptionInfo` response shapes
+- Validate DAP `readMemory` base64 encoding and expandable variable references
 
 ### `lldb_cli.py`
 - Validate `--help` for root and command groups
@@ -46,11 +69,15 @@
 ### Workflows to validate
 - Create target in one command, read target info in a later command via the same persisted session
 - Set breakpoint -> launch -> inspect threads/backtrace/locals -> evaluate expression -> read/find memory -> step -> continue
+- Run DAP initialize -> launch -> setFunctionBreakpoints -> configurationDone -> stopped -> threads -> stackTrace -> scopes -> variables -> setVariable -> evaluate -> source -> loadedSources -> readMemory -> modules -> exceptionInfo -> disassemble -> step/continue
+- Run DAP `setBreakpoints` with a real source line and verify the breakpoint resolves and stops
+- Run DAP stop-on-entry and verify the stopped event reports `reason=entry`
 - Attach to a live process, then close the LLDB session
without killing the attached process
 - Load core dump negative path without a target selected, using either a provided `LLDB_TEST_CORE` path or an auto-generated placeholder file
 
 ### Output validation
 - All command responses parse as valid JSON in `--json` mode
+- DAP stdout contains only DAP frames, even when the debuggee writes stdout
 - Required keys exist (`pid`, `state`, `breakpoints`, `threads`, `frames`, etc.)
 - Commands fail with structured error payloads when prerequisites are missing
 
@@ -84,6 +111,46 @@
 - Verified:
   - attached process remains alive after the debugger session closes
 
+### Workflow name: `dap_probe_session`
+- Simulates: an AI debug client driving LLDB through DAP instead of shell commands
+- Operations chained:
+  1. `initialize`
+  2. `launch`
+  3. `setFunctionBreakpoints`
+  4. `configurationDone`
+  5. `threads`
+  6. `stackTrace`
+  7. `scopes`
+  8. `variables`
+  9. `setVariable`
+  10. `evaluate`
+  11. `source`
+  12. `loadedSources`
+  13. `readMemory`
+  14. `modules`
+  15. `exceptionInfo`
+  16. `disassemble`
+  17. `next`
+  18. `continue`
+- Verified:
+  - DAP lifecycle events and stopped reasons
+  - locals and expression evaluation through DAP frame ids
+  - struct child expansion and stopped-frame variable assignment
+  - source/disassembly inspection
+  - DAP memory reads, loaded source discovery, module listing, and exception info
+  - no debuggee stdout contamination of DAP stdout
+
+### Workflow name: `dap_source_line_breakpoint`
+- Simulates: an editor or AI debug client setting a source file/line breakpoint
+- Operations chained:
+  1. `initialize`
+  2. `launch`
+  3. `setBreakpoints`
+  4.
`configurationDone`
+- Verified:
+  - source line breakpoint resolves to `verified=true`
+  - process stops for the breakpoint through DAP
+
 ## Test Results
 
 ### Commands run
@@ -96,17 +163,27 @@ python -m pytest cli_anything/lldb/tests -q
 
 ### Result summary
 
-- `test_core.py`: 23 passed
-- `test_full_e2e.py`: 4 passed
-- combined: 27 passed
+- `test_core.py`: 46 passed
+- `test_full_e2e.py`: 7 passed, 2 warnings from LLDB SWIG bindings
+- combined default run: 53 passed, 2 warnings from LLDB SWIG bindings
+- skip situation: 0 skipped in the current local run; older runs could skip the optional core-load negative-path scenario when `LLDB_TEST_CORE` was unset, but the fixture now creates a local placeholder core path for that negative-path test
 
 ### Notes
 
 - Verified the installed `cli-anything-lldb` entrypoint on Windows after editable install
-- The core-load negative-path test now auto-generates a placeholder file, so no extra env var is required for the default E2E suite
+- The core-load negative-path test auto-generates a placeholder file, so no extra env var is required for the default E2E suite
 - Fixed REPL fallback behavior for non-interactive subprocess execution on Windows
 - Fixed Windows REPL command parsing so quoted paths and inherited `--json` mode work correctly
 - Added a persistent background LLDB session so non-REPL commands can share debugger state
 - Switched the session daemon to a localhost JSON socket protocol with owner-scoped state file permissions
 - `memory find` now uses a chunked scan capped at 1 MiB per call
 - Fixed cleanup to detach attached inferiors instead of killing them on session shutdown
+- Hardened the persistent daemon state file and RPC method surface
+- Added honest breakpoint resolution reporting and explicit pending breakpoint opt-in
+- Added a stdio DAP adapter with stop-at-entry, breakpoint, stack, locals, expression, source, disassembly, step, and continue coverage
+- Added DAP/CLI interrupt support and tightened DAP lifecycle
cleanup/running-state event behavior
+- Added a real DAP source-line breakpoint E2E scenario
+- Added DAP `loadedSources` and `readMemory` coverage while keeping the harness at version 1.0.0
+- Added DAP variable child expansion, `setVariable`, `modules`, `exceptionInfo`, and transcript ordering coverage while keeping the harness at version 1.0.0
+- Added non-blocking DAP continue, async pause, guarded running-state breakpoint mutation, and internal breakpoint auto-continue coverage
+- Added DAP structured stop-rule profile coverage and `cliAnythingStop` stopped-event metadata for manual pause vs internal trap classification
diff --git a/lldb/agent-harness/cli_anything/lldb/tests/test_core.py b/lldb/agent-harness/cli_anything/lldb/tests/test_core.py
index 61ca867f2..d4b3338de 100644
--- a/lldb/agent-harness/cli_anything/lldb/tests/test_core.py
+++ b/lldb/agent-harness/cli_anything/lldb/tests/test_core.py
@@ -8,8 +8,11 @@
 
 import json
 import os
+import io
 import subprocess
 import sys
+import stat
+import threading
 from pathlib import Path
 from unittest.mock import MagicMock, patch
 
@@ -80,15 +83,502 @@ def test_handle_error_debug(self):
         assert "boom" in result["traceback"]
 
 
+class TestDAPProtocol:
+    def test_encode_and_read_message(self):
+        from cli_anything.lldb.dap import encode_message, read_message
+
+        payload = {"seq": 1, "type": "request", "command": "initialize"}
+        stream = io.BytesIO(encode_message(payload))
+
+        assert read_message(stream) == payload
+        assert read_message(stream) is None
+
+    def test_read_message_rejects_missing_content_length(self):
+        from cli_anything.lldb.dap import DAPProtocolError, read_message
+
+        with pytest.raises(DAPProtocolError, match="Missing Content-Length"):
+            read_message(io.BytesIO(b"Header: value\r\n\r\n{}"))
+
+    def test_initialize_capabilities_and_event(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter, encode_message, read_message
+
+        out = io.BytesIO()
+        adapter = LLDBDebugAdapter(session_factory=MagicMock())
+
adapter.run(
+            io.BytesIO(encode_message({"seq": 1, "type": "request", "command": "initialize", "arguments": {}})),
+            out,
+        )
+        out.seek(0)
+        response = read_message(out)
+        event = read_message(out)
+
+        assert response["success"] is True
+        assert response["body"]["supportsConfigurationDoneRequest"] is True
+        assert response["body"]["supportsFunctionBreakpoints"] is True
+        assert response["body"]["supportsLoadedSourcesRequest"] is True
+        assert response["body"]["supportsReadMemoryRequest"] is True
+        assert response["body"]["supportsSetVariable"] is True
+        assert response["body"]["supportsModulesRequest"] is True
+        assert response["body"]["supportsExceptionInfoRequest"] is True
+        assert event["event"] == "initialized"
+
+    def test_variable_references_reset_on_resume(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter
+
+        adapter = LLDBDebugAdapter(session_factory=MagicMock())
+        frame_ref = adapter._alloc_frame_ref(1, 0)
+        variable_ref = adapter._alloc_variable_ref({"kind": "locals", "frame_ref": frame_ref})
+
+        adapter._reset_refs_for_resume()
+
+        assert frame_ref not in adapter._frame_refs
+        assert variable_ref not in adapter._variable_refs
+
+    def test_run_cleans_up_session_on_eof(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter
+
+        fake_session = MagicMock()
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter._ensure_session()
+
+        result = adapter.run(io.BytesIO(), io.BytesIO())
+
+        assert result == 0
+        fake_session.destroy.assert_called_once()
+        assert adapter._session is None
+
+    def test_running_state_emits_continued_not_stopped(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter, read_message
+
+        fake_session = MagicMock()
+        fake_session.process_info.return_value = {
+            "state": "running",
+            "selected_thread_id": 99,
+            "stop": None,
+            "exit_status": 0,
+        }
+        out = io.BytesIO()
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter._out = out
+        adapter._ensure_session()
+
+
adapter._emit_execution_event(default_reason="breakpoint")
+        out.seek(0)
+        event = read_message(out)
+
+        assert event["event"] == "continued"
+        assert event["body"]["threadId"] == 99
+
+    def test_pause_request_interrupts_process_and_reports_stop(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter, read_message
+
+        fake_session = MagicMock()
+        fake_session.process_info.return_value = {
+            "state": "stopped",
+            "selected_thread_id": 99,
+            "stop": {"reason": None, "description": None, "hit_breakpoint_ids": []},
+            "exit_status": 0,
+        }
+        out = io.BytesIO()
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter.run(
+            io.BytesIO(
+                b"".join(
+                    [
+                        __import__("cli_anything.lldb.dap", fromlist=["encode_message"]).encode_message(
+                            {"seq": 1, "type": "request", "command": "pause", "arguments": {"threadId": 99}}
+                        )
+                    ]
+                )
+            ),
+            out,
+        )
+        out.seek(0)
+        response = read_message(out)
+        event = read_message(out)
+
+        assert response["success"] is True
+        assert event["event"] == "stopped"
+        assert event["body"]["reason"] == "pause"
+        assert event["body"]["cliAnythingStop"]["origin"] == "manualPause"
+        fake_session.interrupt_async.assert_called_once()
+
+    def test_set_breakpoints_interrupts_active_continue_before_mutation(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter
+
+        fake_session = MagicMock()
+        fake_session.breakpoint_set.return_value = {
+            "id": 7,
+            "resolved": True,
+            "locations": 1,
+            "location_details": [{"file": "C:/tmp/main.c", "line": 12}],
+        }
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter._ensure_session()
+        adapter._mutation_stop_timeout = 1.0
+        with adapter._continue_state:
+            adapter._continue_active = True
+
+        release = threading.Timer(0.01, adapter._mark_continue_inactive)
+        release.start()
+        try:
+            body, post_send = adapter._handle_setBreakpoints(
+                {
+                    "source": {"path": "C:/tmp/main.c"},
+                    "breakpoints": [{"line": 12}],
+                }
+            )
+        finally:
+            release.join()
+
+        assert post_send is None
+        assert body["breakpoints"][0]["verified"] is True
+        fake_session.interrupt_async.assert_called_once()
+        fake_session.breakpoint_set.assert_called_once()
+
+    def test_set_breakpoints_reports_timeout_if_running_target_will_not_stop(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter
+
+        fake_session = MagicMock()
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter._ensure_session()
+        adapter._mutation_stop_timeout = 0.01
+        with adapter._continue_state:
+            adapter._continue_active = True
+
+        try:
+            with pytest.raises(RuntimeError, match="Timed out waiting for debuggee to stop"):
+                adapter._handle_setBreakpoints(
+                    {
+                        "source": {"path": "C:/tmp/main.c"},
+                        "breakpoints": [{"line": 12}],
+                    }
+                )
+        finally:
+            adapter._mark_continue_inactive()
+
+        fake_session.interrupt_async.assert_called_once()
+        fake_session.breakpoint_set.assert_not_called()
+
+    def test_auto_continue_internal_breakpoint_emits_output_and_resumes(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter, read_message
+
+        fake_session = MagicMock()
+        fake_session.process_info.return_value = {
+            "state": "stopped",
+            "selected_thread_id": 99,
+            "stop": {
+                "reason": "breakpoint",
+                "description": "frame #0: nvgpucomp64.dll`__jit_debug_register_code",
+                "hit_breakpoint_ids": [],
+            },
+            "exit_status": 0,
+        }
+        out = io.BytesIO()
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter._out = out
+        adapter._configure_stop_rules({"autoContinueInternalBreakpoints": True})
+        adapter._start_continue_thread = MagicMock()
+
+        adapter._emit_execution_event(default_reason="breakpoint")
+        out.seek(0)
+        output_event = read_message(out)
+        continued_event = read_message(out)
+        stopped_event = read_message(out)
+
+        assert output_event["event"] == "output"
+        assert "auto-continued stop rule nvidia-shader-jit-debug-register" in output_event["body"]["output"]
+        assert continued_event["event"] == "continued"
+        assert stopped_event is None
+
adapter._start_continue_thread.assert_called_once()
+
+    def test_stop_rule_profile_can_auto_continue_structured_internal_stop(self, tmp_path: Path):
+        from cli_anything.lldb.dap import LLDBDebugAdapter, read_message
+
+        profile = tmp_path / "c4d-stop-rules.json"
+        profile.write_text(
+            json.dumps(
+                {
+                    "stopRules": [
+                        {
+                            "name": "c4d-nvidia-jit",
+                            "action": "continue",
+                            "origin": "internalTrap",
+                            "module": "nvgpucomp64.dll",
+                            "function": "__jit_debug_register_code",
+                        }
+                    ]
+                }
+            ),
+            encoding="utf-8",
+        )
+        fake_session = MagicMock()
+        fake_session.process_info.return_value = {
+            "state": "stopped",
+            "selected_thread_id": 99,
+            "stop": {
+                "reason": "breakpoint",
+                "description": "driver JIT registration",
+                "hit_breakpoint_ids": [],
+                "frame": {
+                    "module": "nvgpucomp64.dll",
+                    "module_path": "C:/Windows/System32/DriverStore/nvgpucomp64.dll",
+                    "function": "__jit_debug_register_code",
+                },
+            },
+            "exit_status": 0,
+        }
+        out = io.BytesIO()
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session, profile_file=str(profile))
+        adapter._out = out
+        adapter._configure_stop_rules({})
+        adapter._start_continue_thread = MagicMock()
+
+        adapter._emit_execution_event(default_reason="breakpoint")
+        out.seek(0)
+        output_event = read_message(out)
+        continued_event = read_message(out)
+        stopped_event = read_message(out)
+
+        assert "auto-continued stop rule c4d-nvidia-jit" in output_event["body"]["output"]
+        assert continued_event["event"] == "continued"
+        assert stopped_event is None
+        adapter._start_continue_thread.assert_called_once()
+
+    def test_structured_stop_rule_marks_internal_trap_without_continuing(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter, read_message
+
+        fake_session = MagicMock()
+        fake_session.process_info.return_value = {
+            "state": "stopped",
+            "selected_thread_id": 99,
+            "stop": {
+                "reason": "exception",
+                "description": "Exception 0x80000003 at ntdll.dll`DbgBreakPoint",
+                "hit_breakpoint_ids": [],
+                "module": "ntdll.dll",
+                "function": "DbgBreakPoint",
+            },
+            "exit_status": 0,
+        }
+        out = io.BytesIO()
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter._out = out
+        adapter._configure_stop_rules(
+            {
+                "stopRules": [
+                    {
+                        "name": "windows-startup-trap",
+                        "action": "stop",
+                        "origin": "internalTrap",
+                        "reason": "exception",
+                        "module": "ntdll.dll",
+                        "regex": "DbgBreakPoint",
+                    }
+                ]
+            }
+        )
+        adapter._start_continue_thread = MagicMock()
+
+        adapter._emit_execution_event(default_reason="breakpoint")
+        out.seek(0)
+        stopped_event = read_message(out)
+
+        assert stopped_event["event"] == "stopped"
+        stop = stopped_event["body"]["cliAnythingStop"]
+        assert stop["origin"] == "internalTrap"
+        assert stop["module"] == "ntdll.dll"
+        assert stop["function"] == "DbgBreakPoint"
+        assert stop["matchedRule"]["name"] == "windows-startup-trap"
+        adapter._start_continue_thread.assert_not_called()
+
+    def test_stack_trace_reports_total_frames(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter, read_message
+
+        fake_session = MagicMock()
+        fake_session.backtrace.return_value = {
+            "frames": [
+                {"index": 0, "function": "main", "file": None, "line": None, "address": "0x1000"},
+            ],
+            "total_frames": 7,
+        }
+        out = io.BytesIO()
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter.run(
+            io.BytesIO(
+                __import__("cli_anything.lldb.dap", fromlist=["encode_message"]).encode_message(
+                    {"seq": 1, "type": "request", "command": "stackTrace", "arguments": {"threadId": 123}}
+                )
+            ),
+            out,
+        )
+        out.seek(0)
+        response = read_message(out)
+
+        assert response["success"] is True
+        assert response["body"]["totalFrames"] == 7
+        fake_session.thread_select.assert_called_once_with(123)
+
+    def test_read_memory_response_is_base64(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter, read_message
+
+        fake_session = MagicMock()
+        fake_session.read_memory.return_value = {"address": "0x1000", "size": 3, "hex": "616263"}
+        out = io.BytesIO()
+        adapter =
LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter.run(
+            io.BytesIO(
+                __import__("cli_anything.lldb.dap", fromlist=["encode_message"]).encode_message(
+                    {
+                        "seq": 1,
+                        "type": "request",
+                        "command": "readMemory",
+                        "arguments": {"memoryReference": "0x1000", "count": 3},
+                    }
+                )
+            ),
+            out,
+        )
+        out.seek(0)
+        response = read_message(out)
+
+        assert response["success"] is True
+        assert response["body"]["address"] == "0x1000"
+        assert response["body"]["data"] == "YWJj"
+        fake_session.read_memory.assert_called_once_with(0x1000, 3)
+
+    def test_launch_transcript_keeps_dap_response_event_order(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter, encode_message, read_message
+
+        breakpoint_payload = {
+            "id": 1,
+            "resolved": True,
+            "location_details": [],
+            "locations": 1,
+        }
+        fake_session = MagicMock()
+        fake_session.target_create.return_value = {}
+        fake_session.breakpoint_set.return_value = breakpoint_payload
+        fake_session.breakpoint_list.return_value = {"breakpoints": [breakpoint_payload]}
+        fake_session.launch.return_value = {}
+        fake_session.process_info.return_value = {
+            "state": "stopped",
+            "selected_thread_id": 99,
+            "stop": {"reason": "breakpoint", "description": "hit breakpoint", "hit_breakpoint_ids": [1]},
+            "exit_status": 0,
+        }
+        messages = [
+            {"seq": 1, "type": "request", "command": "initialize", "arguments": {}},
+            {"seq": 2, "type": "request", "command": "launch", "arguments": {"program": "app.exe"}},
+            {
+                "seq": 3,
+                "type": "request",
+                "command": "setFunctionBreakpoints",
+                "arguments": {"breakpoints": [{"name": "main"}]},
+            },
+            {"seq": 4, "type": "request", "command": "configurationDone", "arguments": {}},
+        ]
+        out = io.BytesIO()
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter.run(io.BytesIO(b"".join(encode_message(message) for message in messages)), out)
+        out.seek(0)
+        transcript = []
+        while True:
+            message = read_message(out)
+            if message is None:
+                break
+
transcript.append(message)
+
+        labels = [
+            item.get("command") if item.get("type") == "response" else item.get("event")
+            for item in transcript
+        ]
+        assert labels == [
+            "initialize",
+            "initialized",
+            "launch",
+            "setFunctionBreakpoints",
+            "configurationDone",
+            "breakpoint",
+            "stopped",
+        ]
+        assert all(item["type"] in {"response", "event"} for item in transcript)
+
+    def test_modules_response_shape(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter, read_message
+
+        fake_session = MagicMock()
+        fake_session.modules.return_value = {
+            "modules": [
+                {
+                    "id": 1,
+                    "name": "app.exe",
+                    "path": "C:/tmp/app.exe",
+                    "symbol_status": "loaded",
+                    "address": "0x1000",
+                    "version": [1, 2, 3],
+                }
+            ]
+        }
+        out = io.BytesIO()
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter.run(
+            io.BytesIO(
+                __import__("cli_anything.lldb.dap", fromlist=["encode_message"]).encode_message(
+                    {"seq": 1, "type": "request", "command": "modules", "arguments": {}}
+                )
+            ),
+            out,
+        )
+        out.seek(0)
+        response = read_message(out)
+
+        assert response["success"] is True
+        module = response["body"]["modules"][0]
+        assert module["name"] == "app.exe"
+        assert module["symbolStatus"] == "loaded"
+
+    def test_exception_info_uses_current_stop_reason(self):
+        from cli_anything.lldb.dap import LLDBDebugAdapter, read_message
+
+        fake_session = MagicMock()
+        fake_session.process_info.return_value = {
+            "state": "stopped",
+            "stop": {"reason": "breakpoint", "description": "breakpoint 1.1"},
+        }
+        out = io.BytesIO()
+        adapter = LLDBDebugAdapter(session_factory=lambda: fake_session)
+        adapter.run(
+            io.BytesIO(
+                __import__("cli_anything.lldb.dap", fromlist=["encode_message"]).encode_message(
+                    {"seq": 1, "type": "request", "command": "exceptionInfo", "arguments": {"threadId": 1}}
+                )
+            ),
+            out,
+        )
+        out.seek(0)
+        response = read_message(out)
+
+        assert response["success"] is True
+        assert response["body"]["exceptionId"] == "breakpoint"
+        assert
response["body"]["description"] == "breakpoint 1.1"
+
+
 class TestCoreHelpers:
     def test_breakpoints_wrapper(self):
         from cli_anything.lldb.core.breakpoints import set_breakpoint
 
         session = MagicMock()
         session.breakpoint_set.return_value = {"id": 1}
-        data = set_breakpoint(session, function="main")
+        data = set_breakpoint(session, function="main", allow_pending=True)
         assert data["id"] == 1
-        session.breakpoint_set.assert_called_once()
+        session.breakpoint_set.assert_called_once_with(
+            file=None,
+            line=None,
+            function="main",
+            condition=None,
+            allow_pending=True,
+        )
 
     def test_inspect_wrapper(self):
         from cli_anything.lldb.core.inspect import evaluate_expression
@@ -254,6 +744,40 @@ def test_destroy_kills_launched_process(self):
         process.Kill.assert_called_once()
         process.Detach.assert_not_called()
 
+    def test_interrupt_stops_process(self):
+        from cli_anything.lldb.core.session import LLDBSession
+
+        session = self._make_session()
+        process = MagicMock()
+        process.IsValid.return_value = True
+        process.Stop.return_value = MagicMock()
+        process.Stop.return_value.Success.return_value = True
+        process.GetState.return_value = 5
+        process.GetSelectedThread.return_value = None
+        process.GetProcessID.return_value = 123
+        process.GetNumThreads.return_value = 0
+        process.GetExitStatus.return_value = 0
+        session.process = process
+
+        payload = LLDBSession.interrupt(session)
+
+        process.Stop.assert_called_once()
+        assert payload["pid"] == 123
+
+    def test_interrupt_async_requests_async_interrupt(self):
+        from cli_anything.lldb.core.session import LLDBSession
+
+        session = self._make_session()
+        process = MagicMock()
+        process.IsValid.return_value = True
+        process.SendAsyncInterrupt.return_value = MagicMock(Success=lambda: True)
+        session.process = process
+
+        payload = LLDBSession.interrupt_async(session)
+
+        process.SendAsyncInterrupt.assert_called_once()
+        assert payload == {"status": "interrupt_requested"}
+
     def test_session_status_reports_target_and_process(self):
         from
cli_anything.lldb.core.session import LLDBSession
 
@@ -279,11 +803,20 @@ def test_process_info_public_wrapper(self):
         process.GetProcessID.return_value = 77
         process.GetState.return_value = 5
         process.GetNumThreads.return_value = 2
+        process.GetSelectedThread.return_value = None
+        process.GetExitStatus.return_value = 0
         session.process = process
 
         data = LLDBSession.process_info(session)
 
-        assert data == {"pid": 77, "state": "stopped", "num_threads": 2}
+        assert data == {
+            "pid": 77,
+            "state": "stopped",
+            "num_threads": 2,
+            "selected_thread_id": None,
+            "stop": None,
+            "exit_status": 0,
+        }
 
     def test_find_memory_scans_in_chunks(self):
         from cli_anything.lldb.core.session import LLDBSession
@@ -321,6 +854,74 @@ def test_find_memory_rejects_oversized_scan(self):
 
         assert "max supported scan size" in str(exc.value)
 
+    def test_unresolved_breakpoint_fails_by_default(self):
+        from cli_anything.lldb.core.session import LLDBSession
+
+        session = self._make_session()
+        session.target = MagicMock()
+        session.target.IsValid.return_value = True
+        bp = MagicMock()
+        bp.IsValid.return_value = True
+        bp.GetID.return_value = 7
+        bp.GetNumLocations.return_value = 0
+        bp.GetHitCount.return_value = 0
+        bp.IsEnabled.return_value = True
+        bp.GetCondition.return_value = None
+        session.target.BreakpointCreateByName.return_value = bp
+
+        with pytest.raises(RuntimeError, match="unresolved"):
+            LLDBSession.breakpoint_set(session, function="missing")
+
+        session.target.BreakpointDelete.assert_called_once_with(7)
+
+    def test_pending_breakpoint_returns_resolution_state(self):
+        from cli_anything.lldb.core.session import LLDBSession
+
+        session = self._make_session()
+        session.target = MagicMock()
+        session.target.IsValid.return_value = True
+        bp = MagicMock()
+        bp.IsValid.return_value = True
+        bp.GetID.return_value = 7
+        bp.GetNumLocations.return_value = 0
+        bp.GetHitCount.return_value = 0
+        bp.IsEnabled.return_value = True
+        bp.GetCondition.return_value = None
+
session.target.BreakpointCreateByName.return_value = bp
+
+        payload = LLDBSession.breakpoint_set(session, function="missing", allow_pending=True)
+
+        assert payload["id"] == 7
+        assert payload["resolved"] is False
+        assert payload["locations"] == 0
+        session.target.BreakpointDelete.assert_not_called()
+
+
+class TestSessionDaemonSecurity:
+    def test_state_file_is_written_with_restrictive_mode(self, tmp_path):
+        from cli_anything.lldb.utils.session_server import _write_state_file
+
+        state_file = tmp_path / "secure" / "session.json"
+        _write_state_file(state_file, ("127.0.0.1", 1234), b"secret")
+
+        data = json.loads(state_file.read_text(encoding="utf-8"))
+        assert data["host"] == "127.0.0.1"
+        assert data["port"] == 1234
+        assert data["token"]
+        if os.name != "nt":
+            assert stat.S_IMODE(state_file.parent.stat().st_mode) == 0o700
+            assert stat.S_IMODE(state_file.stat().st_mode) == 0o600
+
+    def test_session_server_rejects_unknown_methods(self):
+        from cli_anything.lldb.utils.session_server import SessionServer
+
+        server = SessionServer()
+        response, should_stop = server.handle({"method": "__getattribute__", "args": ["debugger"], "kwargs": {}})
+
+        assert should_stop is False
+        assert response["ok"] is False
+        assert "Unsupported session method" in response["error"]
+
 
 class TestCLISubprocess:
     CLI_BASE = _resolve_cli("cli-anything-lldb")
diff --git a/lldb/agent-harness/cli_anything/lldb/tests/test_full_e2e.py b/lldb/agent-harness/cli_anything/lldb/tests/test_full_e2e.py
index 5f544d157..787c63e40 100644
--- a/lldb/agent-harness/cli_anything/lldb/tests/test_full_e2e.py
+++ b/lldb/agent-harness/cli_anything/lldb/tests/test_full_e2e.py
@@ -8,11 +8,14 @@
 from __future__ import annotations
 
 import json
+import base64
 import os
+import queue
 import re
 import shutil
 import subprocess
 import sys
+import threading
 from pathlib import Path
 
 import pytest
@@ -34,8 +37,14 @@
 
 char GLOBAL_BUFFER[] = "agent-native-lldb";
 
+struct Pair {
+    int left;
+    int right;
+};
+
 int probe(int a,
int b) { - int total = a + b; + struct Pair pair = {a, b}; + int total = pair.left + pair.right; pause_ms(50); return GLOBAL_BUFFER[0] + total; } @@ -159,6 +168,92 @@ def _extract_address(payload: dict) -> str: raise AssertionError(f"Could not extract address from payload: {payload}") +class DAPClient: + def __init__(self): + from cli_anything.lldb.dap import read_message + + self._read_message = read_message + self.proc = subprocess.Popen( + [sys.executable, "-m", "cli_anything.lldb.dap"], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + cwd=HARNESS_ROOT, + ) + self.seq = 1 + self.messages: queue.Queue = queue.Queue() + self.reader = threading.Thread(target=self._reader_loop, daemon=True) + self.reader.start() + + def _reader_loop(self): + assert self.proc.stdout is not None + try: + while True: + msg = self._read_message(self.proc.stdout) + if msg is None: + return + self.messages.put(msg) + except Exception as exc: + self.messages.put(exc) + + def request(self, command: str, arguments: dict | None = None, timeout: int = 30): + from cli_anything.lldb.dap import encode_message + + seq = self.seq + self.seq += 1 + payload = {"seq": seq, "type": "request", "command": command} + if arguments is not None: + payload["arguments"] = arguments + assert self.proc.stdin is not None + self.proc.stdin.write(encode_message(payload)) + self.proc.stdin.flush() + + events = [] + while True: + msg = self._next_message(timeout) + if msg.get("type") == "response" and msg.get("request_seq") == seq: + assert msg.get("success"), msg.get("message") + return msg, events + events.append(msg) + + def read_event(self, name: str, timeout: int = 30): + while True: + msg = self._next_message(timeout) + if msg.get("type") == "event" and msg.get("event") == name: + return msg + + def read_until_event(self, names: set[str], timeout: int = 30): + while True: + msg = self._next_message(timeout) + if msg.get("type") == "event" and msg.get("event") in names: + return msg 
+ + def _next_message(self, timeout: int): + item = self.messages.get(timeout=timeout) + if isinstance(item, Exception): + raise item + return item + + def close(self): + if self.proc.poll() is None: + try: + self.request("disconnect", {"terminateDebuggee": True}, timeout=10) + self.read_until_event({"terminated"}, timeout=10) + except Exception: + self.proc.terminate() + try: + self.proc.wait(timeout=10) + except subprocess.TimeoutExpired: + self.proc.kill() + self.proc.wait(timeout=10) + + def __enter__(self): + return self + + def __exit__(self, _exc_type, _exc, _tb): + self.close() + + @skip_no_lldb class TestLLDBE2E: def test_persistent_target_info(self, lldb_test_exe: str, session_file: Path): @@ -233,6 +328,143 @@ def test_attach_cleanup_does_not_kill_process(self, lldb_test_exe: str, session_ proc.wait(timeout=5) +@skip_no_lldb +class TestLLDBDAPE2E: + def test_dap_source_line_breakpoint(self, lldb_test_exe: str): + source_path = Path(lldb_test_exe).parent / "lldb_helper.c" + lines = source_path.read_text(encoding="utf-8").splitlines() + target_line = next(i for i, line in enumerate(lines, start=1) if "pause_ms(50);" in line) + + with DAPClient() as client: + client.request("initialize", {"adapterID": "cli-anything-lldb"}) + client.read_event("initialized") + client.request("launch", {"program": lldb_test_exe, "stopOnEntry": False}) + bps, _ = client.request( + "setBreakpoints", + { + "source": {"path": str(source_path)}, + "breakpoints": [{"line": target_line}], + }, + ) + breakpoint_payload = bps["body"]["breakpoints"][0] + assert breakpoint_payload["verified"] is True + assert breakpoint_payload["line"] == target_line + + client.request("configurationDone") + stopped = client.read_until_event({"stopped"}) + + assert stopped["body"]["reason"] == "breakpoint" + threads, _ = client.request("threads") + thread_id = threads["body"]["threads"][0]["id"] + stack, _ = client.request("stackTrace", {"threadId": thread_id, "levels": 10}) + frame = 
stack["body"]["stackFrames"][0] + scopes, _ = client.request("scopes", {"frameId": frame["id"]}) + variables_ref = scopes["body"]["scopes"][0]["variablesReference"] + variables, _ = client.request("variables", {"variablesReference": variables_ref}) + variables_by_name = {item["name"]: item for item in variables["body"]["variables"]} + assert variables_by_name["pair"]["variablesReference"] > 0 + + pair_children, _ = client.request( + "variables", + {"variablesReference": variables_by_name["pair"]["variablesReference"]}, + ) + pair_values = {item["name"]: item["value"] for item in pair_children["body"]["variables"]} + assert pair_values["left"] in {"2", "0x2"} + assert pair_values["right"] in {"40", "0x28"} + + set_total, _ = client.request( + "setVariable", + { + "variablesReference": variables_ref, + "name": "total", + "value": "77", + }, + ) + assert set_total["body"]["value"] in {"77", "0x4d"} + total_eval, _ = client.request("evaluate", {"expression": "total", "frameId": frame["id"]}) + assert total_eval["body"]["result"] in {"77", "0x4d"} + + def test_dap_breakpoint_variables_source_disassemble_and_continue(self, lldb_test_exe: str): + with DAPClient() as client: + initialize, _ = client.request("initialize", {"adapterID": "cli-anything-lldb"}) + assert initialize["body"]["supportsConfigurationDoneRequest"] is True + client.read_event("initialized") + + client.request("launch", {"program": lldb_test_exe, "stopOnEntry": False}) + bps, _ = client.request("setFunctionBreakpoints", {"breakpoints": [{"name": "probe"}]}) + assert bps["body"]["breakpoints"] + client.request("configurationDone") + stopped = client.read_until_event({"stopped"}) + assert stopped["body"]["reason"] == "breakpoint" + + threads, _ = client.request("threads") + thread_id = threads["body"]["threads"][0]["id"] + stack, _ = client.request("stackTrace", {"threadId": thread_id, "levels": 10}) + frame = stack["body"]["stackFrames"][0] + assert frame["instructionPointerReference"].startswith("0x") + 
assert stack["body"]["totalFrames"] >= len(stack["body"]["stackFrames"]) + + scopes, _ = client.request("scopes", {"frameId": frame["id"]}) + variables_ref = scopes["body"]["scopes"][0]["variablesReference"] + variables, _ = client.request("variables", {"variablesReference": variables_ref}) + variables_by_name = {item["name"]: item for item in variables["body"]["variables"]} + names = set(variables_by_name) + assert {"a", "b"} <= names + + evaluated, _ = client.request("evaluate", {"expression": "a + b", "frameId": frame["id"]}) + assert evaluated["body"]["result"] in {"42", "0x2a"} + + source_path = frame.get("source", {}).get("path") + assert source_path + source, _ = client.request("source", {"source": {"path": source_path}}) + assert "GLOBAL_BUFFER" in source["body"]["content"] + + loaded_sources, _ = client.request("loadedSources") + loaded_paths = {Path(item["path"]).name for item in loaded_sources["body"]["sources"]} + assert "lldb_helper.c" in loaded_paths + + modules, _ = client.request("modules") + module_names = {item["name"] for item in modules["body"]["modules"]} + assert Path(lldb_test_exe).name in module_names + + exception_info, _ = client.request("exceptionInfo", {"threadId": thread_id}) + assert exception_info["body"]["exceptionId"] + + address_eval, _ = client.request( + "evaluate", + {"expression": "(char*)&GLOBAL_BUFFER[0]", "frameId": frame["id"]}, + ) + addr = _extract_address({"value": address_eval["body"]["result"]}) + memory, _ = client.request("readMemory", {"memoryReference": addr, "count": 32}) + raw = base64.b64decode(memory["body"]["data"]) + assert b"agent-native-lldb" in raw + + disassembly, _ = client.request( + "disassemble", + {"memoryReference": frame["instructionPointerReference"], "instructionCount": 4}, + ) + assert disassembly["body"]["instructions"] + + client.request("next", {"threadId": thread_id}) + step_stop = client.read_until_event({"stopped", "terminated"}) + assert step_stop["event"] in {"stopped", "terminated"} + + 
if step_stop["event"] == "stopped": + client.request("continue", {"threadId": thread_id}) + final_event = client.read_until_event({"exited", "terminated", "stopped"}) + assert final_event["event"] in {"exited", "terminated", "stopped"} + + def test_dap_stop_on_entry(self, lldb_test_exe: str): + with DAPClient() as client: + client.request("initialize", {"adapterID": "cli-anything-lldb"}) + client.read_event("initialized") + client.request("launch", {"program": lldb_test_exe, "stopOnEntry": True}) + client.request("configurationDone") + stopped = client.read_until_event({"stopped"}) + + assert stopped["body"]["reason"] == "entry" + + @skip_no_lldb class TestCoreE2E: def test_core_load_requires_target(self, session_file: Path, core_file: str): diff --git a/lldb/agent-harness/cli_anything/lldb/utils/session_client.py b/lldb/agent-harness/cli_anything/lldb/utils/session_client.py index 2b66c3c8f..2601ab05a 100644 --- a/lldb/agent-harness/cli_anything/lldb/utils/session_client.py +++ b/lldb/agent-harness/cli_anything/lldb/utils/session_client.py @@ -11,7 +11,6 @@ import struct import subprocess import sys -import tempfile import time from pathlib import Path from typing import Any @@ -19,6 +18,22 @@ MAX_MESSAGE_BYTES = 1024 * 1024 +def default_session_root() -> Path: + env_override = os.environ.get("CLI_ANYTHING_LLDB_SESSION_DIR") + if env_override: + return Path(env_override).expanduser().resolve() + + if os.name == "nt": + base = os.environ.get("LOCALAPPDATA") or os.environ.get("APPDATA") + root = Path(base).expanduser() if base else Path.home() / "AppData" / "Local" + return (root / "cli-anything-lldb" / "sessions").resolve() + + runtime_dir = os.environ.get("XDG_RUNTIME_DIR") + if runtime_dir: + return (Path(runtime_dir).expanduser() / "cli-anything-lldb").resolve() + return (Path.home() / ".cache" / "cli-anything-lldb" / "sessions").resolve() + + def resolve_session_file(explicit: str | None = None) -> Path: if explicit: return Path(explicit).expanduser().resolve() 
@@ -29,7 +44,7 @@ def resolve_session_file(explicit: str | None = None) -> Path: scope = os.environ.get("CLI_ANYTHING_LLDB_SESSION_SCOPE") or os.getcwd() digest = hashlib.sha256(os.path.abspath(scope).encode("utf-8")).hexdigest()[:12] - root = Path(tempfile.gettempdir()) / "cli-anything-lldb" + root = default_session_root() return (root / f"session-{digest}.json").resolve() diff --git a/lldb/agent-harness/cli_anything/lldb/utils/session_server.py b/lldb/agent-harness/cli_anything/lldb/utils/session_server.py index d46b984ca..6233e6f85 100644 --- a/lldb/agent-harness/cli_anything/lldb/utils/session_server.py +++ b/lldb/agent-harness/cli_anything/lldb/utils/session_server.py @@ -6,11 +6,13 @@ import argparse import base64 +import getpass import hmac import json import os import socket import struct +import subprocess import sys import time from pathlib import Path @@ -20,17 +22,95 @@ MAX_MESSAGE_BYTES = 1024 * 1024 +_ALLOWED_SESSION_METHODS = { + "target_create", + "target_info", + "attach_pid", + "attach_name", + "launch", + "detach", + "breakpoint_set", + "breakpoint_list", + "breakpoint_delete", + "breakpoint_enable", + "step_over", + "step_into", + "step_out", + "continue_exec", + "interrupt", + "interrupt_async", + "backtrace", + "locals", + "local_values", + "set_local_variable", + "set_child_value", + "evaluate", + "threads", + "thread_select", + "frame_select", + "frame_info", + "read_memory", + "find_memory", + "disassemble", + "loaded_sources", + "modules", + "load_core", + "process_info", +} + def _encode_token(token: bytes) -> str: return base64.b64encode(token).decode("ascii") -def _prepare_state_dir(state_dir: Path): - state_dir.mkdir(parents=True, exist_ok=True) +def _best_effort_chmod(path: Path, mode: int): + try: + os.chmod(path, mode) + except OSError: + pass + + +def _best_effort_restrict_windows_acl(path: Path): if os.name != "nt": + return + user = getpass.getuser() + try: + subprocess.run( + ["icacls", str(path), "/inheritance:r", "/grant:r", 
f"{user}:F"], + stdin=subprocess.DEVNULL, + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + check=False, + ) + except OSError: + pass + + +def _prepare_state_dir(state_dir: Path): + state_dir.mkdir(mode=0o700, parents=True, exist_ok=True) + _best_effort_chmod(state_dir, 0o700) + _best_effort_restrict_windows_acl(state_dir) + + +def _write_owner_only_json(path: Path, payload: dict[str, Any]): + tmp_path = path.with_name(f".{path.name}.{os.getpid()}.tmp") + flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL + flags |= getattr(os, "O_BINARY", 0) + fd = os.open(tmp_path, flags, 0o600) + try: + with os.fdopen(fd, "w", encoding="utf-8") as handle: + json.dump(payload, handle) + handle.flush() + os.fsync(handle.fileno()) + _best_effort_chmod(tmp_path, 0o600) + _best_effort_restrict_windows_acl(tmp_path) + os.replace(tmp_path, path) + _best_effort_chmod(path, 0o600) + _best_effort_restrict_windows_acl(path) + finally: try: - os.chmod(state_dir, 0o700) - except OSError: + tmp_path.unlink() + except FileNotFoundError: pass @@ -42,21 +122,7 @@ def _write_state_file(state_file: Path, address: tuple[str, int], token: bytes): "token": _encode_token(token), "pid": os.getpid(), } - - flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC - flags |= getattr(os, "O_BINARY", 0) - if os.name != "nt": - flags |= getattr(os, "O_NOFOLLOW", 0) - - fd = os.open(str(state_file), flags, 0o600) - with os.fdopen(fd, "w", encoding="utf-8") as state_fp: - json.dump(payload, state_fp) - - if os.name != "nt": - try: - os.chmod(state_file, 0o600) - except OSError: - pass + _write_owner_only_json(state_file, payload) def _remove_state_file(state_file: Path): @@ -139,6 +205,8 @@ def handle(self, request: dict[str, Any]) -> tuple[dict[str, Any], bool]: self.close() try: + if method not in _ALLOWED_SESSION_METHODS: + raise RuntimeError(f"Unsupported session method: {method}") if self._session is None: self._session = LLDBSession() diff --git a/lldb/agent-harness/setup.py b/lldb/agent-harness/setup.py index 
bbf8c2232..82e8916bc 100644 --- a/lldb/agent-harness/setup.py +++ b/lldb/agent-harness/setup.py @@ -9,7 +9,7 @@ setup( name="cli-anything-lldb", - version="0.1.0", + version="1.0.0", description="CLI harness for LLDB debugger via Python API", long_description=_long_desc, long_description_content_type="text/markdown", @@ -26,6 +26,7 @@ entry_points={ "console_scripts": [ "cli-anything-lldb=cli_anything.lldb.lldb_cli:main", + "cli-anything-lldb-dap=cli_anything.lldb.dap:main", ], }, package_data={ diff --git a/registry.json b/registry.json index 7deef2280..ada6355f0 100644 --- a/registry.json +++ b/registry.json @@ -274,8 +274,8 @@ { "name": "lldb", "display_name": "LLDB", - "version": "0.1.0", - "description": "Stateful native debugging via the LLDB Python API with JSON-friendly inspection commands", + "version": "1.0.0", + "description": "Stateful native debugging via LLDB with JSON CLI workflows and a stdio Debug Adapter Protocol server", "requires": "LLDB installation with Python bindings available (for example LLVM.LLVM on Windows)", "homepage": "https://lldb.llvm.org", "source_url": null, diff --git a/skills/cli-anything-lldb/SKILL.md b/skills/cli-anything-lldb/SKILL.md index 37927290b..4141ea178 100644 --- a/skills/cli-anything-lldb/SKILL.md +++ b/skills/cli-anything-lldb/SKILL.md @@ -1,7 +1,7 @@ --- name: "cli-anything-lldb" description: Stateful LLDB debugging via LLDB Python API -version: 0.1.0 +version: 1.0.0 command: cli-anything-lldb install: pip install cli-anything-lldb requires: @@ -28,6 +28,7 @@ Use this CLI to run structured LLDB debugging workflows with JSON output. - Read/find process memory - Load core dumps - Interactive REPL with persistent session state +- Formal stdio Debug Adapter Protocol server for AI/editor clients ## Quick Commands @@ -35,13 +36,70 @@ Use this CLI to run structured LLDB debugging workflows with JSON output. 
cli-anything-lldb --json target create --exe /path/to/exe cli-anything-lldb --json process launch --arg foo --arg bar cli-anything-lldb --json breakpoint set --function main +cli-anything-lldb --json breakpoint set --function PluginEntry --allow-pending cli-anything-lldb --json process continue +cli-anything-lldb --json process interrupt cli-anything-lldb --json thread backtrace --limit 20 cli-anything-lldb --json frame locals cli-anything-lldb --json expr "myVar" cli-anything-lldb --json memory read --address 0x1000 --size 64 +cli-anything-lldb --json session close ``` +## Debug Adapter Protocol + +Use the DAP entry point when an AI client needs a real debug adapter lifecycle +instead of shelling out separate CLI commands: + +```bash +cli-anything-lldb-dap +cli-anything-lldb-dap --profile /path/to/stop-rules.json +``` + +or: + +```bash +cli-anything-lldb dap +cli-anything-lldb dap --profile /path/to/stop-rules.json +``` + +The DAP server speaks stdio `Content-Length` frames and must have exclusive +stdout. Do not print logs to stdout around it. Supported requests include +`initialize`, `launch`, `attach`, `configurationDone`, `setBreakpoints`, +`setFunctionBreakpoints`, `threads`, `stackTrace`, `scopes`, `variables`, +`setVariable`, `evaluate`, `continue`, `pause`, `next`, `stepIn`, `stepOut`, +`source`, `loadedSources`, `readMemory`, `modules`, `exceptionInfo`, +`disassemble`, and `disconnect`. + +DAP variables can expose child references for structs/classes/arrays. Use +`setVariable` only while stopped; LLDB may reject writes to optimized-out or +read-only values. + +For long-running GUI debuggees, DAP `continue` is non-blocking from the client's +point of view: the adapter sends the response and `continued` event first, then +waits for LLDB on a background thread. DAP `pause` uses LLDB async interrupt. 
+If an agent needs to change breakpoints while the debuggee is running, the +adapter interrupts first and waits for a stopped state before mutating LLDB +breakpoints; if the target does not stop in time, retry after an explicit +`pause`/`stopped` cycle. + +For GUI apps that stop on debugger-internal startup or shader-JIT breakpoints, +`launch` and `attach` accept the non-standard boolean argument +`autoContinueInternalBreakpoints`. Enable it only when those internal stops are +noise for the task; the adapter emits an `output` event before auto-continuing. +For target-specific noise, prefer structured stop rules through inline +`stopRules` or an external `stopRuleProfile`/`--profile` JSON file. Rules can +match by `reason`, `module`, `function`, and/or `regex`, then either `stop` with +clear `cliAnythingStop.origin` metadata or `continue` automatically. Use +profiles for apps such as C4D so their NVIDIA shader-JIT/startup traps live +outside the generic adapter. + +DAP `stopped` events include `body.cliAnythingStop.origin`: `manualPause` for a +client pause request, `internalTrap` for a matched internal rule, and `debuggee` +for ordinary program stops. Existing `cli-anything-lldb-dap` processes do not +hot-load new code or profile contents; restart the adapter and re-attach or +re-launch before expecting new rules to apply. + ## Command Groups ### target @@ -52,10 +110,11 @@ cli-anything-lldb --json target info ### process ```bash -cli-anything-lldb --json process launch [--arg ARG ...] [--env KEY=VALUE ...] [--cwd DIR] +cli-anything-lldb --json process launch [--arg ARG ...] [--env KEY=VALUE ...] 
[--cwd DIR] [--stop-at-entry] cli-anything-lldb --json process attach --pid 1234 cli-anything-lldb --json process attach --name myapp --wait-for cli-anything-lldb --json process continue +cli-anything-lldb --json process interrupt cli-anything-lldb --json process detach cli-anything-lldb --json process info ``` @@ -64,6 +123,7 @@ cli-anything-lldb --json process info ```bash cli-anything-lldb --json breakpoint set --function main cli-anything-lldb --json breakpoint set --file main.c --line 42 --condition "i > 10" +cli-anything-lldb --json breakpoint set --function LateLoadedSymbol --allow-pending cli-anything-lldb --json breakpoint list cli-anything-lldb --json breakpoint delete --id 1 cli-anything-lldb --json breakpoint enable --id 1 @@ -94,10 +154,15 @@ cli-anything-lldb --json core load --path /path/to/core ## Agent Usage Notes - Prefer `--json` for all automated flows. -- Non-REPL commands share state across separate invocations through the persistent session daemon until you run `session close` or the idle timeout expires. -- Use REPL when you want an interactive long-running debugger session: - - run `cli-anything-lldb` - - execute multi-step commands in one session +- Separate non-REPL invocations share a persistent session daemon by default. +- Use `--session-file PATH` or `CLI_ANYTHING_LLDB_SESSION_FILE` to pin an explicit session for a task. +- Run `cli-anything-lldb --json session close` when finished so attached processes detach and launched debuggees are cleaned up. +- Use REPL when a human-like interactive shell is more convenient, not because persistence requires it. +- Unresolved CLI breakpoints fail by default; pass `--allow-pending` only when a future module/symbol load is expected. +- DAP unresolved breakpoints use protocol semantics: `verified: false` until resolved. +- DAP `continue` is non-blocking for long-running GUI processes, and DAP `pause` uses async interrupt. 
+- DAP breakpoint changes during an active continue first interrupt and wait for a stopped state before mutating LLDB. +- Use DAP stop-rule profiles for app-specific internal traps; restart and re-attach/re-launch after profile changes. - `memory find` uses a chunked scan capped at 1 MiB per call. - Call `target create` before process or core commands. - Expect structured errors: `{"error": "...", "type": "..."}`
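
The stdio `Content-Length` framing that the Debug Adapter Protocol notes above rely on can be sketched as follows. This is a minimal illustration of the framing as described by the DAP base protocol, not the harness's own code: `encode_frame` and `read_frame` are hypothetical names introduced here, and they are assumed (not confirmed) to behave similarly to `encode_message` and `read_message` in `cli_anything.lldb.dap`.

```python
from __future__ import annotations

import io
import json


def encode_frame(payload: dict) -> bytes:
    """Serialize one DAP message as a Content-Length frame (hypothetical helper)."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def read_frame(stream) -> dict | None:
    """Read one frame from a binary stream; return None on EOF before a header."""
    length = None
    while True:
        line = stream.readline()
        if not line:
            return None  # stream closed before a complete header arrived
        line = line.strip()
        if not line:
            break  # a blank line terminates the header section
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    body = stream.read(length)
    if len(body) != length:
        raise ValueError("truncated DAP message body")
    return json.loads(body.decode("utf-8"))


# Round-trip an initialize request through an in-memory stream.
request = {
    "seq": 1,
    "type": "request",
    "command": "initialize",
    "arguments": {"adapterID": "cli-anything-lldb"},
}
stream = io.BytesIO(encode_frame(request))
decoded = read_frame(stream)
```

Because every byte on stdout is parsed this way, a stray `print()` from the adapter or a wrapper script would corrupt the next header, which is why diagnostics belong on stderr or in a log file.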