diff --git a/.dev/status/current-handoff.md b/.dev/status/current-handoff.md
index 4acf9cf..07733df 100644
--- a/.dev/status/current-handoff.md
+++ b/.dev/status/current-handoff.md
@@ -1,7 +1,7 @@
 # agent-memory current handoff
 
 Status: AI-authored draft. Not yet human-approved.
-Last updated: 2026-05-01 11:10 KST
+Last updated: 2026-05-01 11:37 KST
 
 ## Trigger for the next session
 
@@ -16,7 +16,7 @@ read this file first. Do not ask the user to restate context. Verify repo state,
 
 ## Ready-to-say answer
 
-agent-memory has been released and Hermes-QA'd through v0.1.39. We are currently in Priority 5 dogfood/noise monitoring, working on a slice that refines the JSON contract of `observations review-candidates` to be more operator-friendly, based on the v0.1.39 dogfood results. The branch is `feat/observation-review-temporal` and the worktree is `/Users/reddit/Project/agent-memory/.worktrees/observation-review-temporal`. The goal is to add a top-level count, per-ref observation window, and fact status-history summary to the review-candidates output so that historical injections are easier to tell apart from the current lifecycle state. Automatic cleanup/mutation is still not performed.
+agent-memory has been released and Hermes-QA'd through v0.1.40. We are currently in Priority 5 dogfood/noise monitoring, working on a read-only slice that strengthens diagnostics for empty retrievals and a high empty ratio. The branch is `feat/empty-retrieval-diagnostics` and the worktree is `/Users/reddit/Project/agent-memory/.worktrees/empty-retrieval-diagnostics`. The goal is to add `observations empty-diagnostics`, which groups empty-heavy observations by surface/scope/status filter so that a human can safely judge scope mismatches or insufficient approved-memory coverage. Automatic cleanup/mutation is still not performed.
 
 ## Current repo state
 
@@ -32,17 +32,17 @@ Expected GitHub identity:
 
 Verified before this slice:
 
-- latest completed release: `v0.1.39`
-- v0.1.39 added read-only `agent-memory observations review-candidates` and completed published smoke/Hermes runtime QA.
-- local Hermes hook uses `/Users/reddit/.agent-memory/runtime/v0.1.39/.venv/bin/python -m agent_memory.api.cli hermes-pre-llm-hook ...` against `/Users/reddit/.agent-memory/memory.db`.
+- latest completed release: `v0.1.40`
+- v0.1.40 added observation windows/counts/status-history summaries to review-candidates and completed published smoke/Hermes runtime QA.
+- local Hermes hook uses `/Users/reddit/.agent-memory/runtime/v0.1.40/.venv/bin/python -m agent_memory.api.cli hermes-pre-llm-hook ...` against `/Users/reddit/.agent-memory/memory.db`.
 - root checkout was clean on `main...origin/main` except local-only untracked state.
 - open PRs were `[]`.
 
 Active slice/worktree:
 
-- branch: `feat/observation-review-temporal`
-- worktree: `/Users/reddit/Project/agent-memory/.worktrees/observation-review-temporal`
-- intended release after merge: likely `v0.1.40`
+- branch: `feat/empty-retrieval-diagnostics`
+- worktree: `/Users/reddit/Project/agent-memory/.worktrees/empty-retrieval-diagnostics`
+- intended release after merge: likely `v0.1.41`
 
 Expected local untracked artifacts to preserve in the root checkout:
 
@@ -54,63 +54,76 @@ Expected local untracked artifacts to preserve in the root checkout:
 
 Do not delete or commit these unless the user explicitly asks.
 
-## Current slice: observation review temporal summaries
+## Current slice: empty retrieval diagnostics
 
 Goal:
 
 - Keep dogfood/noise monitoring read-only.
-- Make `observations review-candidates` easier to consume from local dogfood output.
-- Add compact count/window/history summaries without exposing raw user queries and without mutating memory.
+- Make high empty retrieval ratio actionable without storing or emitting raw user queries.
+- Diagnose empty-heavy segments by surface, preferred scope, and retrieval status filter before changing rankers or adding graph traversal.
 
 Implemented so far in the active worktree:
 
-- `observations audit` top refs now include `observation_window`:
-  - `first_observation_id`
-  - `first_observed_at`
-  - `latest_observation_id`
-  - `latest_observed_at`
-- `observations review-candidates` now includes top-level:
+- New CLI command:
+  - `agent-memory observations empty-diagnostics --limit 200 --top 10 --high-empty-threshold 0.5`
+- Output contract:
+  - `kind: retrieval_empty_diagnostics`
+  - `read_only: true`
   - `observation_count`
-  - `candidate_count`
-- Each review candidate now includes:
-  - the propagated `observation_window`
-  - `status_history_summary.transition_count`
-  - `status_history_summary.latest_transition`
+  - `empty_retrieval_count`
+  - `empty_retrieval_ratio`
+  - `quality_warnings`
+  - top-level `observation_window`
+  - `empty_by_surface[]`
+  - `empty_by_preferred_scope[]`
+  - `empty_by_status_filter[]`
+  - `suggested_next_steps`
+- Segment entries include:
+  - segment key (`surface`, `preferred_scope`, or `statuses`)
+  - `total_count`
+  - `empty_count`
+  - `empty_ratio`
+  - `signals`, currently `high_empty_segment` when the segment has at least one empty row and its ratio is at or above the threshold
+  - `sample_observation_ids`
+  - `observation_window`
+- Secret-safety preserved:
+  - no raw query text
+  - no query previews
+  - no prompt content
 - Docs updated:
   - `README.md`
   - `docs/hermes-dogfood.md`
 - Tests updated in `tests/test_cli.py`:
-  - audit regression asserts per-ref observation window.
-  - review-candidates regression asserts top-level counts and status history summary.
+  - new regression asserts empty diagnostics segment grouping, read-only shape, next-step hints, and no secret leakage from raw query strings.
 
 Verification so far:
 
 - RED confirmed:
-  - focused tests failed on missing `observation_window` and top-level `observation_count`.
+  - focused test initially failed because `empty-diagnostics` parser choice was missing.
 - GREEN focused:
-  - `TMPDIR=$PWD/.tmp-test uv run pytest tests/test_cli.py::test_python_module_cli_observations_audit_reports_frequent_and_stale_refs_without_raw_queries tests/test_cli.py::test_python_module_cli_observations_review_candidates_explains_top_refs_without_mutation_or_raw_queries -q`
-  - `2 passed`
+  - `TMPDIR=$PWD/.tmp-test uv run pytest tests/test_cli.py::test_python_module_cli_observations_empty_diagnostics_groups_empty_segments_without_raw_queries -q`
+  - `1 passed`
 
 Remaining before PR:
 
-1. Run broader/full local verification:
-   - focused CLI tests around audit/review-candidates
+1. Run broader focused CLI tests around observations audit/review-candidates/empty-diagnostics.
+2. Run full local verification:
    - `uv run pytest tests/ -q`
    - `uv run python scripts/check_release_metadata.py`
    - `uv run python scripts/smoke_release_readiness.py`
    - `npm pack --dry-run`
    - `git diff --check`
    - `node --check bin/agent-memory.js`
-2. Run real local DB smoke for `observations review-candidates` and verify the new fields exist.
-3. Run static diff secret scan.
-4. Create PR, watch CI, merge, follow release-sync/publish/published smoke/Hermes QA.
-5. After v0.1.40 install, repeat Hermes hook doctor and installed `observations review-candidates` against the existing local DB.
+3. Run real local DB smoke for `observations empty-diagnostics` and verify no raw query fields appear.
+4. Run static diff secret scan.
+5. Create PR, watch CI, merge, follow release-sync/publish/published smoke/Hermes QA.
+6. After v0.1.41 install, repeat Hermes hook doctor and installed `observations empty-diagnostics` against the existing local DB.
 
 ## Next natural slice after this one
 
-After the review-candidates contract is released and dogfooded, continue Priority 5 by either:
+After empty retrieval diagnostics are released and dogfooded, continue Priority 5 by either:
 
-1. improving retrieval diagnostics for empty retrieval/high empty ratio, or
-2. adding an explicit human review cadence/checklist around candidate reports.
+1. adding an explicit human review cadence/checklist around audit/review-candidates/empty-diagnostics, or
+2. improving candidate report UX further by bundling suggested follow-up commands into a richer read-only triage report.
 
-Avoid automatic cleanup/deprecation until the review candidate workflow has been used on real local data for a while.
+Avoid automatic cleanup/deprecation until the review and diagnostics workflow has been used on real local data for a while.
diff --git a/README.md b/README.md
index 0aedead..9a3dac2 100644
--- a/README.md
+++ b/README.md
@@ -109,10 +109,11 @@ For local dogfood and noise monitoring, retrievals can leave a secret-safe obser
 agent-memory retrieve "$DB" "How should I install agent-memory?" --preferred-scope user:default --observe cli
 agent-memory observations list "$DB" --limit 20
 agent-memory observations audit "$DB" --limit 200 --top 10 --frequent-threshold 3
+agent-memory observations empty-diagnostics "$DB" --limit 200 --top 10 --high-empty-threshold 0.5
 agent-memory observations review-candidates "$DB" --limit 200 --top 10 --frequent-threshold 3
 ```
 
-Use the observation log and audit report to spot frequently injected or surprising memories before changing retrieval behavior. The audit output is read-only JSON with surface/scope counts, empty-retrieval count and ratio, quality warnings such as `low_observation_count` or `high_empty_retrieval_ratio`, top injected memory refs, current status for known refs, per-ref observation windows, and simple signals such as `frequently_injected` and `current_status_not_approved`. `observations review-candidates` is also read-only; it turns the top audit refs into forensic candidates with top-level `observation_count`/`candidate_count`, fact review explanations, status-history summaries, replacement-chain hints, graph-neighborhood summaries, and copy-paste follow-up commands such as `review explain`, `review replacements`, and `graph inspect`. Treat these reports as local operator telemetry, not a synced analytics feature or an automatic cleanup workflow.
+Use the observation log and audit report to spot frequently injected or surprising memories before changing retrieval behavior. The audit output is read-only JSON with surface/scope counts, empty-retrieval count and ratio, quality warnings such as `low_observation_count` or `high_empty_retrieval_ratio`, top injected memory refs, current status for known refs, per-ref observation windows, and simple signals such as `frequently_injected` and `current_status_not_approved`. `observations empty-diagnostics` is read-only and focuses specifically on empty retrievals: it groups empty-heavy observations by surface, preferred scope, and status filter with segment ratios, sample observation ids, observation windows, and next-step hints for checking scope mismatches or missing approved memory coverage before changing rankers. `observations review-candidates` is also read-only; it turns the top audit refs into forensic candidates with top-level `observation_count`/`candidate_count`, fact review explanations, status-history summaries, replacement-chain hints, graph-neighborhood summaries, and copy-paste follow-up commands such as `review explain`, `review replacements`, and `graph inspect`. Treat these reports as local operator telemetry, not a synced analytics feature or an automatic cleanup workflow.
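Review note on the README paragraph above: it describes the `observations empty-diagnostics` report only in prose. Below is a minimal sketch of how an operator script might read that report, assuming the JSON shape listed in the handoff's output contract (`empty_by_surface`, `empty_by_preferred_scope`, `empty_by_status_filter`, with a `signals` list on each entry). The helper name `flagged_segments` and the sample report values are hypothetical and are not part of this change.

```python
import json

# Hypothetical sample shaped like the documented output contract; values are invented.
SAMPLE_REPORT = json.loads("""
{
  "kind": "retrieval_empty_diagnostics",
  "read_only": true,
  "empty_by_surface": [
    {"surface": "cli", "total_count": 3, "empty_count": 2, "empty_ratio": 0.6667,
     "signals": ["high_empty_segment"]}
  ],
  "empty_by_preferred_scope": [
    {"preferred_scope": "project:missing-scope", "total_count": 2, "empty_count": 2,
     "empty_ratio": 1.0, "signals": ["high_empty_segment"]},
    {"preferred_scope": "project:known", "total_count": 1, "empty_count": 0,
     "empty_ratio": 0.0, "signals": []}
  ],
  "empty_by_status_filter": []
}
""")

# Report list name -> key that holds the segment value inside each entry.
SEGMENT_LISTS = {
    "empty_by_surface": "surface",
    "empty_by_preferred_scope": "preferred_scope",
    "empty_by_status_filter": "statuses",
}


def flagged_segments(report):
    """Return (segment_kind, segment_value, empty_ratio) for flagged segments."""
    flagged = []
    for list_key, segment_key in SEGMENT_LISTS.items():
        for entry in report.get(list_key, []):
            if "high_empty_segment" in entry.get("signals", []):
                flagged.append((segment_key, entry[segment_key], entry["empty_ratio"]))
    return flagged


for kind, value, ratio in flagged_segments(SAMPLE_REPORT):
    print(f"{kind}={value!r} empty_ratio={ratio}")
```

In real use the report would come from the CLI's stdout (for example via `json.load(sys.stdin)`) rather than an inline sample. The sketch only reads fields and mutates nothing, matching the report's read-only intent.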
 
 ## Hermes quickstart
 
diff --git a/docs/hermes-dogfood.md b/docs/hermes-dogfood.md
index 0a3d948..92ddfc7 100644
--- a/docs/hermes-dogfood.md
+++ b/docs/hermes-dogfood.md
@@ -48,11 +48,14 @@ Hermes pre-LLM hook retrievals write a secret-safe local observation row to the
 ```bash
 agent-memory observations list ~/.agent-memory/memory.db --limit 20
 agent-memory observations audit ~/.agent-memory/memory.db --limit 200 --top 10 --frequent-threshold 3
+agent-memory observations empty-diagnostics ~/.agent-memory/memory.db --limit 200 --top 10 --high-empty-threshold 0.5
 agent-memory observations review-candidates ~/.agent-memory/memory.db --limit 200 --top 10 --frequent-threshold 3
 ```
 
 Use this before tuning ranking or adding broader graph traversal: first confirm which memories are frequently injected, which scopes are active, whether retrieval is often empty, and whether any frequently injected refs are now deprecated/disputed/missing. The audit command is read-only and summarizes local observation rows without emitting raw query text or query previews. Keep this data local unless you intentionally export it.
 
+When `empty_retrieval_ratio` is high, run `observations empty-diagnostics` before changing rankers. It is a read-only, secret-safe segment report for empty observations. It groups empty-heavy rows by surface, preferred scope, and status filter; includes each segment's total count, empty count, empty ratio, sample observation ids, and observation window; and suggests operator checks such as scope mismatch review or adding/approving durable memories only after confirming the misses are real user needs. It does not emit raw query text, query previews, or prompt content.
+
 `observations review-candidates` is the next read-only step after audit. It keeps the same secret-safe observation summary, then expands each top ref into a forensic candidate:
 
 - fact refs include the same lifecycle explanation as `agent-memory review explain fact ...`.
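Review note for the `cli.py` change that follows: the segment rule is easy to misread, so here is a standalone sketch of it. It mirrors the ratio computation, the `high_empty_segment` condition (ratio at or above the threshold and at least one empty row), and the sort key used in the diff. It runs on a hypothetical `Obs` stand-in rather than the project's real observation rows, and `segment_summary` and `rank` are illustrative names only.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Obs:
    """Hypothetical stand-in for an observation row (not the project's model)."""
    id: int
    preferred_scope: str
    retrieved_memory_refs: list = field(default_factory=list)


def segment_summary(scope, rows, threshold):
    # A row counts as "empty" when retrieval returned no memory refs.
    empty = [row for row in rows if not row.retrieved_memory_refs]
    ratio = len(empty) / len(rows) if rows else 0.0
    flagged = ratio >= threshold and len(empty) > 0
    return {
        "preferred_scope": scope,
        "total_count": len(rows),
        "empty_count": len(empty),
        "empty_ratio": round(ratio, 4),
        "signals": ["high_empty_segment"] if flagged else [],
    }


def rank(rows, threshold=0.5, top=10):
    by_scope = defaultdict(list)
    for row in rows:
        by_scope[row.preferred_scope].append(row)
    segments = [segment_summary(scope, group, threshold) for scope, group in by_scope.items()]
    # Most empty rows first, then highest ratio, then segment value as a tie-break.
    segments.sort(key=lambda s: (-s["empty_count"], -s["empty_ratio"], str(s["preferred_scope"])))
    return segments[:top]


rows = [
    Obs(1, "project:missing-scope"),
    Obs(2, "project:missing-scope"),
    Obs(3, "project:known", ["fact:1"]),
    Obs(4, "project:known"),
    Obs(5, "project:known", ["fact:1"]),
]
ranked = rank(rows)
for segment in ranked:
    print(segment)
```

One consequence of the `empty_count > 0` guard: a threshold of 0 still does not flag segments that have no empty rows.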
diff --git a/src/agent_memory/api/cli.py b/src/agent_memory/api/cli.py
index 5b69e90..003ebff 100644
--- a/src/agent_memory/api/cli.py
+++ b/src/agent_memory/api/cli.py
@@ -267,6 +267,140 @@ def _audit_retrieval_observations(
     }
 
 
+def _observation_window(observations) -> dict[str, Any] | None:
+    if not observations:
+        return None
+    first = min(observations, key=lambda observation: observation.id)
+    latest = max(observations, key=lambda observation: observation.id)
+    return {
+        "first_observation_id": first.id,
+        "first_observed_at": first.created_at,
+        "latest_observation_id": latest.id,
+        "latest_observed_at": latest.created_at,
+    }
+
+
+def _empty_diagnostic_segment_payload(
+    *,
+    segment_name: str,
+    segment_value: Any,
+    observations,
+    high_empty_threshold: float,
+) -> dict[str, Any]:
+    empty_observations = [observation for observation in observations if not observation.retrieved_memory_refs]
+    total_count = len(observations)
+    empty_count = len(empty_observations)
+    empty_ratio = empty_count / total_count if total_count else 0.0
+    signals = []
+    if empty_ratio >= high_empty_threshold and empty_count > 0:
+        signals.append("high_empty_segment")
+    return {
+        segment_name: segment_value,
+        "total_count": total_count,
+        "empty_count": empty_count,
+        "empty_ratio": round(empty_ratio, 4),
+        "signals": signals,
+        "sample_observation_ids": [observation.id for observation in empty_observations[:5]],
+        "observation_window": _observation_window(observations),
+    }
+
+
+def _empty_retrieval_diagnostics(
+    db_path: Path,
+    *,
+    limit: int,
+    top: int,
+    high_empty_threshold: float,
+) -> dict[str, Any]:
+    if limit < 1:
+        raise ValueError("observations empty-diagnostics limit must be >= 1")
+    if top < 1:
+        raise ValueError("observations empty-diagnostics top must be >= 1")
+    if high_empty_threshold < 0 or high_empty_threshold > 1:
+        raise ValueError("observations empty-diagnostics high empty threshold must be between 0 and 1")
+
+    observations = list_retrieval_observations(db_path, limit=limit)
+    empty_observations = [observation for observation in observations if not observation.retrieved_memory_refs]
+    empty_retrieval_ratio = len(empty_observations) / len(observations) if observations else 0.0
+
+    observations_by_surface: dict[str, list[Any]] = defaultdict(list)
+    observations_by_scope: dict[str | None, list[Any]] = defaultdict(list)
+    observations_by_statuses: dict[tuple[str, ...], list[Any]] = defaultdict(list)
+    for observation in observations:
+        observations_by_surface[observation.surface].append(observation)
+        observations_by_scope[observation.preferred_scope].append(observation)
+        observations_by_statuses[tuple(observation.statuses)].append(observation)
+
+    def sort_segments(items):
+        return sorted(
+            items,
+            key=lambda item: (-item["empty_count"], -item["empty_ratio"], str(next(iter(item.values())))),
+        )[:top]
+
+    empty_by_surface = sort_segments(
+        [
+            _empty_diagnostic_segment_payload(
+                segment_name="surface",
+                segment_value=surface,
+                observations=segment_observations,
+                high_empty_threshold=high_empty_threshold,
+            )
+            for surface, segment_observations in observations_by_surface.items()
+        ]
+    )
+    empty_by_preferred_scope = sort_segments(
+        [
+            _empty_diagnostic_segment_payload(
+                segment_name="preferred_scope",
+                segment_value=preferred_scope,
+                observations=segment_observations,
+                high_empty_threshold=high_empty_threshold,
+            )
+            for preferred_scope, segment_observations in observations_by_scope.items()
+        ]
+    )
+    empty_by_status_filter = sort_segments(
+        [
+            _empty_diagnostic_segment_payload(
+                segment_name="statuses",
+                segment_value=list(statuses),
+                observations=segment_observations,
+                high_empty_threshold=high_empty_threshold,
+            )
+            for statuses, segment_observations in observations_by_statuses.items()
+        ]
+    )
+
+    quality_warnings = []
+    if not observations:
+        quality_warnings.append("no_observations")
+    if 0 < len(observations) < 10:
+        quality_warnings.append("low_observation_count")
+    if empty_retrieval_ratio >= high_empty_threshold and observations:
+        quality_warnings.append("high_empty_retrieval_ratio")
+
+    return {
+        "kind": "retrieval_empty_diagnostics",
+        "read_only": True,
+        "observation_count": len(observations),
+        "limit": limit,
+        "top": top,
+        "high_empty_threshold": high_empty_threshold,
+        "empty_retrieval_count": len(empty_observations),
+        "empty_retrieval_ratio": round(empty_retrieval_ratio, 4),
+        "quality_warnings": quality_warnings,
+        "observation_window": _observation_window(observations),
+        "empty_by_surface": empty_by_surface,
+        "empty_by_preferred_scope": empty_by_preferred_scope,
+        "empty_by_status_filter": empty_by_status_filter,
+        "suggested_next_steps": [
+            "Run observations audit to compare empty vs non-empty retrieval surfaces.",
+            "Check preferred scope values for scope mismatches before changing ranking.",
+            "Add or approve memories only after confirming the missing queries represent durable user needs.",
+        ],
+    }
+
+
 def _review_candidates_from_observations(
     db_path: Path,
     *,
@@ -654,6 +788,14 @@ def _build_parser() -> argparse.ArgumentParser:
     observations_audit_parser.add_argument("--limit", type=int, default=200)
     observations_audit_parser.add_argument("--top", type=int, default=10)
     observations_audit_parser.add_argument("--frequent-threshold", type=int, default=3)
+    observations_empty_diagnostics_parser = observations_subparsers.add_parser(
+        "empty-diagnostics",
+        help="Build a read-only diagnostic report for empty retrieval observations.",
+    )
+    observations_empty_diagnostics_parser.add_argument("db_path", type=Path)
+    observations_empty_diagnostics_parser.add_argument("--limit", type=int, default=200)
+    observations_empty_diagnostics_parser.add_argument("--top", type=int, default=10)
+    observations_empty_diagnostics_parser.add_argument("--high-empty-threshold", type=float, default=0.5)
     observations_review_candidates_parser = observations_subparsers.add_parser(
         "review-candidates",
         help="Build a read-only forensic review report from top retrieval observation refs.",
@@ -1059,6 +1201,19 @@ def main() -> None:
                 )
             )
             return
+        if args.observations_action == "empty-diagnostics":
+            print(
+                json.dumps(
+                    _empty_retrieval_diagnostics(
+                        args.db_path,
+                        limit=args.limit,
+                        top=args.top,
+                        high_empty_threshold=args.high_empty_threshold,
+                    ),
+                    indent=2,
+                )
+            )
+            return
         if args.observations_action == "review-candidates":
             print(
                 json.dumps(
diff --git a/tests/test_cli.py b/tests/test_cli.py
index 329a8bc..8a9401b 100644
--- a/tests/test_cli.py
+++ b/tests/test_cli.py
@@ -385,6 +385,122 @@ def test_python_module_cli_observations_review_candidates_explains_top_refs_with
     assert "abc123" not in review_result.stdout
 
 
+def test_python_module_cli_observations_empty_diagnostics_groups_empty_segments_without_raw_queries(
+    tmp_path: Path,
+) -> None:
+    db_path = tmp_path / "observation-empty-diagnostics.db"
+    initialize_database(db_path)
+
+    env = {**os.environ, "PYTHONPATH": "src"}
+    for secret_query in (
+        "no matching alpha sensitive marker SUPERSECRET",
+        "no matching beta sensitive marker ABC123",
+    ):
+        retrieve_result = subprocess.run(
+            [
+                sys.executable,
+                "-m",
+                "agent_memory.api.cli",
+                "retrieve",
+                str(db_path),
+                secret_query,
+                "--preferred-scope",
+                "project:missing-scope",
+                "--observe",
+                "cli-test",
+            ],
+            cwd=Path(__file__).resolve().parents[1],
+            env=env,
+            capture_output=True,
+            text=True,
+        )
+        assert retrieve_result.returncode == 0, retrieve_result.stderr
+
+    source = ingest_source_text(
+        db_path=db_path,
+        source_type="transcript",
+        content="Empty diagnostics hit target phrase is EMPTY_DIAG_OK.",
+        metadata={"project": "empty-diagnostics"},
+    )
+    fact = create_candidate_fact(
+        db_path=db_path,
+        subject_ref="Empty diagnostics",
+        predicate="target_phrase",
+        object_ref_or_value="EMPTY_DIAG_OK",
+        evidence_ids=[source.id],
+        scope="project:empty-diagnostics",
+        confidence=0.95,
+    )
+    approve_fact(db_path=db_path, fact_id=fact.id)
+
+    hit_result = subprocess.run(
+        [
+            sys.executable,
+            "-m",
+            "agent_memory.api.cli",
+            "retrieve",
+            str(db_path),
+            "What is the empty diagnostics target phrase?",
+            "--preferred-scope",
+            "project:empty-diagnostics",
+            "--observe",
+            "cli-test",
+        ],
+        cwd=Path(__file__).resolve().parents[1],
+        env=env,
+        capture_output=True,
+        text=True,
+    )
+    assert hit_result.returncode == 0, hit_result.stderr
+
+    diagnostics_result = subprocess.run(
+        [
+            sys.executable,
+            "-m",
+            "agent_memory.api.cli",
+            "observations",
+            "empty-diagnostics",
+            str(db_path),
+            "--limit",
+            "20",
+            "--top",
+            "5",
+            "--high-empty-threshold",
+            "0.5",
+        ],
+        cwd=Path(__file__).resolve().parents[1],
+        env=env,
+        capture_output=True,
+        text=True,
+    )
+
+    assert diagnostics_result.returncode == 0, diagnostics_result.stderr
+    payload = json.loads(diagnostics_result.stdout)
+    assert payload["kind"] == "retrieval_empty_diagnostics"
+    assert payload["read_only"] is True
+    assert payload["observation_count"] == 3
+    assert payload["empty_retrieval_count"] == 2
+    assert payload["empty_retrieval_ratio"] == 0.6667
+    assert payload["empty_by_surface"][0]["surface"] == "cli-test"
+    assert payload["empty_by_surface"][0]["empty_count"] == 2
+    scope_segment = payload["empty_by_preferred_scope"][0]
+    assert scope_segment["preferred_scope"] == "project:missing-scope"
+    assert scope_segment["empty_count"] == 2
+    assert scope_segment["total_count"] == 2
+    assert scope_segment["empty_ratio"] == 1.0
+    assert scope_segment["signals"] == ["high_empty_segment"]
+    assert scope_segment["sample_observation_ids"]
+    assert scope_segment["observation_window"]["first_observation_id"] <= scope_segment["observation_window"]["latest_observation_id"]
+    assert payload["suggested_next_steps"] == [
+        "Run observations audit to compare empty vs non-empty retrieval surfaces.",
+        "Check preferred scope values for scope mismatches before changing ranking.",
+        "Add or approve memories only after confirming the missing queries represent durable user needs.",
+    ]
+    assert "SUPERSECRET" not in diagnostics_result.stdout
+    assert "ABC123" not in diagnostics_result.stdout
+
+
+
 def test_python_module_cli_observations_audit_reports_low_signal_empty_retrievals(tmp_path: Path) -> None:
     db_path = tmp_path / "observation-audit-empty.db"
     initialize_database(db_path)
diff --git a/tests/test_hermes_adapter.py b/tests/test_hermes_adapter.py
index 443cce7..2c35594 100644
--- a/tests/test_hermes_adapter.py
+++ b/tests/test_hermes_adapter.py
@@ -301,9 +301,13 @@ def test_build_hermes_adapter_payload_includes_alternative_memories_for_top_n_co
     )
     adapter_payload = build_hermes_adapter_payload(packet, top_k=3)
 
+    expected_memory_ids = [trace.memory_id for trace in packet.retrieval_trace[:3]]
+    assert len(expected_memory_ids) == 3
+    assert set(expected_memory_ids) == {branch_fact.id, owner_fact.id, deploy_fact.id}
+
     assert adapter_payload.top_memory.model_dump() == {
         "memory_type": "fact",
-        "memory_id": 1,
+        "memory_id": expected_memory_ids[0],
         "label": "Project Multi",
         "trust_band": "high",
         "has_hidden_alternatives": False,
@@ -311,14 +315,14 @@ def test_build_hermes_adapter_payload_includes_alternative_memories_for_top_n_co
     assert [memory.model_dump() for memory in adapter_payload.alternative_memories] == [
         {
             "memory_type": "fact",
-            "memory_id": 2,
+            "memory_id": expected_memory_ids[1],
             "label": "Project Multi",
             "trust_band": "high",
             "has_hidden_alternatives": False,
         },
         {
             "memory_type": "fact",
-            "memory_id": 3,
+            "memory_id": expected_memory_ids[2],
             "label": "Project Multi",
             "trust_band": "high",
             "has_hidden_alternatives": False,
@@ -327,15 +331,15 @@ def test_build_hermes_adapter_payload_includes_alternative_memories_for_top_n_co
     assert render_hermes_prompt_lines(adapter_payload) == [
         "Memory response mode: direct",
         "Prompt prefix: Answer directly using the top-ranked memory.",
-        "Top memory: fact #1 (Project Multi), trust=high, hidden_alternatives=no",
-        "Alternative memory: fact #2 (Project Multi), trust=high, hidden_alternatives=no",
-        "Alternative memory: fact #3 (Project Multi), trust=high, hidden_alternatives=no",
-        "Guideline: Use fact #1 (Project Multi) as the primary memory for the answer.",
+        f"Top memory: fact #{expected_memory_ids[0]} (Project Multi), trust=high, hidden_alternatives=no",
+        f"Alternative memory: fact #{expected_memory_ids[1]} (Project Multi), trust=high, hidden_alternatives=no",
+        f"Alternative memory: fact #{expected_memory_ids[2]} (Project Multi), trust=high, hidden_alternatives=no",
+        f"Guideline: Use fact #{expected_memory_ids[0]} (Project Multi) as the primary memory for the answer.",
         "Guideline: Answer directly; no uncertainty qualifier is required.",
         "Reason codes: top_ranked_memory, no_hidden_alternatives_detected",
     ]
     assert render_hermes_prompt_text(adapter_payload).splitlines()[3] == (
-        "Alternative memory: fact #2 (Project Multi), trust=high, hidden_alternatives=no"
+        f"Alternative memory: fact #{expected_memory_ids[1]} (Project Multi), trust=high, hidden_alternatives=no"
     )