70 changes: 41 additions & 29 deletions .dev/status/current-handoff.md
@@ -1,7 +1,7 @@
# agent-memory current handoff

Status: AI-authored draft. Not yet human-approved.
Last updated: 2026-04-30 17:12 KST
Last updated: 2026-04-30 21:49 KST

## Trigger for the next session

@@ -16,11 +16,9 @@ read this file first. Do not ask the user to restate context. Verify repo state,

## Ready-to-say answer

agent-memory has completed the main truth lifecycle pieces of Priority 1-4 of the OSS default memory layer reliability work through v0.1.30, and retrieval-eval determinism hardening is in progress as the next stabilization slice.
agent-memory has completed the main truth lifecycle pieces of Priority 1-4 of the OSS default memory layer reliability work, plus retrieval-eval read-only hardening, through v0.1.31. The current slice reduces the manual release metadata sync procedure, which kept recurring because of protected `main`, to an automatic fallback PR flow.

The latest verified release is v0.1.30. v0.1.27 added status transition history, v0.1.28 added npm wrapper stdin forwarding and the published Hermes hook smoke, v0.1.29 added the fact supersession/replacement relation, and v0.1.30 added the `agent-memory review explain fact ...` decision explanation UX. The local Hermes hook has also been updated to the v0.1.30 runtime, and the doctor/hook smoke passes.

The current slice is determinism hardening: retrieval evaluation uses the real retrieval path, but the eval run itself must not mutate `retrieval_count`, `reinforcement_count`, or `last_accessed_at`. The goal is that fixture order and repeated runs cannot perturb later ranking results.
The latest verified release is v0.1.31. v0.1.27 added status transition history, v0.1.28 added npm wrapper stdin forwarding and the published Hermes hook smoke, v0.1.29 added the fact supersession/replacement relation, v0.1.30 added the `agent-memory review explain fact ...` decision explanation UX, and v0.1.31 added retrieval eval read-only behavior. The local Hermes hook has also been updated to the v0.1.31 runtime, and the doctor/hook smoke passes.

## Current repo state

@@ -37,17 +35,17 @@ Expected GitHub identity:
Verified base before this slice:

- branch: `main`
- HEAD: `5011d99 chore: release v0.1.30 [skip release] (#28)`
- tag/release: `v0.1.30`
- GitHub Release: `https://github.com/cafitac/agent-memory/releases/tag/v0.1.30`
- npm: `@cafitac/agent-memory@0.1.30`
- PyPI: `cafitac-agent-memory==0.1.30`
- v0.1.30 published smoke artifact: passed; includes npm/uvx/pipx Hermes hook commands.
- HEAD: `6d955bb chore: release v0.1.31 [skip release] (#30)`
- tag/release: `v0.1.31`
- GitHub Release: `https://github.com/cafitac/agent-memory/releases/tag/v0.1.31`
- npm: `@cafitac/agent-memory@0.1.31`
- PyPI: `cafitac-agent-memory==0.1.31`
- v0.1.31 published smoke artifact: passed after a propagation retry; includes npm/uvx/pipx Hermes hook commands.

Active slice/worktree:

- branch: `fix/retrieval-eval-deterministic-ordering`
- worktree: `/Users/reddit/Project/agent-memory/.worktrees/retrieval-eval-deterministic-ordering`
- branch: `ci/auto-release-sync-pr`
- worktree: `/Users/reddit/Project/agent-memory/.worktrees/auto-release-sync-pr`

Expected local untracked artifacts to preserve in the root checkout:

@@ -59,26 +57,26 @@ Expected local untracked artifacts to preserve in the root checkout:

Do not delete or commit these unless the user explicitly asks.

## What is complete through v0.1.30
## What is complete through v0.1.31

### Distribution and release automation

- npm package and PyPI package are published from the same versioned source.
- npm-first user install path is documented and verified.
- main merge auto-release is active but protected `main` can block release metadata write-back; if that happens, use release-sync PR + tag push.
- Publish workflow gates GitHub Release creation on `published-install-smoke` after npm/PyPI publish.
- Published smoke uploads `published-install-smoke-result` JSON artifact with success/failure diagnostics.
- v0.1.28+ smoke covers npm/npx/npm-exec/uvx/pipx and Hermes hook stdin payload handling.
- Known repeated pain point before this slice: protected `main` blocked auto-release metadata write-back, requiring manual release-sync PR + manual tag push.

### Runtime adapter readiness

- Hermes bootstrap/doctor/install flow exists and defaults to the conservative preset.
- This local Hermes setup has agent-memory enabled via `/Users/reddit/.agent-memory/runtime/v0.1.30/.venv/bin/agent-memory` against `/Users/reddit/.agent-memory/memory.db`.
- This local Hermes setup has agent-memory enabled via `/Users/reddit/.agent-memory/runtime/v0.1.31/.venv/bin/agent-memory` against `/Users/reddit/.agent-memory/memory.db`.
- Hermes hook fails closed: unavailable DB/schema returns `{}` and exit 0 instead of breaking prompt flow.
- Conservative preset remains default: small prompt budgets, one top memory, no alternative-memory detail, no reason-code noise.
- `--preset balanced` is explicit opt-in for more context/noise.

### Truth lifecycle readiness
### Truth lifecycle and eval readiness

- Normal retrieval is approved-only by default.
- Candidate/disputed/deprecated facts remain available only behind explicit forensic/review surfaces.
@@ -89,28 +87,35 @@ Do not delete or commit these unless the user explicitly asks.
- Superseding a fact deprecates the old fact and approves the replacement fact, preserving reason/actor/evidence in transition history.
- `agent-memory review replacements fact ...` exposes replacement chains.
- `agent-memory review explain fact ...` explains status, default retrieval visibility, same claim-slot alternatives, replacement chain, and review follow-up commands.
- Retrieval eval calls the real retrieval path but suppresses retrieval bookkeeping writes (`retrieval_count`, `reinforcement_count`, `last_accessed_at`).
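
The read-only eval behavior can be illustrated with a toy model. This is a minimal sketch, not the project's implementation: the `Fact` class, `retrieve`, and `evaluate` are hypothetical stand-ins, and only the `record_retrievals` flag name and the bookkeeping field names are taken from this handoff.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fact:
    # Hypothetical in-memory record; the real project stores facts in a database.
    text: str
    retrieval_count: int = 0
    last_accessed_at: Optional[float] = None


def retrieve(facts: List[Fact], query: str, *, record_retrievals: bool = True,
             now: float = 0.0) -> List[Fact]:
    """Toy retrieval path: substring match plus optional bookkeeping writes."""
    hits = [fact for fact in facts if query.lower() in fact.text.lower()]
    if record_retrievals:
        for fact in hits:
            fact.retrieval_count += 1
            fact.last_accessed_at = now
    return hits


def evaluate(facts: List[Fact], queries: List[str]) -> List[int]:
    """Eval goes through the same retrieve() but never records bookkeeping."""
    return [len(retrieve(facts, q, record_retrievals=False)) for q in queries]


facts = [Fact("Hermes hook fails closed"), Fact("Conservative preset is default")]
before = [(f.retrieval_count, f.last_accessed_at) for f in facts]
hit_counts = evaluate(facts, ["hermes", "preset", "hermes"])
after = [(f.retrieval_count, f.last_accessed_at) for f in facts]
```

Because the flag defaults to on, normal runtime callers keep recording retrievals, while repeated or reordered eval runs leave the counters exactly as they found them.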

## Current slice: retrieval-eval deterministic ordering hardening
## Current slice: protected-main release fallback automation

Planned behavior:

- `evaluate_retrieval_fixtures(...)` continues to call the real `retrieve_memory_packet(...)` path.
- Evaluation runs suppress retrieval bookkeeping writes so repeated evals and fixture order cannot change future ranking via reinforcement state.
- Normal runtime retrieval still records approved memory retrievals by default.
- Main merge auto-release still tries the direct metadata write-back first.
- If `git push origin HEAD:main` is rejected by GitHub rules/protected `main`, auto-release should not fail the whole release path immediately.
- It should create a `release-sync/vX.Y.Z` branch from the already-bumped commit and open a PR titled `chore: release vX.Y.Z [skip release]`.
- The direct publish dispatch should run only when direct main push/tag push succeeds.
- After the release-sync PR is merged, a separate auto-release job should recognize the `[skip release]` release-sync commit, create/push the missing annotated tag, and dispatch `publish.yml`.
- If the tag already exists, the release-sync follow-up job should no-op rather than republishing.
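
The follow-up job's decision can be sketched as a pure function. This is an illustrative model under stated assumptions: `release_sync_action` and its return labels are hypothetical names, and the checks mirror the conditions described above (release-sync commit prefix, `[skip release]` marker, tag present in the message, no-op when the tag already exists).

```python
from typing import Iterable


def release_sync_action(commit_message: str, metadata_version: str,
                        remote_tags: Iterable[str]) -> str:
    """Return 'skip', 'error', 'noop', or 'tag_and_dispatch' for a push to main."""
    is_release_sync = (
        commit_message.startswith("chore: release v")
        and "[skip release]" in commit_message
    )
    if not is_release_sync:
        # Ordinary merges are handled by the normal auto-release job instead.
        return "skip"
    tag = f"v{metadata_version}"
    if tag not in commit_message:
        # Metadata and commit message disagree; refuse to tag.
        return "error"
    if tag in set(remote_tags):
        # The tag already exists, so publishing again would be a duplicate.
        return "noop"
    return "tag_and_dispatch"
```

Keeping the tag-exists check ahead of any push is what makes a re-run of the job safe: a second run sees the tag and stops.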

Implementation direction:

- Add/keep a default-on `record_retrievals` option on `retrieve_memory_packet(...)`.
- Call `retrieve_memory_packet(..., record_retrievals=False)` from `core/retrieval_eval.py`.
- Test that eval does not mutate `retrieval_count`, `reinforcement_count`, or `last_accessed_at`.
- Update `.github/workflows/auto-release.yml` permissions to include `pull-requests: write`.
- Add `id: push_release` and a protected-main rejection branch around the direct push step.
- Add a `gh pr create` fallback step guarded by `steps.push_release.outputs.release_sync_required == 'true'`.
- Add a `tag-and-publish-release-sync` job for merged `chore: release v... [skip release]` commits.
- Keep `[skip release]` as the anti-recursion marker.
- Keep publish creation inside `publish.yml`; auto-release should only dispatch it.
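
The protected-main rejection branch comes down to classifying the direct push result. A minimal Python sketch of that classification follows, assuming the two stderr markers used by the workflow change in this PR (`GH013` and `Changes must be made through a pull request`); the function name and return labels are hypothetical.

```python
import re

# Markers that the workflow treats as "protected main rejected the push".
PROTECTED_MAIN_PATTERN = re.compile(
    r"GH013|Changes must be made through a pull request"
)


def classify_push(returncode: int, stderr: str) -> str:
    """Map a `git push origin HEAD:main` result to the next release step."""
    if returncode == 0:
        # Direct write-back worked: push the tag and dispatch publish.
        return "pushed"
    if PROTECTED_MAIN_PATTERN.search(stderr):
        # Branch protection blocked the push: open a release-sync PR instead.
        return "release_sync_required"
    # Any other failure (network, auth, conflicts) should still fail the run.
    return "failed"
```

Only the recognized rejection is converted into the PR fallback; unknown push failures keep failing loudly, so the fallback cannot mask unrelated problems.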

## Verification checklist for this slice

Run from the active worktree:

```bash
uv run pytest tests/test_retrieval_evaluation.py::test_evaluate_retrieval_fixtures_does_not_mutate_retrieval_counters -q
uv run pytest tests/test_retrieval_evaluation.py tests/test_retrieval_trace.py tests/test_hermes_adapter.py -q
uv run pytest tests/test_release_workflows.py -q
uv run pytest tests/test_published_install_smoke.py -q
uv run pytest tests/ -q
uv run python scripts/check_release_metadata.py
uv run python scripts/smoke_release_readiness.py
@@ -123,13 +128,20 @@ Before PR, run a static diff secret scan and confirm finding_count 0.

## PR/release notes

This slice should be a patch release candidate, likely v0.1.31 after PR merge. If protected `main` blocks auto-release write-back again, use the existing release-sync PR + tag push workaround.
This slice changes only release automation/docs/tests, but it affects the release path and should be treated as a patch release candidate, likely v0.1.32 after PR merge.

Expected live verification after merge:

After release, verify GitHub Release/npm/PyPI/published-install-smoke and update the local Hermes runtime only if the release contains runtime-relevant package changes. This slice changes retrieval/eval behavior in the Python package, so a v0.1.31 runtime update is still preferred for dogfood parity.
1. The auto-release run for the PR merge should bump metadata to v0.1.32.
2. If protected `main` still rejects direct write-back, the run should open `release-sync/v0.1.32` PR automatically.
3. Merge that PR.
4. Confirm the release-sync follow-up job creates tag `v0.1.32`, dispatches publish, and published smoke passes.
5. Verify GitHub Release/npm/PyPI/published-install-smoke artifact.
6. Update local Hermes runtime to v0.1.32 only after package release is verified.

## Next likely slices after this

1. Release workflow protected-main automation/fallback improvement.
1. Published smoke propagation handling improvement: make first-run simple-index lag less noisy.
2. Actual Hermes dogfood observations and noise/latency notes.
3. Graph foundation read-only slice: graph inspection CLI or bounded relation traversal eval fixtures.
4. PyPI Trusted Publisher later; user deferred it.
106 changes: 104 additions & 2 deletions .github/workflows/auto-release.yml
@@ -9,6 +9,7 @@ on:
permissions:
  contents: write
  actions: write
  pull-requests: write

concurrency:
  group: auto-release-main
@@ -76,16 +77,117 @@ jobs:
          git tag -a "${{ steps.bump.outputs.tag }}" -m "Release ${{ steps.bump.outputs.tag }}"

      - name: Push release commit and tag
        id: push_release
        run: |
          set -euo pipefail
          git push origin HEAD:main
          TAG="${{ steps.bump.outputs.tag }}"
          git push origin "$TAG"
          PUSH_LOG="$(mktemp)"
          if git push origin HEAD:main 2>"$PUSH_LOG"; then
            git push origin "$TAG"
            echo "release_sync_required=false" >> "$GITHUB_OUTPUT"
          elif grep -q "GH013\|Changes must be made through a pull request" "$PUSH_LOG"; then
            cat "$PUSH_LOG" >&2
            echo "Protected main rejected the release metadata write-back; opening a release-sync PR instead." >&2
            echo "release_sync_required=true" >> "$GITHUB_OUTPUT"
          else
            cat "$PUSH_LOG" >&2
            exit 1
          fi

      - name: Create release sync pull request after protected main rejection
        if: steps.push_release.outputs.release_sync_required == 'true'
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          set -euo pipefail
          RELEASE_SYNC_BRANCH="release-sync/${{ steps.bump.outputs.tag }}"
          git push origin "HEAD:${RELEASE_SYNC_BRANCH}"
          cat > /tmp/release-sync-pr.md <<'EOF'
          ## Summary
          - sync release metadata after protected main rejected the auto-release write-back
          - keep the release commit marked with `[skip release]` so it cannot recursively bump another patch version

          ## Next automation
          Publish workflow will run after the release sync PR is merged and the tag is pushed.
          EOF
          gh pr create \
            --repo "${{ github.repository }}" \
            --base main \
            --head "${RELEASE_SYNC_BRANCH}" \
            --title "chore: release ${{ steps.bump.outputs.tag }} [skip release]" \
            --body-file /tmp/release-sync-pr.md

      - name: Dispatch publish workflow for bot-created tag
        if: steps.push_release.outputs.release_sync_required == 'false'
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          set -euo pipefail
          TAG="${{ steps.bump.outputs.tag }}"
          gh workflow run publish.yml --ref "$TAG" -f publish_pypi=true -f publish_npm=true

  tag-and-publish-release-sync:
    if: >-
      github.ref == 'refs/heads/main' &&
      contains(github.event.head_commit.message, '[skip release]') &&
      startsWith(github.event.head_commit.message, 'chore: release v')
    runs-on: ubuntu-latest
    steps:
      - name: Checkout main with full history
        uses: actions/checkout@v5
        with:
          fetch-depth: 0
          persist-credentials: true

      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: '3.11'

      - name: Set up uv
        uses: astral-sh/setup-uv@v8.1.0

      - name: Configure release bot identity
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

      - name: Resolve release sync version
        id: release_sync
        run: |
          set -euo pipefail
          uv run python scripts/check_release_metadata.py
          VERSION=$(uv run python - <<'PY'
          from agent_memory.release_metadata import validate_release_metadata
          print(validate_release_metadata().python_package_version)
          PY
          )
          TAG="v${VERSION}"
          if ! grep -q "${TAG}" <<<"${{ github.event.head_commit.message }}"; then
            echo "Release sync commit message does not match metadata tag ${TAG}" >&2
            exit 1
          fi
          if git ls-remote --exit-code --tags origin "refs/tags/${TAG}" >/dev/null 2>&1; then
            echo "Tag ${TAG} already exists; nothing to publish."
            echo "tag_exists=true" >> "$GITHUB_OUTPUT"
          else
            echo "tag_exists=false" >> "$GITHUB_OUTPUT"
          fi
          echo "tag=${TAG}" >> "$GITHUB_OUTPUT"

      - name: Push release sync tag
        if: steps.release_sync.outputs.tag_exists == 'false'
        run: |
          set -euo pipefail
          TAG="${{ steps.release_sync.outputs.tag }}"
          git tag -a "$TAG" -m "Release $TAG"
          git push origin "$TAG"

      - name: Dispatch publish workflow for release sync tag
        if: steps.release_sync.outputs.tag_exists == 'false'
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          set -euo pipefail
          TAG="${{ steps.release_sync.outputs.tag }}"
          gh workflow run publish.yml --ref "$TAG" -f publish_pypi=true -f publish_npm=true
2 changes: 2 additions & 0 deletions README.md
@@ -248,6 +248,8 @@ npm pack --dry-run

After a release publishes, the `published-install-smoke` workflow verifies the exact npm/PyPI version through npm registry lookup, `npx`, `npm exec`, `uvx`, and `pipx`. Maintainers can also run it manually with `gh workflow run published-install-smoke.yml -f version=<version>`.

Release automation expects protected `main`: if the auto-release workflow cannot push its bumped metadata commit directly, it opens a `release-sync/vX.Y.Z` PR instead. After that PR is merged, the same workflow tags the synced version and dispatches `publish.yml`, keeping the release path automated without requiring a permanent branch-protection bypass.

Useful source-checkout commands:

```bash
13 changes: 13 additions & 0 deletions tests/test_release_workflows.py
@@ -21,6 +21,19 @@ def test_auto_release_workflow_bumps_versions_on_main_merges() -> None:
    assert "--ref \"$TAG\"" in workflow


def test_auto_release_workflow_falls_back_to_release_sync_pr_when_main_is_protected() -> None:
    workflow = (PROJECT_ROOT / ".github" / "workflows" / "auto-release.yml").read_text()

    assert "pull-requests: write" in workflow
    assert "Create release sync pull request after protected main rejection" in workflow
    assert "release-sync/${{ steps.bump.outputs.tag }}" in workflow
    assert "git push origin \"HEAD:${RELEASE_SYNC_BRANCH}\"" in workflow
    assert "gh pr create" in workflow
    assert "chore: release ${{ steps.bump.outputs.tag }} [skip release]" in workflow
    assert "steps.push_release.outputs.release_sync_required == 'true'" in workflow
    assert "Publish workflow will run after the release sync PR is merged and the tag is pushed." in workflow


def test_publish_workflow_remains_tag_driven_only() -> None:
    workflow = (PROJECT_ROOT / ".github" / "workflows" / "publish.yml").read_text()
