
feat(retention): OSS 5-day community floor + prod-compose LOG_* fix (#1039 — OSS floor) #1065

Open
dolho wants to merge 5 commits into dev from feature/1039-retention-oss-floor

dolho commented Jun 4, 2026

Contributor

Summary

Stage 1 of Abilityai/trinity-enterprise#31 — the OSS floor (the issue's "must land first" chunk). The enterprise retention module (private submodule) + Settings → Retention UI follow as separate PRs on the #847 seam.

Prod bug fixed (silent no-op)

docker-compose.prod.yml omitted LOG_RETENTION_DAYS / LOG_ARCHIVE_ENABLED / LOG_CLEANUP_HOUR from backend.environment:. Prod launches standalone (-f docker-compose.prod.yml, no base-compose merge), so an operator-set LOG_RETENTION_DAYS never reached the container and retention always fell back to the code default. Added the three lines.
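The silent fallback can be illustrated with a minimal sketch. It assumes the backend reads the variable from the process environment and falls back to a code default; the helper name `read_retention_days` and the example value 30 are hypothetical, not taken from the PR.

```python
import os

CODE_DEFAULT_DAYS = 5  # code default after this PR (was 90 before)


def read_retention_days(environ=os.environ):
    """Return LOG_RETENTION_DAYS from the container env, else the code default.

    Hypothetical helper: the real lookup in the backend may differ.
    """
    raw = environ.get("LOG_RETENTION_DAYS")
    if raw is None or raw.strip() == "":
        # The variable never reached the container, for example because the
        # compose file does not list it under backend.environment.
        return CODE_DEFAULT_DAYS
    return int(raw)


# Operator sets the value on the host, but compose does not pass it through:
env_without_passthrough = {}
# Compose lists the variable, so the container sees the operator's value:
env_with_passthrough = {"LOG_RETENTION_DAYS": "30"}

print(read_retention_days(env_without_passthrough))
print(read_retention_days(env_with_passthrough))
```

Nothing fails in the first case, which is why the omission went unnoticed: the code default is a valid value.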

5-day community floor

Lowered the operator-tunable defaults to the 5-day community floor:

| Setting | Was | Now |
| --- | --- | --- |
| LOG_RETENTION_DAYS | 90 | 5 |
| execution_log_retention_days | 30 | 5 |
| execution_row_retention_days | 90 | 5 |
| health_check_retention_days | 7 | 5 |
| agent_soft_delete_retention_days | 180 | 5 |
| schedule_soft_delete_retention_days | 30 | 5 |
| audit_log_retention_days | 365 | 365 (exempt, integrity floor) |

New COMMUNITY_RETENTION_FLOOR_DAYS / RETENTION_OPS_KEYS constants — the enterprise module reuses these to clamp unentitled writes.
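A rough sketch of how such constants could be laid out and reused. Only the names `COMMUNITY_RETENTION_FLOOR_DAYS`, `RETENTION_OPS_KEYS`, and the setting keys come from this PR; the tuple contents, the dict layout, and the `community_defaults` helper are assumptions for illustration.

```python
COMMUNITY_RETENTION_FLOOR_DAYS = 5

# Assumed contents: the five OPS windows listed in the table above.
RETENTION_OPS_KEYS = (
    "execution_log_retention_days",
    "execution_row_retention_days",
    "health_check_retention_days",
    "agent_soft_delete_retention_days",
    "schedule_soft_delete_retention_days",
)

AUDIT_LOG_RETENTION_DAYS = 365  # exempt from the floor: integrity window


def community_defaults():
    """Build a default window map: OPS keys at the floor, audit kept at 365.

    Hypothetical helper, not a function from the PR.
    """
    defaults = {key: COMMUNITY_RETENTION_FLOOR_DAYS for key in RETENTION_OPS_KEYS}
    defaults["audit_log_retention_days"] = AUDIT_LOG_RETENTION_DAYS
    return defaults


print(community_defaults())
```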

Read surface

GET /api/settings/retention (admin) reports the effective windows in use + the active edition (community vs enterprise via the retention entitlement) + documented precedence: enterprise (license) DB setting → env → 5-day community default.
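The documented precedence can be sketched as a small resolver. This is an illustration under assumptions: the function name `effective_window`, its parameters, and the returned source labels are hypothetical, and the real endpoint may resolve values differently.

```python
COMMUNITY_DEFAULT_DAYS = 5


def effective_window(key, env_name, entitled, db_settings, environ):
    """Resolve one retention window and report which layer supplied it.

    Order: entitled DB setting, then environment variable, then the
    5-day community default. All names here are illustrative.
    """
    if entitled and key in db_settings:
        return db_settings[key], "enterprise"
    raw = environ.get(env_name)
    if raw:
        return int(raw), "env"
    return COMMUNITY_DEFAULT_DAYS, "community-default"


# Community install with nothing set: falls through to the default.
print(effective_window("log_retention_days", "LOG_RETENTION_DAYS", False, {}, {}))
# Entitled install with a DB value: the DB value wins over env.
print(effective_window(
    "log_retention_days", "LOG_RETENTION_DAYS",
    True, {"log_retention_days": 90}, {"LOG_RETENTION_DAYS": "30"},
))
```

Note that an unentitled install ignores the DB layer entirely in this sketch, so a stale DB value cannot outlive a lapsed entitlement.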

Design note

OSS does not hard-clamp env/OPS values — they remain an unsupported self-host escape hatch (per the issue: "not a cryptographic lock on a constant"). The clamp-to-floor lives in the enterprise retention module's managed setter (Stage 2).

⚠️ Release-notes call-out

This sharply shortens soft-delete recovery on community installs (agent 180→5, schedule 30→5). Deliberate per the issue; an enterprise license restores longer windows.

Verification

  • tests/unit/test_retention_floor.py: 7/7 (floor defaults, audit exemption, read-surface edition + windows, audit 365-floor, enterprise-set OPS window).
  • Cleanup/retention suites green (test_cleanup_inner_sweeps, test_execution_retention_prune, test_audit_retention_prune, test_agent_cleanup_parity).
  • Live: GET /api/settings/retention returns edition: community, OPS windows 5, audit 365; backend recreated with the new compose → LOG_RETENTION_DAYS=5 flows through to the container.

Remaining Abilityai/trinity-enterprise#31 (follow-up PRs)

  • Enterprise retention module (private trinity-enterprise): enterprise_retention_config table on the two-track runner, GET/PUT /api/enterprise/retention/* gated by requires_entitlement("retention"), live-read write-through, clamp-to-floor when unentitled.
  • Frontend: Settings → Retention panel gated on the retention entitlement; community shows the fixed 5-day default + upgrade hint.

Related to Abilityai/trinity-enterprise#31

🤖 Generated with Claude Code

dolho commented Jun 4, 2026

Contributor Author

Now spans all OSS-side stages of Abilityai/trinity-enterprise#31 (Stages 1 + 3); enterprise is trinity-enterprise#4

Per the plan to deliver Abilityai/trinity-enterprise#31 end-to-end, this PR now carries both public-repo stages; the enterprise backend lives in its own repo PR.

✅ Stage 1 — OSS floor (this PR)

Prod-compose LOG_* fix · 5-day floor across all 6 operator-tunable windows (audit exempt, 365) · GET /api/settings/retention read surface (+ edition + precedence) · COMMUNITY_RETENTION_FLOOR_DAYS/RETENTION_OPS_KEYS constants. 7/7 unit tests, live-verified.

✅ Stage 3 — frontend (this PR)

Settings → Retention tab (admin). Reads /api/settings/retention (all editions); community = read-only 5-day floor + upgrade hint; enterprise = editable per-class windows → PUT /api/enterprise/retention/config, applied live. Audit window shown, never editable. Gated via the existing enterprise store (isEntitled('retention')). SFC compiles clean (Vite HMR).

✅ Stage 2 — enterprise module → Abilityai/trinity-enterprise#4

Private enterprise_retention_config + GET/PUT /api/enterprise/retention/config (double-gated) + live-read write-through to OSS system_settings (no recreate) + clamp-to-floor. 5/5 module tests, enterprise suite 24/24.

Live end-to-end (full stack, submodule mounted)

  • enterprise_features = ['audit','retention','siem'] → read surface edition: enterprise.
  • PUT {execution_row:90, execution_log:2} → 90 allowed, 2 clamped to 5, and GET /api/settings/retention reflects execution_row=90: write-through confirmed, no recreate.
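The clamp behaviour in the PUT example above can be sketched as follows. The function name `clamp_to_floor` is hypothetical; the rule that 0 disables a sweep is taken from the Stage 3 commit message, and rejecting negative values is an assumption.

```python
COMMUNITY_RETENTION_FLOOR_DAYS = 5


def clamp_to_floor(days, floor=COMMUNITY_RETENTION_FLOOR_DAYS):
    """Raise sub-floor windows to the floor; keep 0 as 'sweep disabled'."""
    if days < 0:
        # Assumption: negative windows are invalid rather than clamped.
        raise ValueError("retention days must be >= 0")
    if days == 0:
        return 0  # 0 disables the sweep, so it is not raised to the floor
    return max(days, floor)


requested = {"execution_row": 90, "execution_log": 2}
applied = {name: clamp_to_floor(days) for name, days in requested.items()}
print(applied)  # {'execution_row': 90, 'execution_log': 5}
```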

Notes for the reviewer

  • The submodule pointer is intentionally NOT bumped in this PR (it would pin an unmerged enterprise branch); the UI gates at runtime via enterprise_features. Bump the pointer when trinity-enterprise#4 merges.
  • ⚠️ Release notes: community soft-delete recovery shrinks (agent 180→5, schedule 30→5).

Related to Abilityai/trinity-enterprise#31

dolho and others added 3 commits June 4, 2026 15:39
…ap (#1039)

OSS floor for the #1039 retention work (must land first; the enterprise
`retention` module + Settings UI follow on the #847 seam).

Prod bug fix:
- docker-compose.prod.yml omitted LOG_RETENTION_DAYS / LOG_ARCHIVE_ENABLED /
  LOG_CLEANUP_HOUR from backend.environment. Prod launches standalone (no
  base-compose merge), so operator-set values never reached the container and
  retention silently fell back to the code default. Added the three lines.

Community 5-day floor (was: log 90 / exec-log 30 / exec-row 90 / health 7 /
agent soft-delete 180 / schedule soft-delete 30):
- OPS_SETTINGS_DEFAULTS: the five operator-tunable windows default to 5.
- LOG_RETENTION_DAYS default 5 (log_archive_service, logs.py, docker-compose.yml).
- Audit log EXEMPT — keeps the 365-day integrity floor (audit_retention_service).
- New COMMUNITY_RETENTION_FLOOR_DAYS / RETENTION_OPS_KEYS constants (the
  enterprise module reuses these to clamp unentitled writes).

Read surface:
- GET /api/settings/retention (admin) reports the effective windows in use +
  the active edition (community vs enterprise via the `retention` entitlement)
  + documented precedence (enterprise → env → community-default).

OSS does NOT hard-clamp env/OPS — they remain an unsupported self-host escape
hatch (per the issue: "not a cryptographic lock on a constant"); the clamp is
the enterprise module's managed setter.

Tests: tests/unit/test_retention_floor.py (7) — floor defaults, audit exemption,
read-surface edition + windows. Cleanup/retention suites green (no test pinned
the old defaults).

NOTE: this sharply shortens soft-delete recovery (agent 180→5, schedule 30→5)
on community installs — call out in release notes; enterprise restores it.

Related to #1039

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (#1039 Stage 3)

Adds a Retention tab to Settings (admin-only). Reads the OSS read surface
GET /api/settings/retention (available in every edition) and shows the effective
windows + an edition badge.

- Community: read-only windows at the fixed 5-day floor + an upgrade hint.
- Enterprise (retention entitlement present): editable per-class windows that
  PUT /api/enterprise/retention/config and apply live (no restart). 0 disables
  a sweep; sub-floor values are raised to the floor (mirrors the backend clamp).
- Audit-log window always shown, never editable (365-day integrity floor).

Gating reuses the existing enterprise store (enterpriseStore.isEntitled
('retention')); the tab is visible in both editions (read-only vs editable),
matching the issue's "community shows the fixed default + upgrade hint".

Related to #1039

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ollution (#1039)

The regression-diff CI gate flagged the 4 endpoint tests in test_retention_floor
as new failures under pytest-randomly seeds 12345/67890 (but not 99999) — a
classic test-ordering pollution, not a logic bug.

Root cause: the tests imported `routers.settings` (lazily), which drags
routers/__init__ → routers.agents → `from services.agent_service import
get_agents_by_prefix`. Another unit test (#612) loads services.agent_service
under a fake sys.modules name, so under some orderings that import resolves to
the partial fake module and raises `ImportError: cannot import name
'get_agents_by_prefix'`.

Fix: load routers/settings.py directly from file under a private module name
(spec_from_file_location), bypassing routers/__init__ entirely. settings.py
imports only models/database/dependencies/services.* — none of the polluted
modules — so the load is robust regardless of collection order. Mirrors the
existing conftest EntitlementCls pattern. Also pins LOG_/AUDIT_ env per call so
a polluted process env can't leak in.
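The direct-from-file load described above can be sketched with the standard importlib recipe. The helper name, the private module name, and the temporary stand-in file are illustrative only; the actual test presumably points at routers/settings.py.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_file(path, private_name):
    """Load a .py file as a module under a private name.

    The package __init__ is never imported, so whatever it would pull in
    (including modules replaced by other tests) is bypassed.
    """
    spec = importlib.util.spec_from_file_location(private_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[private_name] = module  # register before executing the module
    spec.loader.exec_module(module)
    return module


# Stand-in file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "settings.py"
    target.write_text("FLOOR = 5\n")
    mod = load_module_from_file(target, "_retention_test_settings")
    print(mod.FLOOR)
```

Using a private name keeps the loaded copy separate from any `routers.settings` entry another test may have placed in `sys.modules`.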

Verified: full unit suite under seeds 12345 + 67890 — all 7 test_retention_floor
tests pass; the only remaining failures are the 7 pre-existing base failures
(git_pull_branch, orphaned_execution_recovery), unchanged by this PR.

Related to #1039

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@dolho dolho force-pushed the feature/1039-retention-oss-floor branch from af8175a to b10a57b Compare June 4, 2026 12:41
@dolho dolho marked this pull request as draft June 4, 2026 14:38
@dolho dolho changed the title feat(retention): OSS 5-day community floor + prod-compose LOG_* fix (#1039 — OSS floor) WIP: feat(retention): OSS 5-day community floor + prod-compose LOG_* fix (#1039 — OSS floor) Jun 4, 2026

github-actions Bot commented Jun 5, 2026


⚠️ Nightly unit-suite check skipped — merge conflict against dev.

Resolve by running git merge dev locally and pushing the result. The next nightly run will re-test once the conflict is gone.

…-oss-floor

# Conflicts:
#	docs/memory/architecture.md
#	src/frontend/src/views/Settings.vue
@dolho dolho changed the title WIP: feat(retention): OSS 5-day community floor + prod-compose LOG_* fix (#1039 — OSS floor) feat(retention): OSS 5-day community floor + prod-compose LOG_* fix (#1039 — OSS floor) Jun 23, 2026
@dolho dolho marked this pull request as ready for review June 23, 2026 10:15
@dolho dolho requested a review from vybe June 23, 2026 10:15

@vybe vybe left a comment

Contributor

Validated via /validate-pr: one blocker. The OSS 5-day floor logic, the prod-compose LOG_* packaging fix, and the tests are all good.

The architecture.md hunk re-introduces enterprise-module disclosure that the just-merged trinity-enterprise#45 cleanup (the current top commit on dev) removed:

Precedence is enterprise retention license (DB setting) → env → 5-day community default … the enterprise retention module is the managed setter.

Per the CLAUDE.md standing rule, public docs describe the generic open-core seam only — no named paid modules or per-module behavior. The guard CI doesn't catch this (its pattern is SIEM/SCIM/2FA/SSO only), so it would silently re-land.

Fix: genericize that one paragraph to the seam, e.g. "an optional entitled override via the entitlement registry (#847) → env → 5-day community default", without naming the enterprise retention module/license.

— posted via /validate-pr

…seam (#1039)

Review feedback (vybe via /validate-pr): the architecture.md hunk re-introduced
named enterprise-module disclosure that the trinity-enterprise#45 cleanup
removed — it named the enterprise `retention` module/license/entitlement as the
managed setter. Per the CLAUDE.md standing rule, public docs describe the
generic open-core seam only (the CI guard's pattern is SIEM/SCIM/2FA/SSO, so it
didn't catch this).

Genericized both spots (the Cleanup Service sweeps precedence note and the
`GET /api/settings/retention` row) to "an optional entitled override via the
entitlement registry (#847) → env → 5-day community default" — no named module,
license, or per-module behavior. OSS floor logic, prod-compose LOG_* fix, and
tests unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

dolho commented Jun 25, 2026

Contributor Author

Addressed the blocker in 33bf206. Genericized the retention precedence to the open-core seam in both spots — the Cleanup Service sweeps note and the GET /api/settings/retention endpoint row (the latter had the same retention entitlement disclosure):

Precedence: an optional entitled override (via the entitlement registry, #847) → env → 5-day community default; OSS does not hard-clamp.

No named enterprise module / license / per-module behavior remains (verified by grep). OSS floor logic, the prod-compose LOG_* fix, and tests are unchanged.
