feat(retention): OSS 5-day community floor + prod-compose LOG_* fix (#1039 — OSS floor)#1065
dolho wants to merge 5 commits into
Now spans all OSS-side stages of Abilityai/trinity-enterprise#31 (Stages 1 + 3); enterprise is trinity-enterprise#4
Per the plan to deliver Abilityai/trinity-enterprise#31 end-to-end, this PR now carries both public-repo stages; the enterprise backend lives in its own repo PR.
✅ Stage 1: OSS floor (this PR)
Prod-compose
✅ Stage 3: frontend (this PR)
Settings → Retention tab (admin). Reads
✅ Stage 2: enterprise module → Abilityai/trinity-enterprise#4
Private
Live end-to-end (full stack, submodule mounted)
Notes for the reviewer
Related to Abilityai/trinity-enterprise#31
…ap (#1039)
OSS floor for the #1039 retention work (must land first; the enterprise `retention` module + Settings UI follow on the #847 seam).
Prod bug fix:
- docker-compose.prod.yml omitted LOG_RETENTION_DAYS / LOG_ARCHIVE_ENABLED / LOG_CLEANUP_HOUR from backend.environment. Prod launches standalone (no base-compose merge), so operator-set values never reached the container and retention silently fell back to the code default. Added the three lines.
Community 5-day floor (was: log 90 / exec-log 30 / exec-row 90 / health 7 / agent soft-delete 180 / schedule soft-delete 30):
- OPS_SETTINGS_DEFAULTS: the five operator-tunable windows default to 5.
- LOG_RETENTION_DAYS default 5 (log_archive_service, logs.py, docker-compose.yml).
- Audit log EXEMPT: keeps the 365-day integrity floor (audit_retention_service).
- New COMMUNITY_RETENTION_FLOOR_DAYS / RETENTION_OPS_KEYS constants (the enterprise module reuses these to clamp unentitled writes).
Read surface:
- GET /api/settings/retention (admin) reports the effective windows in use + the active edition (community vs enterprise via the `retention` entitlement) + documented precedence (enterprise → env → community-default).
OSS does NOT hard-clamp env/OPS; they remain an unsupported self-host escape hatch (per the issue: "not a cryptographic lock on a constant"). The clamp is the enterprise module's managed setter.
Tests: tests/unit/test_retention_floor.py (7): floor defaults, audit exemption, read-surface edition + windows. Cleanup/retention suites green (no test pinned the old defaults).
NOTE: this sharply shortens soft-delete recovery (agent 180→5, schedule 30→5) on community installs; call out in release notes. Enterprise restores it.
Related to #1039
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
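A minimal sketch of the floor defaults and the env fallback that the prod-compose bug exposed. The constant names come from the commit message and the key names from the PR description; the contents of `RETENTION_OPS_KEYS`, the integer value types, and the helper `log_retention_days` are assumptions for illustration, not the repository's actual code.

```python
import os
from typing import Mapping

# Constant names are from the commit message; their exact contents are assumed.
COMMUNITY_RETENTION_FLOOR_DAYS = 5
AUDIT_LOG_RETENTION_DAYS = 365  # audit log is exempt from the community floor

RETENTION_OPS_KEYS = (
    "execution_log_retention_days",
    "execution_row_retention_days",
    "health_check_retention_days",
    "agent_soft_delete_retention_days",
    "schedule_soft_delete_retention_days",
)

# Every operator-tunable window defaults to the 5-day community floor.
OPS_SETTINGS_DEFAULTS = {key: COMMUNITY_RETENTION_FLOOR_DAYS for key in RETENTION_OPS_KEYS}


def log_retention_days(environ: Mapping[str, str] = os.environ) -> int:
    """Hypothetical helper: read LOG_RETENTION_DAYS, else use the code default."""
    raw = environ.get("LOG_RETENTION_DAYS")
    return int(raw) if raw else COMMUNITY_RETENTION_FLOOR_DAYS


# If the compose file never forwards the variable, the container environment
# lacks it and the code default wins, whatever the operator set on the host.
container_env_before_fix: dict = {}
container_env_after_fix = {"LOG_RETENTION_DAYS": "30"}
print(log_retention_days(container_env_before_fix))
print(log_retention_days(container_env_after_fix))
```

The first call illustrates the silent fallback: nothing fails, the operator's value is simply never seen.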
… (#1039 Stage 3)
Adds a Retention tab to Settings (admin-only). Reads the OSS read surface
GET /api/settings/retention (available in every edition) and shows the effective
windows + an edition badge.
- Community: read-only windows at the fixed 5-day floor + an upgrade hint.
- Enterprise (retention entitlement present): editable per-class windows that
PUT /api/enterprise/retention/config and apply live (no restart). 0 disables
a sweep; sub-floor values are raised to the floor (mirrors the backend clamp).
- Audit-log window always shown, never editable (365-day integrity floor).
Gating reuses the existing enterprise store (enterpriseStore.isEntitled
('retention')); the tab is visible in both editions (read-only vs editable),
matching the issue's "community shows the fixed default + upgrade hint".
Related to #1039
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
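The clamp rule the tab mirrors (0 disables a sweep, sub-floor values are raised to the floor) can be sketched as below. This is written in Python to match the backend rather than the Vue component; `clamp_window` is a hypothetical name, and rejecting negative input is an assumption the commit message does not state.

```python
COMMUNITY_RETENTION_FLOOR_DAYS = 5


def clamp_window(days: int, floor: int = COMMUNITY_RETENTION_FLOOR_DAYS) -> int:
    """Hypothetical helper applying the clamp rule described in the commit message."""
    if days < 0:
        # Assumption: negative windows are invalid; the source does not say.
        raise ValueError("retention window must be >= 0")
    if days == 0:
        return 0  # 0 disables the sweep and is left untouched
    return max(days, floor)  # sub-floor values are raised to the floor


for requested in (0, 3, 5, 30):
    print(requested, "->", clamp_window(requested))
```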
…ollution (#1039)
The regression-diff CI gate flagged the 4 endpoint tests in test_retention_floor as new failures under pytest-randomly seeds 12345/67890 (but not 99999): classic test-ordering pollution, not a logic bug.
Root cause: the tests imported `routers.settings` (lazily), which drags routers/__init__ → routers.agents → `from services.agent_service import get_agents_by_prefix`. Another unit test (#612) loads services.agent_service under a fake sys.modules name, so under some orderings that import resolves to the partial fake module and raises `ImportError: cannot import name 'get_agents_by_prefix'`.
Fix: load routers/settings.py directly from file under a private module name (spec_from_file_location), bypassing routers/__init__ entirely. settings.py imports only models/database/dependencies/services.* (none of the polluted modules), so the load is robust regardless of collection order. Mirrors the existing conftest EntitlementCls pattern. Also pins LOG_/AUDIT_ env per call so a polluted process env can't leak in.
Verified: full unit suite under seeds 12345 + 67890. All 7 test_retention_floor tests pass; the only remaining failures are the 7 pre-existing base failures (git_pull_branch, orphaned_execution_recovery), unchanged by this PR.
Related to #1039
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
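The fix relies on the standard `importlib.util.spec_from_file_location` recipe. The sketch below reproduces the idea on a throwaway package whose `__init__.py` fails on import; the helper name `load_module_from_file`, the private module name, and the file contents are illustrative assumptions, not the test's actual code.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_file(path: Path, private_name: str) -> ModuleType:
    """Load one .py file under a private name without importing its package."""
    spec = importlib.util.spec_from_file_location(private_name, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {path}")
    module = importlib.util.module_from_spec(spec)
    # The module is deliberately not registered in sys.modules, so other tests
    # cannot pick it up or replace it.
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "routers"
    pkg.mkdir()
    # Stand-in for a package __init__ that breaks under some test orderings.
    (pkg / "__init__.py").write_text("raise ImportError('polluted import')\n")
    (pkg / "settings.py").write_text("RETENTION_FLOOR = 5\n")

    # The package __init__ never runs because the file is loaded as a
    # top-level module under a private name.
    settings = load_module_from_file(pkg / "settings.py", "_retention_test_settings")

print(settings.RETENTION_FLOOR)
```

This only works when the loaded file has no relative imports and none of its own imports touch the polluted modules, which is the property the commit message claims for settings.py.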
af8175a to b10a57b
Resolve by running |
…-oss-floor
# Conflicts:
# docs/memory/architecture.md
# src/frontend/src/views/Settings.vue
vybe left a comment
Validated via /validate-pr — one blocker. The OSS 5-day floor logic, the prod-compose LOG_* packaging fix, and the tests are all good.
The architecture.md hunk re-introduces enterprise-module disclosure that the just-merged trinity-enterprise#45 cleanup (the current top commit on dev) removed:
"Precedence is enterprise `retention` license (DB setting) → env → 5-day community default … the enterprise `retention` module is the managed setter."
Per the CLAUDE.md standing rule, public docs describe the generic open-core seam only — no named paid modules or per-module behavior. The guard CI doesn't catch this (its pattern is SIEM/SCIM/2FA/SSO only), so it would silently re-land.
Fix: genericize that one paragraph to the seam, e.g. "an optional entitled override via the entitlement registry (#847) → env → 5-day community default", without naming the enterprise retention module/license.
— posted via /validate-pr
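To illustrate why a keyword guard of that shape misses the hunk, here is a rough sketch. The regular expression and the function `flags_disclosure` are assumptions built only from the four terms named in the review; the real CI guard's implementation is not shown in this PR.

```python
import re

# Hypothetical guard: flags only the four terms the review says it covers.
GUARD_PATTERN = re.compile(r"\b(SIEM|SCIM|2FA|SSO)\b")


def flags_disclosure(text: str) -> bool:
    """Return True when the hypothetical guard would reject the text."""
    return GUARD_PATTERN.search(text) is not None


hunk = (
    "Precedence is enterprise `retention` license (DB setting) -> env -> "
    "5-day community default"
)

print(flags_disclosure(hunk))                      # the retention wording passes
print(flags_disclosure("Enterprise SSO module"))   # a listed term is caught
```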
…seam (#1039)
Review feedback (vybe via /validate-pr): the architecture.md hunk re-introduced named enterprise-module disclosure that the trinity-enterprise#45 cleanup removed: it named the enterprise `retention` module/license/entitlement as the managed setter. Per the CLAUDE.md standing rule, public docs describe the generic open-core seam only (the CI guard's pattern is SIEM/SCIM/2FA/SSO, so it didn't catch this).
Genericized both spots (the Cleanup Service sweeps precedence note and the `GET /api/settings/retention` row) to "an optional entitled override via the entitlement registry (#847) → env → 5-day community default", with no named module, license, or per-module behavior.
OSS floor logic, prod-compose LOG_* fix, and tests unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addressed the blocker in 33bf206. Genericized the retention precedence to the open-core seam in both spots — the Cleanup Service sweeps note and the
No named enterprise module / license / per-module behavior remains (verified by grep). OSS floor logic, the prod-compose |
Summary
Stage 1 of Abilityai/trinity-enterprise#31: the OSS floor (the issue's "must land first" chunk). The enterprise `retention` module (private submodule) + Settings → Retention UI follow as separate PRs on the #847 seam.
Prod bug fixed (silent no-op)
`docker-compose.prod.yml` omitted `LOG_RETENTION_DAYS` / `LOG_ARCHIVE_ENABLED` / `LOG_CLEANUP_HOUR` from `backend.environment:`. Prod launches standalone (`-f docker-compose.prod.yml`, no base-compose merge), so an operator-set `LOG_RETENTION_DAYS` never reached the container and retention always fell back to the code default. Added the three lines.
5-day community floor
Lowered the operator-tunable defaults to the 5-day community floor:
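| Setting | Previous default (days) | New default (days) |
| --- | --- | --- |
| `LOG_RETENTION_DAYS` | 90 | 5 |
| `execution_log_retention_days` | 30 | 5 |
| `execution_row_retention_days` | 90 | 5 |
| `health_check_retention_days` | 7 | 5 |
| `agent_soft_delete_retention_days` | 180 | 5 |
| `schedule_soft_delete_retention_days` | 30 | 5 |
| `audit_log_retention_days` | 365 | 365 (exempt) |
New `COMMUNITY_RETENTION_FLOOR_DAYS` / `RETENTION_OPS_KEYS` constants: the enterprise module reuses these to clamp unentitled writes.
Read surface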
`GET /api/settings/retention` (admin) reports the effective windows in use + the active edition (`community` vs `enterprise` via the `retention` entitlement) + documented precedence: enterprise (license) DB setting → env → 5-day community default.
Design note
OSS does not hard-clamp env/OPS values; they remain an unsupported self-host escape hatch (per the issue: "not a cryptographic lock on a constant"). The clamp-to-floor lives in the enterprise `retention` module's managed setter (Stage 2).
This sharply shortens soft-delete recovery on community installs (agent 180→5, schedule 30→5). Deliberate per the issue; an enterprise license restores longer windows.
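The documented precedence and the no-hard-clamp design note can be sketched as follows. Only the `edition` values and the precedence order come from the description; `resolve_window`, `retention_report`, the payload field names other than `edition`, and the choice to ignore an override on an unentitled install are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple, Union

COMMUNITY_RETENTION_FLOOR_DAYS = 5
AUDIT_LOG_RETENTION_DAYS = 365


def resolve_window(override_days: Optional[int], env_value: Optional[str]) -> Tuple[int, str]:
    """Return (days, source): entitled override, then env, then community default."""
    if override_days is not None:
        return override_days, "enterprise"
    if env_value:
        # OSS does not clamp this value: env stays an unsupported escape hatch.
        return int(env_value), "env"
    return COMMUNITY_RETENTION_FLOOR_DAYS, "community-default"


def retention_report(
    entitled: bool, override_days: Optional[int], env_value: Optional[str]
) -> Dict[str, Union[int, str]]:
    """Hypothetical shape for the read surface; field names are assumed."""
    # Assumption: an override only counts when the entitlement is present.
    days, source = resolve_window(override_days if entitled else None, env_value)
    return {
        "edition": "enterprise" if entitled else "community",
        "log_retention_days": days,
        "source": source,
        "audit_log_retention_days": AUDIT_LOG_RETENTION_DAYS,
    }


print(retention_report(entitled=False, override_days=None, env_value=None))
print(retention_report(entitled=False, override_days=None, env_value="2"))
print(retention_report(entitled=True, override_days=30, env_value="2"))
```

The second call shows the escape hatch: a sub-floor env value is reported as-is rather than raised to 5.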
Verification
- `tests/unit/test_retention_floor.py`: 7/7 (floor defaults, audit exemption, read-surface edition + windows, audit 365-floor, enterprise-set OPS window).
- Cleanup/retention suites green (`test_cleanup_inner_sweeps`, `test_execution_retention_prune`, `test_audit_retention_prune`, `test_agent_cleanup_parity`).
- `GET /api/settings/retention` → `edition: community`, OPS windows 5, audit 365; backend recreated with the new compose → `LOG_RETENTION_DAYS=5` flows through to the container.
Remaining Abilityai/trinity-enterprise#31 (follow-up PRs)
- `retention` module (private `trinity-enterprise`): `enterprise_retention_config` table on the two-track runner, `GET/PUT /api/enterprise/retention/*` gated by `requires_entitlement("retention")`, live-read write-through, clamp-to-floor when unentitled.
- `retention` entitlement; community shows the fixed 5-day default + upgrade hint.
Related to Abilityai/trinity-enterprise#31
🤖 Generated with Claude Code