Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -13,6 +13,7 @@ The format is intentionally lightweight and human-readable. Group entries by rel
- Added `GET /api/operator-events` plus operator-event metrics for update checks and helper-driven auto-update attempts
- Added dashboard cards and tables for operator-side update checks and apply attempts
- Added provider-health rollout guardrails so helper-driven auto-updates can block when gateway health is already degraded
- Added `update_check.release_channel` and `auto_update.rollout_ring` so operators can distinguish stable vs preview checks and tighter rollout rings

## v0.6.0 - 2026-03-12

5 changes: 5 additions & 0 deletions README.md
@@ -522,6 +522,7 @@ Supported fields in `update_check`:
- `api_base`
- `timeout_seconds`
- `check_interval_seconds`
- `release_channel`

Example:

@@ -532,6 +533,7 @@ update_check:
  api_base: "https://api.github.com"
  timeout_seconds: 5
  check_interval_seconds: 21600
  release_channel: "stable"
```

The status is exposed through `GET /api/update`, the dashboard, and the helper script `foundrygate-update-check`.
@@ -543,6 +545,7 @@ Supported fields in `auto_update`:

- `enabled`
- `allow_major`
- `rollout_ring`
- `require_healthy_providers`
- `max_unhealthy_providers`
- `apply_command`
@@ -553,6 +556,7 @@ Example:
auto_update:
  enabled: true
  allow_major: false
  rollout_ring: "early"
  require_healthy_providers: true
  max_unhealthy_providers: 0
  apply_command: "foundrygate-update"
@@ -564,6 +568,7 @@ What the current runtime does with it:
- shows the same state in the dashboard
- lets `foundrygate-auto-update --apply` run only when the current release state is eligible
- can block helper-driven rollout when provider health is already degraded
- lets operators separate `stable` vs `preview` release checks and `stable` / `early` / `canary` rollout rings

What it still does not do:

2 changes: 2 additions & 0 deletions config.yaml
@@ -881,13 +881,15 @@ update_check:
  api_base: "https://api.github.com"
  timeout_seconds: 5
  check_interval_seconds: 21600
  release_channel: "stable"

# ── Optional Auto-Update Enabler ────────────────────────────────────────────
# This does not make the API mutate the checkout. It only marks whether the
# current release status is eligible for a helper-driven update command.
auto_update:
  enabled: false
  allow_major: false
  rollout_ring: "early"
  require_healthy_providers: true
  max_unhealthy_providers: 0
  apply_command: "foundrygate-update"
2 changes: 1 addition & 1 deletion docs/FOUNDRYGATE-ROADMAP.md
@@ -361,7 +361,7 @@ Current baseline:
- local helper access via `foundrygate-update-check`
- opt-in eligibility reporting and helper-driven apply flow via `foundrygate-auto-update`

-This should remain opt-in and operationally conservative as it expands toward scheduled helper use, stronger rollout controls, and clearer operator approval boundaries.
+This should remain opt-in and operationally conservative as it expands toward scheduled helper use, stronger rollout controls, clearer operator approval boundaries, and small rollout-ring/channel distinctions.

### 7. Distribution channels

2 changes: 2 additions & 0 deletions docs/PUBLISHING.md
@@ -61,6 +61,8 @@ Publishing creates a tagged release. Applying that release on a host should rema
If you want scheduled update application:

- keep `auto_update.enabled: true` explicit in `config.yaml`
- keep `update_check.release_channel` on `stable` unless you intentionally want preview releases in the check path
- keep `auto_update.rollout_ring` on `stable` or `early` for normal environments; use `canary` only for faster adopters
- keep `allow_major: false` unless you are ready to absorb breaking changes automatically
- keep `require_healthy_providers: true` unless you are intentionally allowing rollouts while the gateway is degraded
- prefer the reviewed examples in [examples/foundrygate-auto-update.service](./examples/foundrygate-auto-update.service) and [examples/foundrygate-auto-update.timer](./examples/foundrygate-auto-update.timer)
12 changes: 12 additions & 0 deletions foundrygate/config.py
@@ -865,13 +865,18 @@ def _normalize_update_check(data: dict[str, Any]) -> dict[str, Any]:
    if check_interval_seconds <= 0:
        raise ConfigError("'update_check.check_interval_seconds' must be positive")

    release_channel = raw.get("release_channel", "stable")
    if release_channel not in {"stable", "preview"}:
        raise ConfigError("'update_check.release_channel' must be 'stable' or 'preview'")

    normalized = dict(data)
    normalized["update_check"] = {
        "enabled": enabled,
        "repository": repository.strip(),
        "api_base": api_base.strip().rstrip("/"),
        "timeout_seconds": float(timeout_seconds),
        "check_interval_seconds": check_interval_seconds,
        "release_channel": release_channel,
    }
    return normalized

@@ -892,6 +897,10 @@ def _normalize_auto_update(data: dict[str, Any]) -> dict[str, Any]:
    if not isinstance(allow_major, bool):
        raise ConfigError("'auto_update.allow_major' must be a boolean")

    rollout_ring = raw.get("rollout_ring", "early")
    if rollout_ring not in {"stable", "early", "canary"}:
        raise ConfigError("'auto_update.rollout_ring' must be 'stable', 'early', or 'canary'")

    require_healthy_providers = raw.get("require_healthy_providers", True)
    if not isinstance(require_healthy_providers, bool):
        raise ConfigError("'auto_update.require_healthy_providers' must be a boolean")
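
Review note: both new validation hunks in `foundrygate/config.py` follow the same pattern: read the key with a default, then reject anything outside a fixed set. The standalone sketch below illustrates that pattern. `validate_choice`, `RELEASE_CHANNELS`, `ROLLOUT_RINGS`, and the local `ConfigError` stand-in are hypothetical names and not part of this PR. The sketch also makes one assumption the diff does not: it adds an `isinstance(value, str)` guard, since a bare `value not in {...}` check raises `TypeError` rather than `ConfigError` when the YAML value is unhashable (for example a list).

```python
class ConfigError(ValueError):
    """Local stand-in for the project's ConfigError (assumed to be an Exception subclass)."""


# Allowed values as validated in the diff.
RELEASE_CHANNELS = frozenset({"stable", "preview"})
ROLLOUT_RINGS = frozenset({"stable", "early", "canary"})


def validate_choice(field: str, value: object, allowed: frozenset[str]) -> str:
    """Return value if it is one of the allowed strings, else raise ConfigError."""
    # The isinstance guard is an addition in this sketch, not in the PR.
    if not isinstance(value, str) or value not in allowed:
        options = ", ".join(repr(item) for item in sorted(allowed))
        raise ConfigError(f"'{field}' must be one of: {options}")
    return value


raw_update_check = {"release_channel": "preview"}
raw_auto_update = {}  # key omitted, so the default applies

channel = validate_choice(
    "update_check.release_channel",
    raw_update_check.get("release_channel", "stable"),
    RELEASE_CHANNELS,
)
ring = validate_choice(
    "auto_update.rollout_ring",
    raw_auto_update.get("rollout_ring", "early"),
    ROLLOUT_RINGS,
)
print(channel, ring)  # preview early
```

Whether the extra type guard is worth adding to the real normalizers is a judgment call for the maintainers; the defaults (`stable`, `early`) are taken from the diff.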
@@ -910,6 +919,7 @@ def _normalize_auto_update(data: dict[str, Any]) -> dict[str, Any]:
normalized["auto_update"] = {
"enabled": enabled,
"allow_major": allow_major,
"rollout_ring": rollout_ring,
"require_healthy_providers": require_healthy_providers,
"max_unhealthy_providers": max_unhealthy_providers,
"apply_command": apply_command.strip(),
@@ -991,6 +1001,7 @@ def update_check(self) -> dict:
"api_base": "https://api.github.com",
"timeout_seconds": 5.0,
"check_interval_seconds": 21600,
"release_channel": "stable",
},
)

@@ -1001,6 +1012,7 @@ def auto_update(self) -> dict:
{
"enabled": False,
"allow_major": False,
"rollout_ring": "early",
"require_healthy_providers": True,
"max_unhealthy_providers": 0,
"apply_command": "foundrygate-update",
3 changes: 2 additions & 1 deletion foundrygate/main.py
@@ -632,6 +632,7 @@ async def lifespan(app: FastAPI):
api_base=str(_config.update_check.get("api_base", "https://api.github.com")),
check_interval_seconds=int(_config.update_check.get("check_interval_seconds", 21600)),
timeout_seconds=float(_config.update_check.get("timeout_seconds", 5.0)),
release_channel=str(_config.update_check.get("release_channel", "stable")),
auto_update=_config.auto_update,
)

@@ -1639,7 +1640,7 @@ def main():
<div class="card"><div class="label">Healthy Providers</div><div class="value">${healthyProviders}/${providers.length}</div><div class="detail">${unhealthyProviders} unhealthy</div></div>
<div class="card"><div class="label">Capability Coverage</div><div class="value">${coverageEntries.length}</div><div class="detail">${coverageEntries.map(([name]) => name).slice(0,3).join(', ') || 'none'}</div></div>
<div class="card"><div class="label">Top Modality</div><div class="value">${esc(topModality)}</div><div class="detail">${modalityRows.length} modality groups</div></div>
-<div class="card"><div class="label">Release Status</div><div class="value ${(update.alert_level === 'critical' || update.alert_level === 'warning') ? 'err' : update.update_available ? 'cost' : ''}">${esc(update.latest_version || update.current_version || 'n/a')}</div><div class="detail">${update.enabled ? (update.status === 'ok' ? `${esc(update.update_type || 'current')} / ${esc(update.recommended_action || (update.update_available ? 'Upgrade recommended' : 'No action needed'))}${update.auto_update && update.auto_update.enabled ? ` / auto: ${esc(update.auto_update.eligible ? 'eligible' : (update.auto_update.blocked_reason || 'blocked'))}` : ''}` : esc(update.recommended_action || 'Update check unavailable')) : 'Update checks disabled'}</div></div>
+<div class="card"><div class="label">Release Status</div><div class="value ${(update.alert_level === 'critical' || update.alert_level === 'warning') ? 'err' : update.update_available ? 'cost' : ''}">${esc(update.latest_version || update.current_version || 'n/a')}</div><div class="detail">${update.enabled ? (update.status === 'ok' ? `${esc(update.release_channel || 'stable')} / ${esc(update.update_type || 'current')} / ${esc(update.recommended_action || (update.update_available ? 'Upgrade recommended' : 'No action needed'))}${update.auto_update && update.auto_update.enabled ? ` / ring: ${esc(update.auto_update.rollout_ring || 'early')} / auto: ${esc(update.auto_update.eligible ? 'eligible' : (update.auto_update.blocked_reason || 'blocked'))}` : ''}` : esc(update.recommended_action || 'Update check unavailable')) : 'Update checks disabled'}</div></div>
<div class="card"><div class="label">Operator Actions</div><div class="value">${fmtTok((operatorEvents.events || []).length)}</div><div class="detail">${latestOperatorEvent ? `${esc(latestOperatorEvent.action || 'update-check')} / ${esc(latestOperatorEvent.status || 'unknown')}` : 'No recent operator events'}</div></div>
`;
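
Review note: the nested ternaries in the "Release Status" card are hard to read inline. The sketch below restates the new detail-line logic as a plain Python function so the fallbacks are easier to check. `release_detail` is a hypothetical name, the input is assumed to be the dict shape returned by `GET /api/update`, and HTML escaping (`esc` in the template) is deliberately left out.

```python
def release_detail(update: dict) -> str:
    """Build the dashboard detail line from an update-status dict (sketch, no escaping)."""
    if not update.get("enabled"):
        return "Update checks disabled"
    if update.get("status") != "ok":
        return update.get("recommended_action") or "Update check unavailable"

    fallback_action = (
        "Upgrade recommended" if update.get("update_available") else "No action needed"
    )
    parts = [
        update.get("release_channel") or "stable",  # new in this PR
        update.get("update_type") or "current",
        update.get("recommended_action") or fallback_action,
    ]

    auto = update.get("auto_update") or {}
    if auto.get("enabled"):
        parts.append(f"ring: {auto.get('rollout_ring') or 'early'}")  # new in this PR
        state = "eligible" if auto.get("eligible") else (auto.get("blocked_reason") or "blocked")
        parts.append(f"auto: {state}")
    return " / ".join(parts)


sample = {
    "enabled": True,
    "status": "ok",
    "release_channel": "preview",
    "update_type": "minor",
    "update_available": True,
    "auto_update": {"enabled": True, "rollout_ring": "canary", "eligible": True},
}
print(release_detail(sample))
# preview / minor / Upgrade recommended / ring: canary / auto: eligible
```

The sample values are made up; only the field names and fallback strings come from the template in this hunk.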

70 changes: 65 additions & 5 deletions foundrygate/updates.py
@@ -74,6 +74,34 @@ def alert_level_for_update(update_type: str, *, available: bool, status: str) ->
return "warning"


def allowed_update_types_for_ring(rollout_ring: str, *, allow_major: bool) -> list[str]:
    """Return the allowed update types for one rollout ring."""
    if rollout_ring == "stable":
        allowed = ["patch"]
    elif rollout_ring == "canary":
        allowed = ["patch", "minor"]
    else:
        allowed = ["patch", "minor"]

    if allow_major and rollout_ring == "canary":
        allowed.append("major")
    return allowed
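
Review note: the ring-to-update-type mapping added here can be tabulated with a short script. The function below is a condensed but behavior-equivalent copy of `allowed_update_types_for_ring` from this hunk; the loop and printed layout are illustrative only. One thing the table makes visible: `allow_major` now adds `major` only for the `canary` ring, whereas the code this PR replaces in `_auto_update_status` appended it whenever `allow_major` was true.

```python
def allowed_update_types_for_ring(rollout_ring: str, *, allow_major: bool) -> list[str]:
    """Condensed copy of the function in this hunk (same results for every input)."""
    allowed = ["patch"] if rollout_ring == "stable" else ["patch", "minor"]
    if allow_major and rollout_ring == "canary":
        allowed.append("major")
    return allowed


# Print every ring / allow_major combination accepted by the config validation.
for ring in ("stable", "early", "canary"):
    for allow_major in (False, True):
        types = allowed_update_types_for_ring(ring, allow_major=allow_major)
        print(f"{ring:<7} allow_major={allow_major!s:<5} -> {types}")
```

With `rollout_ring: "early"` (the default) and `allow_major: true`, the result is still `["patch", "minor"]`, which may be worth calling out in the README next to `allow_major`.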


def select_release_payload(payload: Any, *, release_channel: str) -> dict[str, Any]:
    """Select one release object from the GitHub API payload."""
    if release_channel == "preview":
        if not isinstance(payload, list):
            return {}
        for item in payload:
            if isinstance(item, dict) and not item.get("draft"):
                return item
        return {}
    if isinstance(payload, dict):
        return payload
    return {}
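
Review note: a small worked example of how the channel changes both the request URL and the payload shape. `select_release_payload` is copied from this hunk; `release_lookup_url` is a hypothetical helper that mirrors the URL choice made later in `get_status`, and the repository name and release tags are made up. The sketch assumes the releases list arrives newest first, so on `preview` the first non-draft entry wins even if it is a regular release rather than a prerelease.

```python
from typing import Any


def select_release_payload(payload: Any, *, release_channel: str) -> dict[str, Any]:
    """Copy of the function added in this hunk."""
    if release_channel == "preview":
        if not isinstance(payload, list):
            return {}
        for item in payload:
            if isinstance(item, dict) and not item.get("draft"):
                return item
        return {}
    if isinstance(payload, dict):
        return payload
    return {}


def release_lookup_url(api_base: str, repository: str, release_channel: str) -> str:
    """Hypothetical helper mirroring the URL choice made in get_status."""
    base = f"{api_base.rstrip('/')}/repos/{repository}"
    if release_channel == "preview":
        return f"{base}/releases?per_page=10"
    return f"{base}/releases/latest"


# Made-up listing in the shape of a releases list response, newest first.
listing = [
    {"tag_name": "v0.7.0-rc2", "draft": True, "prerelease": True},
    {"tag_name": "v0.7.0-rc1", "draft": False, "prerelease": True},
    {"tag_name": "v0.6.0", "draft": False, "prerelease": False},
]

preview_pick = select_release_payload(listing, release_channel="preview")
stable_pick = select_release_payload(listing[2], release_channel="stable")
mismatch = select_release_payload(listing, release_channel="stable")

print(preview_pick["tag_name"])  # v0.7.0-rc1
print(stable_pick["tag_name"])   # v0.6.0
print(mismatch)                  # {}
print(release_lookup_url("https://api.github.com", "example-org/example-repo", "preview"))
```

A payload whose shape does not match the channel yields `{}`, which `get_status` then reports as "No release found for the selected channel".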


def apply_auto_update_guardrails(
auto_update: dict[str, Any],
*,
@@ -118,6 +146,7 @@ class UpdateStatus:
release_url: str = ""
checked_at: float = 0.0
status: str = "disabled"
release_channel: str = "stable"
update_type: str = "current"
alert_level: str = "disabled"
recommended_action: str = ""
@@ -134,6 +163,7 @@ def to_dict(self) -> dict[str, Any]:
"release_url": self.release_url,
"checked_at": self.checked_at,
"status": self.status,
"release_channel": self.release_channel,
"update_type": self.update_type,
"alert_level": self.alert_level,
"recommended_action": self.recommended_action,
@@ -154,6 +184,7 @@ def __init__(
api_base: str = "https://api.github.com",
check_interval_seconds: int = 21600,
timeout_seconds: float = 5.0,
release_channel: str = "stable",
auto_update: dict[str, Any] | None = None,
):
self.current_version = current_version
@@ -162,9 +193,11 @@ def __init__(
self.api_base = api_base.rstrip("/")
self.check_interval_seconds = check_interval_seconds
self.timeout_seconds = timeout_seconds
self.release_channel = release_channel
self.auto_update = {
"enabled": bool((auto_update or {}).get("enabled", False)),
"allow_major": bool((auto_update or {}).get("allow_major", False)),
"rollout_ring": str((auto_update or {}).get("rollout_ring", "early")),
"require_healthy_providers": bool(
(auto_update or {}).get("require_healthy_providers", True)
),
@@ -175,6 +208,7 @@ def __init__(
enabled=enabled,
current_version=current_version,
repository=repository,
release_channel=release_channel,
)
self._client = httpx.AsyncClient(
timeout=httpx.Timeout(timeout_seconds, connect=min(timeout_seconds, 5.0)),
@@ -198,10 +232,9 @@ def _auto_update_status(
"""Return opt-in auto-update eligibility for operator tooling."""
         enabled = bool(self.auto_update.get("enabled", False))
         allow_major = bool(self.auto_update.get("allow_major", False))
+        rollout_ring = str(self.auto_update.get("rollout_ring", "early"))
         apply_command = str(self.auto_update.get("apply_command", "foundrygate-update"))
-        allowed_types = ["patch", "minor"]
-        if allow_major:
-            allowed_types.append("major")
+        allowed_types = allowed_update_types_for_ring(rollout_ring, allow_major=allow_major)

         blocked_reason = ""
         eligible = False
@@ -223,6 +256,7 @@ def _auto_update_status(
"strategy": "script",
"allowed_update_types": allowed_types,
"allow_major": allow_major,
"rollout_ring": rollout_ring,
"require_healthy_providers": bool(
self.auto_update.get("require_healthy_providers", True)
),
@@ -243,6 +277,7 @@ async def get_status(self, *, force: bool = False) -> UpdateStatus:
repository=self.repository,
checked_at=time.time(),
status="disabled",
release_channel=self.release_channel,
update_type="current",
alert_level="disabled",
recommended_action="Update checks are disabled",
@@ -263,7 +298,10 @@ async def get_status(self, *, force: bool = False) -> UpdateStatus:
         ):
             return self._cached

-        url = f"{self.api_base}/repos/{self.repository}/releases/latest"
+        if self.release_channel == "preview":
+            url = f"{self.api_base}/repos/{self.repository}/releases?per_page=10"
+        else:
+            url = f"{self.api_base}/repos/{self.repository}/releases/latest"
         try:
             response = await self._client.get(url)
             if response.status_code >= 400:
@@ -273,6 +311,7 @@ async def get_status(self, *, force: bool = False) -> UpdateStatus:
repository=self.repository,
checked_at=now,
status="unavailable",
release_channel=self.release_channel,
update_type="unknown",
alert_level="warning",
recommended_action="Inspect release connectivity and retry later",
@@ -285,9 +324,28 @@ async def get_status(self, *, force: bool = False) -> UpdateStatus:
                 )
                 return self._cached

-            payload = response.json()
+            payload = select_release_payload(response.json(), release_channel=self.release_channel)
             latest_version = str(payload.get("tag_name") or "").strip()
             release_url = str(payload.get("html_url") or "").strip()
+            if not latest_version:
+                self._cached = UpdateStatus(
+                    enabled=True,
+                    current_version=self.current_version,
+                    repository=self.repository,
+                    checked_at=now,
+                    status="unavailable",
+                    release_channel=self.release_channel,
+                    update_type="unknown",
+                    alert_level="warning",
+                    recommended_action="Inspect release connectivity and retry later",
+                    auto_update=self._auto_update_status(
+                        status="unavailable",
+                        update_available=False,
+                        update_type="unknown",
+                    ),
+                    error="No release found for the selected channel",
+                )
+                return self._cached
             update_available = is_update_available(self.current_version, latest_version)
             update_type = classify_update(self.current_version, latest_version)
             alert_level = alert_level_for_update(
@@ -304,6 +362,7 @@ async def get_status(self, *, force: bool = False) -> UpdateStatus:
release_url=release_url,
checked_at=now,
status="ok",
release_channel=self.release_channel,
update_type=update_type,
alert_level=alert_level,
recommended_action=(
@@ -324,6 +383,7 @@ async def get_status(self, *, force: bool = False) -> UpdateStatus:
repository=self.repository,
checked_at=now,
status="unavailable",
release_channel=self.release_channel,
update_type="unknown",
alert_level="warning",
recommended_action="Inspect release connectivity and retry later",
6 changes: 6 additions & 0 deletions tests/test_config.py
@@ -87,6 +87,12 @@ def test_auto_update_defaults_are_exposed():
    cfg = load_config(Path(__file__).parent.parent / "config.yaml")
    assert cfg.auto_update["enabled"] is False
    assert cfg.auto_update["allow_major"] is False
    assert cfg.auto_update["rollout_ring"] == "early"
    assert cfg.auto_update["require_healthy_providers"] is True
    assert cfg.auto_update["max_unhealthy_providers"] == 0
    assert cfg.auto_update["apply_command"] == "foundrygate-update"


def test_update_check_defaults_include_stable_release_channel():
    cfg = load_config(Path(__file__).parent.parent / "config.yaml")
    assert cfg.update_check["release_channel"] == "stable"