Fix/terra alignment by JustAGhosT · Pull Request #69 · phoenixvc/sluice

JustAGhosT · 2026-03-15T12:26:13Z

Pull Request Checklist

Summary

What changed?
Why was it needed?

Validation

Local checks run (if applicable)
Relevant workflow/jobs observed

Deployment Notes

No environment/config changes required
Environment/config changes required (describe below)

Staging Toggle (PRs to `main`)

Add label run-staging to this PR to enable staging deployment (deploy-staging).
Remove label run-staging to skip staging deployment.

Risk / Rollback

Risk level: low / medium / high
Rollback plan:

Summary by CodeRabbit

Documentation
- Added Prometheus telemetry and retention guidance, new API routing docs for Azure OpenAI via LiteLLM, Webhook Auth in gateway docs, multiple formatting/clarity updates across architecture references, and a new AGENTS guide.
Improvements
- Enhanced deployment workflows with per-environment settings and environment selector, configurable smoke-test attempt count, expanded optional state/dashboard and monitoring configuration, and support for mapping additional credentials/environment variables.

- Add C3 connection in container architecture flowchart - Rename "Azure Monitor" to "Application Insights" in telemetry diagram - Add Prometheus as a telemetry sink with implementation details - Document retention policies for Application Insights - Update matrix gateway JSON example with new field names - Fix confidence threshold in matrix-gateway from 0.70 to 0.75 - Update SOP suggestion threshold from 0.78 to 0.8 - Fix code block formatting in multiple files

Add environment input parameter to deploy-environment.yaml workflow to specify GitHub environment, improving deployment control and security. Replace hardcoded environment settings in deploy.yaml with the new parameter. Also fix code fences in documentation to use text format and update various documentation details.

…ttempts and update default value

…ttempts and update default value # Pull Request Checklist ## Summary - What changed? - Why was it needed? ## Validation - [ ] Local checks run (if applicable) - [ ] Relevant workflow/jobs observed ## Deployment Notes - [ ] No environment/config changes required - [ ] Environment/config changes required (describe below) ## UAT Toggle (PRs to `main`) - Add label `run-uat` to this PR to enable UAT deployment (`deploy-uat`). - Remove label `run-uat` to skip UAT deployment. ## Risk / Rollback - Risk level: low / medium / high - Rollback plan:

blocksorg · 2026-03-15T12:26:16Z

Mention Blocks like a regular teammate with your question or request:

@blocks review this pull request
@blocks make the following changes ...
@blocks create an issue from what was mentioned in the following comment ...
@blocks explain the following code ...
@blocks are there any security or performance concerns?

Run @blocks /help for more information.

Workspace settings | Disable this message

chatgpt-codex-connector · 2026-03-15T12:26:18Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

coderabbitai · 2026-03-15T12:26:29Z

📝 Walkthrough

Walkthrough

Updates CI/CD workflows to add new inputs, per-job environments, secrets, and TF_VAR mappings (Azure auth, Terraform backend, state/dashboard images, Grafana, EXPECTED_AOAI_ENDPOINT_HOST), renames smoke test wait parameter, and applies related infra and documentation adjustments.

Changes

Cohort / File(s)	Summary
GitHub Actions: deploy workflows `\.github/workflows/deploy-environment.yaml`, `\.github/workflows/deploy.yaml`	Added inputs (`smoke_models_wait_attempts`, `environment`, state/dashboard TF_VARs), new per-job secrets (AZURE_, TF_BACKEND_, EXPECTED_AOAI_ENDPOINT_HOST, STATE_SERVICE_*, DASHBOARD/GRAFANA), mapped env vars/TF_VARs, set job-level environment, and updated smoke-test step to use the new attempts input.
Infra — Application Insights output `infra/modules/aigateway_aca/outputs.tf`	Replaced sensitive `application_insights_connection_string` output with non-sensitive `application_insights_name`; guidance updated to retrieve connection string from Key Vault.
Infra — state/dashboard token trim change `infra/modules/dashboard_aca/main.tf`, `infra/modules/state_service_aca/main.tf`	Changed whitespace trimming from `trim` to `trimspace` for `use_shared_token` checks, altering inclusion logic for shared-token secrets/env.
Infra env vars `infra/env/dev/terraform.tfvars`	Added `state_service_container_image`, `state_service_registry_username`, `state_service_registry_password` assignments for dev environment.
Docs — container & observability `docs/architecture/02-container-architecture.md`, `docs/architecture/04-observability-telemetry.md`	Added Webhook Auth node/edges to gateway diagram; introduced Prometheus as a telemetry sink with /metrics callbacks and retention guidance for Application Insights (Prod 90d, Non-prod 30d).
Docs — systems & routing `docs/architecture/systems/*`, `docs/architecture/systems/ai-gateway.md`	Standardized code fence language to `text` and added a v1 API routing section describing Azure OpenAI via LiteLLM and routing rules for /v1/responses and /v1/embeddings.
Docs — reference & policies `docs/architecture/reference/*`, `docs/architecture/reference/strategic/07-deployment-model.md`	Updated example payload fields (`request_id`, `label`, `recommended_tier`), added migration note, tightened thresholds (0.70→0.75, 0.78→0.80), expanded fallback pattern, and converted code fences to `text`.
Docs — planning `docs/planning/request_to_token_attribution.md`	Terminology and labeling updates (Required→Recommended, Method A → Preferred), clarified Application Insights wiring via Key Vault, and updated action items.
Docs — minor formatting/others `docs/architecture/reference/slm-`, `docs/architecture/systems/`, `docs/architecture/systems/codeflow-engine.md`	Converted several code fences to `text`, small formatting tweaks, and minor content adjustments (escaping braces, wording changes).
Ops guide `AGENTS.md`	Added AGENTS.md with developer guidance for AI coding agents, tech stack, build/test commands, and style conventions.

Sequence Diagram(s)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Fix/grafana env local #54: Overlaps on dashboard/Grafana and state_service wiring (TF_VARs and workflow dashboard steps).
Feat/comprehensive cicd multi env 810071297942926866 #9: Related edits to GitHub Actions deploy workflows and per-environment inputs/secrets.
feat: add Azure OpenAI endpoint validation and deployment discovery to smoke test #23: Adds/uses EXPECTED_AOAI_ENDPOINT_HOST and smoke-test AOAI validation—directly connected to the new secret here.

Poem

🐇 I hop through YAML, neat and spry,
Secrets tucked where pipelines lie,
Grafana hums and metrics chime,
Key Vault keeps the strings in time,
I nibble bugs and push — goodbye! 🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description only contains an empty template with no substantive content answering what changed, why it was needed, or addressing validation and risk assessment.	Fill in the checklist sections with concrete details about what changed, why changes were needed, validation steps performed, and risk assessment for the deployment.
Title check	❓ Inconclusive	The title 'Fix/terra alignment' is vague and uses generic phrasing that doesn't clearly convey what specific alignment issue is being fixed.	Provide a more specific and descriptive title that clearly identifies the main change (e.g., 'Align Terraform configuration and GitHub workflow variables' or similar).

✅ Passed checks (1 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/terra-alignment

📝 Coding Plan

Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

docs/architecture/reference/matrix-gateway.md (1)
48-49: Document migration impact for renamed response fields.

recommended_tier/cacheable look consistent here, but this is a contract change for clients. Please add a short migration note (old → new fields, deprecation window/version) in this doc section to reduce integration risk.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/reference/matrix-gateway.md` around lines 48 - 49, Add a
short migration note in the Matrix Gateway reference section describing the
rename/contract change for the response fields: explicitly state the old field
names (e.g., if any) and the new fields recommended_tier and cacheable, show
example mapping (old → new), and declare the deprecation window/version and
recommended upgrade steps for clients; update the section near the response
schema so consumers know the change, timeline, and how to handle backwards
compatibility (e.g., fallback logic or versioned endpoints).
docs/architecture/02-container-architecture.md (1)
60-60: Model webhook verification explicitly in the flow.

Line 60 introduces direct GitHub Webhook ingress; consider adding a distinct “signature verification/auth” step in the diagram so trust boundaries are explicit.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/02-container-architecture.md` at line 60, The diagram
currently connects C3 directly to G1 via "C3 --> G1"; insert an explicit
verification/auth step between them (e.g., create a node like "W_Verify" or
"WebhookAuth" and change the flow to "C3 --> WebhookAuth --> G1") and label that
node to indicate signature validation/authorization so the trust boundary for
GitHub webhook ingress is clear in the diagram and documentation.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/deploy-environment.yaml:
- Around line 31-35: Add a missing input declaration for smoke_models_wait_sleep
to the reusable workflow inputs so inputs.smoke_models_wait_sleep is defined;
specifically add an input block named smoke_models_wait_sleep (e.g., required:
false, type: string, default: "10", description: Sleep seconds between model
availability checks) alongside the existing smoke_models_wait_attempts
declaration so callers that pass smoke_models_wait_sleep continue to work.

In `@docs/architecture/04-observability-telemetry.md`:
- Line 79: The description reverses Prometheus' role: update the sentence that
says "The Prometheus exporter exposes a `/metrics` endpoint that scrapes
application metrics" to indicate that Prometheus scrapes the `/metrics` endpoint
(e.g., "The Prometheus exporter exposes a `/metrics` endpoint which is scraped
by Prometheus") and ensure the surrounding text still references
`success_callback`/`failure_callback` containing "prometheus" and the container
config at `infra/modules/aigateway_aca/main.tf:95-113`.

---

Nitpick comments:
In `@docs/architecture/02-container-architecture.md`:
- Line 60: The diagram currently connects C3 directly to G1 via "C3 --> G1";
insert an explicit verification/auth step between them (e.g., create a node like
"W_Verify" or "WebhookAuth" and change the flow to "C3 --> WebhookAuth --> G1")
and label that node to indicate signature validation/authorization so the trust
boundary for GitHub webhook ingress is clear in the diagram and documentation.

In `@docs/architecture/reference/matrix-gateway.md`:
- Around line 48-49: Add a short migration note in the Matrix Gateway reference
section describing the rename/contract change for the response fields:
explicitly state the old field names (e.g., if any) and the new fields
recommended_tier and cacheable, show example mapping (old → new), and declare
the deprecation window/version and recommended upgrade steps for clients; update
the section near the response schema so consumers know the change, timeline, and
how to handle backwards compatibility (e.g., fallback logic or versioned
endpoints).

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e51ac272-e68f-4e8e-a52c-d8a36251d57f

📥 Commits

Reviewing files that changed from the base of the PR and between 2bc7860 and 0e4fff2.

📒 Files selected for processing (16)

.github/workflows/deploy-environment.yaml
.github/workflows/deploy.yaml
docs/architecture/02-container-architecture.md
docs/architecture/04-observability-telemetry.md
docs/architecture/reference/matrix-gateway.md
docs/architecture/reference/matrix-rooivalk.md
docs/architecture/reference/slm-implementation-matrix.md
docs/architecture/reference/slm-management-plan.md
docs/architecture/reference/strategic/07-deployment-model.md
docs/architecture/systems/agentkit-forge.md
docs/architecture/systems/ai-gateway.md
docs/architecture/systems/codeflow-engine.md
docs/architecture/systems/cognitive-mesh.md
docs/architecture/systems/phoenix-rooivalk.md
docs/planning/request_to_token_attribution.md
infra/modules/aigateway_aca/outputs.tf

.github/workflows/deploy-environment.yaml

docs/architecture/04-observability-telemetry.md

- Add new workflow parameter to control sleep between model availability checks - Update container architecture diagram to include Webhook Auth component - Improve Prometheus metrics documentation clarity - Add migration note for Matrix Gateway response contract changes in v1.0.0

coderabbitai

🧹 Nitpick comments (4)

infra/modules/state_service_aca/main.tf (1)
46-52: Consider trimming the secret value for consistency.

The use_shared_token check now correctly uses trimspace() to detect non-empty tokens, but the secret value itself (line 50) still stores the raw var.state_service_shared_token which may contain trailing newlines. This could cause authentication mismatches if the consuming service performs a strict comparison.
♻️ Proposed fix to trim the secret value
   dynamic "secret" {
     for_each = local.use_shared_token ? [1] : []
     content {
       name  = "state-service-shared-token"
-      value = var.state_service_shared_token
+      value = trimspace(var.state_service_shared_token)
     }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/modules/state_service_aca/main.tf` around lines 46 - 52, The secret
value in the dynamic "secret" block uses the raw var.state_service_shared_token
which may include trailing whitespace/newlines; update the content.value to pass
the variable through trimspace (i.e., use
trimspace(var.state_service_shared_token)) so the stored secret is normalized,
matching the earlier use of trimspace() in local.use_shared_token and avoiding
auth mismatches.
infra/modules/dashboard_aca/main.tf (1)
37-43: Same concern: trim the secret value for consistency.

As noted in infra/modules/state_service_aca/main.tf, the secret value should also be trimmed to avoid storing whitespace-padded tokens.
♻️ Proposed fix to trim the secret value
   dynamic "secret" {
     for_each = local.use_shared_token ? [1] : []
     content {
       name  = "state-service-shared-token"
-      value = var.state_service_shared_token
+      value = trimspace(var.state_service_shared_token)
     }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/modules/dashboard_aca/main.tf` around lines 37 - 43, The dynamic
"secret" block that creates the "state-service-shared-token" currently uses
value = var.state_service_shared_token and may store whitespace-padded tokens;
update the content to use Terraform's trimspace function (value =
trimspace(var.state_service_shared_token)) so the stored secret is normalized;
this change should be made inside the dynamic "secret" block that is conditional
on local.use_shared_token to match the fix in the state_service_aca module.
.github/workflows/deploy-environment.yaml (2)
109-109: Hardcoded secrets expiration date may require future maintenance.

The value "2027-03-31T00:00:00Z" is hardcoded in the workflow. While currently valid (per the Terraform validation in infra/env/dev/variables.tf:88-95), this will require a workflow update before March 2027 to avoid deployment failures when the date is no longer in the future.

Consider parameterizing this as an input or secret to allow environment-specific configuration without workflow changes.
♻️ Suggested parameterization
+      secrets_expiration_date:
+        required: false
+        type: string
+        default: "2027-03-31T00:00:00Z"
+        description: Expiration date for Key Vault secrets (ISO-8601 UTC format)
...
-  TF_VAR_secrets_expiration_date: "2027-03-31T00:00:00Z"
+  TF_VAR_secrets_expiration_date: ${{ inputs.secrets_expiration_date }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/deploy-environment.yaml at line 109, The workflow
hardcodes TF_VAR_secrets_expiration_date which will become invalid in 2027;
change it to a configurable input or secret so it can be updated without editing
the workflow: replace the literal "2027-03-31T00:00:00Z" with a GitHub Actions
input (e.g., workflow_dispatch/input) or a repo/organization secret and fall
back to a computed value if not provided; update any usages referencing
TF_VAR_secrets_expiration_date accordingly so the workflow reads the value from
inputs or secrets instead of the hardcoded string.
110-110: Consider pinning dashboard image version in fallback.

The fallback ghcr.io/phoenixvc/ai-gateway-dashboard:latest uses the :latest tag, which can lead to unpredictable behavior if the image changes between deployments. While this is only used when DASHBOARD_CONTAINER_IMAGE secret is not configured, consider using a pinned version or SHA digest for reproducibility.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/deploy-environment.yaml at line 110, The fallback for
TF_VAR_dashboard_container_image currently uses the mutable tag ':latest';
update the fallback value used when DASHBOARD_CONTAINER_IMAGE is not provided to
a pinned image tag or SHA digest (e.g., replace
'ghcr.io/phoenixvc/ai-gateway-dashboard:latest' with a specific version tag or
`@sha256` digest) so deployments are reproducible; modify the
TF_VAR_dashboard_container_image assignment in the workflow to reference the
pinned tag/digest instead of ':latest' while keeping the secret override via
DASHBOARD_CONTAINER_IMAGE.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.github/workflows/deploy-environment.yaml:
- Line 109: The workflow hardcodes TF_VAR_secrets_expiration_date which will
become invalid in 2027; change it to a configurable input or secret so it can be
updated without editing the workflow: replace the literal "2027-03-31T00:00:00Z"
with a GitHub Actions input (e.g., workflow_dispatch/input) or a
repo/organization secret and fall back to a computed value if not provided;
update any usages referencing TF_VAR_secrets_expiration_date accordingly so the
workflow reads the value from inputs or secrets instead of the hardcoded string.
- Line 110: The fallback for TF_VAR_dashboard_container_image currently uses the
mutable tag ':latest'; update the fallback value used when
DASHBOARD_CONTAINER_IMAGE is not provided to a pinned image tag or SHA digest
(e.g., replace 'ghcr.io/phoenixvc/ai-gateway-dashboard:latest' with a specific
version tag or `@sha256` digest) so deployments are reproducible; modify the
TF_VAR_dashboard_container_image assignment in the workflow to reference the
pinned tag/digest instead of ':latest' while keeping the secret override via
DASHBOARD_CONTAINER_IMAGE.

In `@infra/modules/dashboard_aca/main.tf`:
- Around line 37-43: The dynamic "secret" block that creates the
"state-service-shared-token" currently uses value =
var.state_service_shared_token and may store whitespace-padded tokens; update
the content to use Terraform's trimspace function (value =
trimspace(var.state_service_shared_token)) so the stored secret is normalized;
this change should be made inside the dynamic "secret" block that is conditional
on local.use_shared_token to match the fix in the state_service_aca module.

In `@infra/modules/state_service_aca/main.tf`:
- Around line 46-52: The secret value in the dynamic "secret" block uses the raw
var.state_service_shared_token which may include trailing whitespace/newlines;
update the content.value to pass the variable through trimspace (i.e., use
trimspace(var.state_service_shared_token)) so the stored secret is normalized,
matching the earlier use of trimspace() in local.use_shared_token and avoiding
auth mismatches.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cfbc39a6-c9cd-4b1a-819a-6b28a9ebaa94

📥 Commits

Reviewing files that changed from the base of the PR and between 0e4fff2 and 19759df.

📒 Files selected for processing (6)

.github/workflows/deploy-environment.yaml
docs/architecture/02-container-architecture.md
docs/architecture/04-observability-telemetry.md
docs/architecture/reference/matrix-gateway.md
infra/modules/dashboard_aca/main.tf
infra/modules/state_service_aca/main.tf

🚧 Files skipped from review as they are similar to previous changes (2)

docs/architecture/reference/matrix-gateway.md
docs/architecture/02-container-architecture.md

Add comprehensive documentation for AI coding agents working in this repository, including: - Project overview and tech stack - Build/lint/test commands for each component - Code style guidelines for Python, JavaScript, and Terraform - Architecture overview and key files - Prerequisites and pre-commit checks

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@AGENTS.md`:
- Around line 186-197: The fenced code block containing the directory tree
starting with "docs/architecture/" is missing a language identifier and will
fail MD040; edit the block in AGENTS.md that begins with the triple backticks
before "docs/architecture/" and add a language tag (e.g., text) so the fence
reads ```text, leaving the directory content unchanged, then save the file to
satisfy the linter.
- Around line 78-79: The guidance contradicts itself: the sentence "Use absolute
imports within packages" conflicts with the example that uses a relative import
(from .routes import router). Update AGENTS.md so the rule and example
match—either change the rule to recommend relative imports within packages and
keep the example referencing .routes and router, or replace the example with an
absolute import form that references the package-level routes module and router
symbol; ensure the grouping guideline (stdlib → third-party → local) remains
consistent with the chosen import style.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d30d3a92-61be-43e3-bc4b-f5538a5e6c10

📥 Commits

Reviewing files that changed from the base of the PR and between 19759df and 9be91e5.

📒 Files selected for processing (1)

AGENTS.md

coderabbitai · 2026-03-15T14:39:48Z

AGENTS.md

+- Use absolute imports within packages: `from .routes import router`
+- Group imports: stdlib → third-party → local


⚠️ Potential issue | 🟡 Minor

Fix contradictory Python import guidance.

The text says to use absolute imports, but the example is a relative import (from .routes import router). Please align the rule and example to the same convention.

Suggested doc fix

-- Use absolute imports within packages: `from .routes import router` +- Use relative imports within packages: `from .routes import router`

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- Use absolute imports within packages: `from .routes import router`

- Group imports: stdlib → third-party → local

- Use relative imports within packages: `from .routes import router`

- Group imports: stdlib → third-party → local

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@AGENTS.md` around lines 78 - 79, The guidance contradicts itself: the sentence "Use absolute imports within packages" conflicts with the example that uses a relative import (from .routes import router). Update AGENTS.md so the rule and example match—either change the rule to recommend relative imports within packages and keep the example referencing .routes and router, or replace the example with an absolute import form that references the package-level routes module and router symbol; ensure the grouping guideline (stdlib → third-party → local) remains consistent with the chosen import style.

coderabbitai · 2026-03-15T14:39:48Z

AGENTS.md

+```
+docs/architecture/
+├── systems/          # Individual system documentation
+├── reference/        # Reference and planning docs
+│   └── strategic/   # Strategic guidance
+├── 01-*-*.md       # ADR-style documents
+
+dashboard/           # Admin UI (Node.js/pnpm)
+infra/              # Terraform IaC
+scripts/            # Deployment automation
+state-service/      # FastAPI state service
+```


⚠️ Potential issue | 🟡 Minor

Add a language identifier to the architecture code fence.

This fenced block is missing a language tag and will fail MD040 lint checks.

Suggested doc fix

-``` +```text docs/architecture/ ├── systems/ # Individual system documentation ├── reference/ # Reference and planning docs │ └── strategic/ # Strategic guidance ├── 01-*-*.md # ADR-style documents @@ state-service/ # FastAPI state service

</details> <details> <summary>🧰 Tools</summary> <details> <summary>🪛 markdownlint-cli2 (0.21.0)</summary> [warning] 186-186: Fenced code blocks should have a language specified (MD040, fenced-code-language) </details> </details> <details> <summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @AGENTS.md around lines 186 - 197, The fenced code block containing the
directory tree starting with "docs/architecture/" is missing a language
identifier and will fail MD040; edit the block in AGENTS.md that begins with the
triple backticks before "docs/architecture/" and add a language tag (e.g., text)
so the fence reads ```text, leaving the directory content unchanged, then save
the file to satisfy the linter.

</details>  

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

infra/env/dev/terraform.tfvars (1)

28-28: Avoid using :latest tag for container images in infrastructure code.

The :latest tag is non-deterministic and can lead to unexpected behavior when the image is updated. Consider pinning to a specific version or SHA digest for reproducible deployments.

♻️ Suggested improvement

-state_service_container_image   = "ghcr.io/phoenixvc/ai-gateway-state-service:latest"
+state_service_container_image   = "ghcr.io/phoenixvc/ai-gateway-state-service:v1.0.0"  # Pin to specific version

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@infra/env/dev/terraform.tfvars` at line 28, The image pinning uses a
non-deterministic :latest tag for state_service_container_image; replace the
"ghcr.io/phoenixvc/ai-gateway-state-service:latest" value in terraform.tfvars
with a specific immutable reference (either a versioned tag like v1.2.3 or a SHA
digest e.g. `@sha256`:...) to ensure reproducible deployments and update any
deployment automation that consumes state_service_container_image to accept the
pinned tag; reference the state_service_container_image variable when making the
change.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@infra/env/dev/terraform.tfvars`:
- Around line 29-30: Remove the hardcoded credentials by deleting the
state_service_registry_username and state_service_registry_password assignments
from terraform.tfvars (they are currently set to "phoenixvc" and "ghp_xxx");
rely on the module's conditional auth flag use_registry_auth and supply
credentials at runtime via environment variables (e.g., export
TF_VAR_state_service_registry_username and
TF_VAR_state_service_registry_password in CI/CD) so no secrets are committed to
the repo.

---

Nitpick comments:
In `@infra/env/dev/terraform.tfvars`:
- Line 28: The image pinning uses a non-deterministic :latest tag for
state_service_container_image; replace the
"ghcr.io/phoenixvc/ai-gateway-state-service:latest" value in terraform.tfvars
with a specific immutable reference (either a versioned tag like v1.2.3 or a SHA
digest e.g. `@sha256`:...) to ensure reproducible deployments and update any
deployment automation that consumes state_service_container_image to accept the
pinned tag; reference the state_service_container_image variable when making the
change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d83fd7b1-d5b5-4046-87d4-65b3a4222721

📥 Commits

Reviewing files that changed from the base of the PR and between 9be91e5 and 72c004c.

📒 Files selected for processing (1)

infra/env/dev/terraform.tfvars

coderabbitai · 2026-03-15T15:27:08Z

infra/env/dev/terraform.tfvars

+state_service_registry_username = "phoenixvc"
+state_service_registry_password = "ghp_xxx"


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Description: Search for potential hardcoded credentials in tfvars files # Expected: No matches for credential patterns like ghp_, password =, token =, secret = echo "=== Checking for GitHub PAT patterns ===" rg -n 'ghp_[a-zA-Z0-9_]+' --glob '*.tfvars' echo "=== Checking for password/token assignments with non-empty values ===" rg -nP '(password|token|secret)\s*=\s*"[^"]+[^"]+"' --glob '*.tfvars' | grep -v '""'

Repository: phoenixvc/ai-gateway

Length of output: 330

🏁 Script executed:

#!/bin/bash # Find all .tf files in infra directory echo "=== Finding all .tf files in infra directory ===" find infra -name "*.tf" -type f echo "" echo "=== Searching for state_service_registry_password variable definition ===" rg -n 'state_service_registry_password' --type tf -B 2 -A 2 echo "" echo "=== Searching for state_service_registry_username variable definition ===" rg -n 'state_service_registry_username' --type tf -B 2 -A 2

Repository: phoenixvc/ai-gateway

Length of output: 5517

🏁 Script executed:

#!/bin/bash echo "=== Reading dev/variables.tf for state_service_registry variables ===" sed -n '189,210p' infra/env/dev/variables.tf echo "" echo "=== Checking if there are any defaults in these variable blocks ===" rg -A 10 'variable "state_service_registry_(username|password)"' infra/env/dev/variables.tf

Repository: phoenixvc/ai-gateway

Length of output: 1330

🏁 Script executed:

#!/bin/bash echo "=== Reading state_service_aca module main.tf to see credential usage ===" cat -n infra/modules/state_service_aca/main.tf echo "" echo "=== Checking state_service_aca variables.tf ===" cat -n infra/modules/state_service_aca/variables.tf

Repository: phoenixvc/ai-gateway

Length of output: 6408

Remove hardcoded credentials from terraform.tfvars.

The state_service_registry_password value (ghp_xxx) appears to be a GitHub Personal Access Token placeholder. Hardcoding credentials in tfvars files committed to version control is a security risk:

The pattern may be mistaken for a real (redacted) token

Developers may replace it with real credentials and accidentally commit

These variables have defaults of "" and the module gracefully handles empty credentials via conditional logic (use_registry_auth), so they can be safely omitted

Remove these values from the tfvars file and provide them at runtime via CI/CD environment variables (e.g., TF_VAR_state_service_registry_password).

Proposed fix

# State Service state_service_container_image = "ghcr.io/phoenixvc/ai-gateway-state-service:latest" -state_service_registry_username = "phoenixvc" -state_service_registry_password = "ghp_xxx" +# Registry credentials should be provided via TF_VAR_* environment variables +# state_service_registry_username = "" # Set via TF_VAR_state_service_registry_username +# state_service_registry_password = "" # Set via TF_VAR_state_service_registry_password (sensitive)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

state_service_registry_username = "phoenixvc"

state_service_registry_password = "ghp_xxx"

# State Service

state_service_container_image = "ghcr.io/phoenixvc/ai-gateway-state-service:latest"

# Registry credentials should be provided via TF_VAR_* environment variables

# state_service_registry_username = "" # Set via TF_VAR_state_service_registry_username

# state_service_registry_password = "" # Set via TF_VAR_state_service_registry_password (sensitive)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@infra/env/dev/terraform.tfvars` around lines 29 - 30, Remove the hardcoded credentials by deleting the state_service_registry_username and state_service_registry_password assignments from terraform.tfvars (they are currently set to "phoenixvc" and "ghp_xxx"); rely on the module's conditional auth flag use_registry_auth and supply credentials at runtime via environment variables (e.g., export TF_VAR_state_service_registry_username and TF_VAR_state_service_registry_password in CI/CD) so no secrets are committed to the repo.

JustAGhosT · 2026-03-15T15:28:32Z

@claude help here Verify each finding against the current code and only fix it if needed.

Inline comments:
In @infra/env/dev/terraform.tfvars:

Around line 29-30: Remove the hardcoded credentials by deleting the
state_service_registry_username and state_service_registry_password assignments
from terraform.tfvars (they are currently set to "phoenixvc" and "ghp_xxx");
rely on the module's conditional auth flag use_registry_auth and supply
credentials at runtime via environment variables (e.g., export
TF_VAR_state_service_registry_username and
TF_VAR_state_service_registry_password in CI/CD) so no secrets are committed to
the repo.

Nitpick comments:
In @infra/env/dev/terraform.tfvars:

Line 28: The image pinning uses a non-deterministic :latest tag for
state_service_container_image; replace the
"ghcr.io/phoenixvc/ai-gateway-state-service:latest" value in terraform.tfvars
with a specific immutable reference (either a versioned tag like v1.2.3 or a SHA
digest e.g. @sha256:...) to ensure reproducible deployments and update any
deployment automation that consumes state_service_container_image to accept the
pinned tag; reference the state_service_container_image variable when making the
change.

JustAGhosT added 4 commits March 15, 2026 13:29

fix(workflows): rename smoke_models_wait_sleep to smoke_models_wait_a…

02a8bb7

…ttempts and update default value

coderabbitai bot reviewed Mar 15, 2026

View reviewed changes

.github/workflows/deploy-environment.yaml Show resolved Hide resolved

docs/architecture/04-observability-telemetry.md Outdated Show resolved Hide resolved

JustAGhosT had a problem deploying to dev March 15, 2026 12:48 — with GitHub Actions Failure

fix: spacing issues in yaml

19759df

JustAGhosT temporarily deployed to dev March 15, 2026 13:00 — with GitHub Actions Inactive

JustAGhosT had a problem deploying to dev March 15, 2026 13:02 — with GitHub Actions Failure

coderabbitai bot reviewed Mar 15, 2026

View reviewed changes

JustAGhosT temporarily deployed to dev March 15, 2026 14:37 — with GitHub Actions Inactive

JustAGhosT had a problem deploying to dev March 15, 2026 14:38 — with GitHub Actions Failure

coderabbitai bot reviewed Mar 15, 2026

View reviewed changes

feat(infra): add state service configuration to dev environment

72c004c

JustAGhosT deployed to dev March 15, 2026 15:23 — with GitHub Actions Active

JustAGhosT had a problem deploying to dev March 15, 2026 15:24 — with GitHub Actions Failure

coderabbitai bot reviewed Mar 15, 2026

View reviewed changes

		- Use absolute imports within packages: `from .routes import router`
		- Group imports: stdlib → third-party → local

		state_service_registry_username = "phoenixvc"
		state_service_registry_password = "ghp_xxx"

-state_service_registry_username = "phoenixvc"
-state_service_registry_password = "ghp_xxx"
+# State Service
+state_service_container_image   = "ghcr.io/phoenixvc/ai-gateway-state-service:latest"
+# Registry credentials should be provided via TF_VAR_* environment variables
+# state_service_registry_username = ""  # Set via TF_VAR_state_service_registry_username
+# state_service_registry_password = ""  # Set via TF_VAR_state_service_registry_password (sensitive)

Conversation

JustAGhosT commented Mar 15, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request Checklist

Summary

Validation

Deployment Notes

Staging Toggle (PRs to main)

Risk / Rollback

Summary by CodeRabbit

Uh oh!

blocksorg bot commented Mar 15, 2026

Uh oh!

chatgpt-codex-connector bot commented Mar 15, 2026

Uh oh!

coderabbitai bot commented Mar 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 15, 2026

Choose a reason for hiding this comment

Uh oh!

JustAGhosT commented Mar 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

JustAGhosT commented Mar 15, 2026 •

edited by coderabbitai bot

Loading

Staging Toggle (PRs to `main`)

coderabbitai bot commented Mar 15, 2026 •

edited

Loading