Skip to content

Fix/terra alignment#69

Open
JustAGhosT wants to merge 8 commits intodevfrom
fix/terra-alignment
Open

Fix/terra alignment#69
JustAGhosT wants to merge 8 commits intodevfrom
fix/terra-alignment

Conversation

@JustAGhosT
Copy link
Contributor

@JustAGhosT JustAGhosT commented Mar 15, 2026

Pull Request Checklist

Summary

  • What changed?
  • Why was it needed?

Validation

  • Local checks run (if applicable)
  • Relevant workflow/jobs observed

Deployment Notes

  • No environment/config changes required
  • Environment/config changes required (describe below)

Staging Toggle (PRs to main)

  • Add label run-staging to this PR to enable staging deployment (deploy-staging).
  • Remove label run-staging to skip staging deployment.

Risk / Rollback

  • Risk level: low / medium / high
  • Rollback plan:

Summary by CodeRabbit

  • Documentation

    • Added Prometheus telemetry and retention guidance, new API routing docs for Azure OpenAI via LiteLLM, Webhook Auth in gateway docs, multiple formatting/clarity updates across architecture references, and a new AGENTS guide.
  • Improvements

    • Enhanced deployment workflows with per-environment settings and environment selector, configurable smoke-test attempt count, expanded optional state/dashboard and monitoring configuration, and support for mapping additional credentials/environment variables.

- Add C3 connection in container architecture flowchart
- Rename "Azure Monitor" to "Application Insights" in telemetry diagram
- Add Prometheus as a telemetry sink with implementation details
- Document retention policies for Application Insights
- Update matrix gateway JSON example with new field names
- Fix confidence threshold in matrix-gateway from 0.70 to 0.75
- Update SOP suggestion threshold from 0.78 to 0.8
- Fix code block formatting in multiple files
Add environment input parameter to deploy-environment.yaml workflow to specify GitHub environment, improving deployment control and security. Replace hardcoded environment settings in deploy.yaml with the new parameter. Also fix code fences in documentation to use text format and update various documentation details.
…ttempts and update default value

# Pull Request Checklist

## Summary

- What changed?
- Why was it needed?

## Validation

- [ ] Local checks run (if applicable)
- [ ] Relevant workflow/jobs observed

## Deployment Notes

- [ ] No environment/config changes required
- [ ] Environment/config changes required (describe below)

## UAT Toggle (PRs to `main`)

- Add label `run-uat` to this PR to enable UAT deployment (`deploy-uat`).
- Remove label `run-uat` to skip UAT deployment.

## Risk / Rollback

- Risk level: low / medium / high
- Rollback plan:
@blocksorg
Copy link

blocksorg bot commented Mar 15, 2026

Mention Blocks like a regular teammate with your question or request:

@blocks review this pull request
@blocks make the following changes ...
@blocks create an issue from what was mentioned in the following comment ...
@blocks explain the following code ...
@blocks are there any security or performance concerns?

Run @blocks /help for more information.

Workspace settings | Disable this message

@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@coderabbitai
Copy link

coderabbitai bot commented Mar 15, 2026

📝 Walkthrough

Walkthrough

Updates CI/CD workflows to add new inputs, per-job environments, secrets, and TF_VAR mappings (Azure auth, Terraform backend, state/dashboard images, Grafana, EXPECTED_AOAI_ENDPOINT_HOST), renames smoke test wait parameter, and applies related infra and documentation adjustments.

Changes

Cohort / File(s) Summary
GitHub Actions: deploy workflows
\.github/workflows/deploy-environment.yaml, \.github/workflows/deploy.yaml
Added inputs (smoke_models_wait_attempts, environment, state/dashboard TF_VARs), new per-job secrets (AZURE_, TF_BACKEND_, EXPECTED_AOAI_ENDPOINT_HOST, STATE_SERVICE_*, DASHBOARD/GRAFANA), mapped env vars/TF_VARs, set job-level environment, and updated smoke-test step to use the new attempts input.
Infra — Application Insights output
infra/modules/aigateway_aca/outputs.tf
Replaced sensitive application_insights_connection_string output with non-sensitive application_insights_name; guidance updated to retrieve connection string from Key Vault.
Infra — state/dashboard token trim change
infra/modules/dashboard_aca/main.tf, infra/modules/state_service_aca/main.tf
Changed whitespace trimming from trim to trimspace for use_shared_token checks, altering inclusion logic for shared-token secrets/env.
Infra env vars
infra/env/dev/terraform.tfvars
Added state_service_container_image, state_service_registry_username, state_service_registry_password assignments for dev environment.
Docs — container & observability
docs/architecture/02-container-architecture.md, docs/architecture/04-observability-telemetry.md
Added Webhook Auth node/edges to gateway diagram; introduced Prometheus as a telemetry sink with /metrics callbacks and retention guidance for Application Insights (Prod 90d, Non-prod 30d).
Docs — systems & routing
docs/architecture/systems/*, docs/architecture/systems/ai-gateway.md
Standardized code fence language to text and added a v1 API routing section describing Azure OpenAI via LiteLLM and routing rules for /v1/responses and /v1/embeddings.
Docs — reference & policies
docs/architecture/reference/*, docs/architecture/reference/strategic/07-deployment-model.md
Updated example payload fields (request_id, label, recommended_tier), added migration note, tightened thresholds (0.70→0.75, 0.78→0.80), expanded fallback pattern, and converted code fences to text.
Docs — planning
docs/planning/request_to_token_attribution.md
Terminology and labeling updates (Required→Recommended, Method A → Preferred), clarified Application Insights wiring via Key Vault, and updated action items.
Docs — minor formatting/others
docs/architecture/reference/slm-*, docs/architecture/systems/*, docs/architecture/systems/codeflow-engine.md
Converted several code fences to text, small formatting tweaks, and minor content adjustments (escaping braces, wording changes).
Ops guide
AGENTS.md
Added AGENTS.md with developer guidance for AI coding agents, tech stack, build/test commands, and style conventions.

Sequence Diagram(s)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐇 I hop through YAML, neat and spry,
Secrets tucked where pipelines lie,
Grafana hums and metrics chime,
Key Vault keeps the strings in time,
I nibble bugs and push — goodbye! 🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description only contains an empty template with no substantive content answering what changed, why it was needed, or addressing validation and risk assessment. Fill in the checklist sections with concrete details about what changed, why changes were needed, validation steps performed, and risk assessment for the deployment.
Title check ❓ Inconclusive The title 'Fix/terra alignment' is vague and uses generic phrasing that doesn't clearly convey what specific alignment issue is being fixed. Provide a more specific and descriptive title that clearly identifies the main change (e.g., 'Align Terraform configuration and GitHub workflow variables' or similar).
✅ Passed checks (1 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/terra-alignment
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
docs/architecture/reference/matrix-gateway.md (1)

48-49: Document migration impact for renamed response fields.

recommended_tier/cacheable look consistent here, but this is a contract change for clients. Please add a short migration note (old → new fields, deprecation window/version) in this doc section to reduce integration risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/reference/matrix-gateway.md` around lines 48 - 49, Add a
short migration note in the Matrix Gateway reference section describing the
rename/contract change for the response fields: explicitly state the old field
names (e.g., if any) and the new fields recommended_tier and cacheable, show
example mapping (old → new), and declare the deprecation window/version and
recommended upgrade steps for clients; update the section near the response
schema so consumers know the change, timeline, and how to handle backwards
compatibility (e.g., fallback logic or versioned endpoints).
docs/architecture/02-container-architecture.md (1)

60-60: Model webhook verification explicitly in the flow.

Line 60 introduces direct GitHub Webhook ingress; consider adding a distinct “signature verification/auth” step in the diagram so trust boundaries are explicit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/02-container-architecture.md` at line 60, The diagram
currently connects C3 directly to G1 via "C3 --> G1"; insert an explicit
verification/auth step between them (e.g., create a node like "W_Verify" or
"WebhookAuth" and change the flow to "C3 --> WebhookAuth --> G1") and label that
node to indicate signature validation/authorization so the trust boundary for
GitHub webhook ingress is clear in the diagram and documentation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/deploy-environment.yaml:
- Around line 31-35: Add a missing input declaration for smoke_models_wait_sleep
to the reusable workflow inputs so inputs.smoke_models_wait_sleep is defined;
specifically add an input block named smoke_models_wait_sleep (e.g., required:
false, type: string, default: "10", description: Sleep seconds between model
availability checks) alongside the existing smoke_models_wait_attempts
declaration so callers that pass smoke_models_wait_sleep continue to work.

In `@docs/architecture/04-observability-telemetry.md`:
- Line 79: The description reverses Prometheus' role: update the sentence that
says "The Prometheus exporter exposes a `/metrics` endpoint that scrapes
application metrics" to indicate that Prometheus scrapes the `/metrics` endpoint
(e.g., "The Prometheus exporter exposes a `/metrics` endpoint which is scraped
by Prometheus") and ensure the surrounding text still references
`success_callback`/`failure_callback` containing "prometheus" and the container
config at `infra/modules/aigateway_aca/main.tf:95-113`.

---

Nitpick comments:
In `@docs/architecture/02-container-architecture.md`:
- Line 60: The diagram currently connects C3 directly to G1 via "C3 --> G1";
insert an explicit verification/auth step between them (e.g., create a node like
"W_Verify" or "WebhookAuth" and change the flow to "C3 --> WebhookAuth --> G1")
and label that node to indicate signature validation/authorization so the trust
boundary for GitHub webhook ingress is clear in the diagram and documentation.

In `@docs/architecture/reference/matrix-gateway.md`:
- Around line 48-49: Add a short migration note in the Matrix Gateway reference
section describing the rename/contract change for the response fields:
explicitly state the old field names (e.g., if any) and the new fields
recommended_tier and cacheable, show example mapping (old → new), and declare
the deprecation window/version and recommended upgrade steps for clients; update
the section near the response schema so consumers know the change, timeline, and
how to handle backwards compatibility (e.g., fallback logic or versioned
endpoints).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e51ac272-e68f-4e8e-a52c-d8a36251d57f

📥 Commits

Reviewing files that changed from the base of the PR and between 2bc7860 and 0e4fff2.

📒 Files selected for processing (16)
  • .github/workflows/deploy-environment.yaml
  • .github/workflows/deploy.yaml
  • docs/architecture/02-container-architecture.md
  • docs/architecture/04-observability-telemetry.md
  • docs/architecture/reference/matrix-gateway.md
  • docs/architecture/reference/matrix-rooivalk.md
  • docs/architecture/reference/slm-implementation-matrix.md
  • docs/architecture/reference/slm-management-plan.md
  • docs/architecture/reference/strategic/07-deployment-model.md
  • docs/architecture/systems/agentkit-forge.md
  • docs/architecture/systems/ai-gateway.md
  • docs/architecture/systems/codeflow-engine.md
  • docs/architecture/systems/cognitive-mesh.md
  • docs/architecture/systems/phoenix-rooivalk.md
  • docs/planning/request_to_token_attribution.md
  • infra/modules/aigateway_aca/outputs.tf

- Add new workflow parameter to control sleep between model availability checks
- Update container architecture diagram to include Webhook Auth component
- Improve Prometheus metrics documentation clarity
- Add migration note for Matrix Gateway response contract changes in v1.0.0
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
infra/modules/state_service_aca/main.tf (1)

46-52: Consider trimming the secret value for consistency.

The use_shared_token check now correctly uses trimspace() to detect non-empty tokens, but the secret value itself (line 50) still stores the raw var.state_service_shared_token which may contain trailing newlines. This could cause authentication mismatches if the consuming service performs a strict comparison.

♻️ Proposed fix to trim the secret value
   dynamic "secret" {
     for_each = local.use_shared_token ? [1] : []
     content {
       name  = "state-service-shared-token"
-      value = var.state_service_shared_token
+      value = trimspace(var.state_service_shared_token)
     }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/modules/state_service_aca/main.tf` around lines 46 - 52, The secret
value in the dynamic "secret" block uses the raw var.state_service_shared_token
which may include trailing whitespace/newlines; update the content.value to pass
the variable through trimspace (i.e., use
trimspace(var.state_service_shared_token)) so the stored secret is normalized,
matching the earlier use of trimspace() in local.use_shared_token and avoiding
auth mismatches.
infra/modules/dashboard_aca/main.tf (1)

37-43: Same concern: trim the secret value for consistency.

As noted in infra/modules/state_service_aca/main.tf, the secret value should also be trimmed to avoid storing whitespace-padded tokens.

♻️ Proposed fix to trim the secret value
   dynamic "secret" {
     for_each = local.use_shared_token ? [1] : []
     content {
       name  = "state-service-shared-token"
-      value = var.state_service_shared_token
+      value = trimspace(var.state_service_shared_token)
     }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/modules/dashboard_aca/main.tf` around lines 37 - 43, The dynamic
"secret" block that creates the "state-service-shared-token" currently uses
value = var.state_service_shared_token and may store whitespace-padded tokens;
update the content to use Terraform's trimspace function (value =
trimspace(var.state_service_shared_token)) so the stored secret is normalized;
this change should be made inside the dynamic "secret" block that is conditional
on local.use_shared_token to match the fix in the state_service_aca module.
.github/workflows/deploy-environment.yaml (2)

109-109: Hardcoded secrets expiration date may require future maintenance.

The value "2027-03-31T00:00:00Z" is hardcoded in the workflow. While currently valid (per the Terraform validation in infra/env/dev/variables.tf:88-95), this will require a workflow update before March 2027 to avoid deployment failures when the date is no longer in the future.

Consider parameterizing this as an input or secret to allow environment-specific configuration without workflow changes.

♻️ Suggested parameterization
+      secrets_expiration_date:
+        required: false
+        type: string
+        default: "2027-03-31T00:00:00Z"
+        description: Expiration date for Key Vault secrets (ISO-8601 UTC format)
...
-  TF_VAR_secrets_expiration_date: "2027-03-31T00:00:00Z"
+  TF_VAR_secrets_expiration_date: ${{ inputs.secrets_expiration_date }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/deploy-environment.yaml at line 109, The workflow
hardcodes TF_VAR_secrets_expiration_date which will become invalid in 2027;
change it to a configurable input or secret so it can be updated without editing
the workflow: replace the literal "2027-03-31T00:00:00Z" with a GitHub Actions
input (e.g., workflow_dispatch/input) or a repo/organization secret and fall
back to a computed value if not provided; update any usages referencing
TF_VAR_secrets_expiration_date accordingly so the workflow reads the value from
inputs or secrets instead of the hardcoded string.

110-110: Consider pinning dashboard image version in fallback.

The fallback ghcr.io/phoenixvc/ai-gateway-dashboard:latest uses the :latest tag, which can lead to unpredictable behavior if the image changes between deployments. While this is only used when DASHBOARD_CONTAINER_IMAGE secret is not configured, consider using a pinned version or SHA digest for reproducibility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/deploy-environment.yaml at line 110, The fallback for
TF_VAR_dashboard_container_image currently uses the mutable tag ':latest';
update the fallback value used when DASHBOARD_CONTAINER_IMAGE is not provided to
a pinned image tag or SHA digest (e.g., replace
'ghcr.io/phoenixvc/ai-gateway-dashboard:latest' with a specific version tag or
`@sha256` digest) so deployments are reproducible; modify the
TF_VAR_dashboard_container_image assignment in the workflow to reference the
pinned tag/digest instead of ':latest' while keeping the secret override via
DASHBOARD_CONTAINER_IMAGE.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.github/workflows/deploy-environment.yaml:
- Line 109: The workflow hardcodes TF_VAR_secrets_expiration_date which will
become invalid in 2027; change it to a configurable input or secret so it can be
updated without editing the workflow: replace the literal "2027-03-31T00:00:00Z"
with a GitHub Actions input (e.g., workflow_dispatch/input) or a
repo/organization secret and fall back to a computed value if not provided;
update any usages referencing TF_VAR_secrets_expiration_date accordingly so the
workflow reads the value from inputs or secrets instead of the hardcoded string.
- Line 110: The fallback for TF_VAR_dashboard_container_image currently uses the
mutable tag ':latest'; update the fallback value used when
DASHBOARD_CONTAINER_IMAGE is not provided to a pinned image tag or SHA digest
(e.g., replace 'ghcr.io/phoenixvc/ai-gateway-dashboard:latest' with a specific
version tag or `@sha256` digest) so deployments are reproducible; modify the
TF_VAR_dashboard_container_image assignment in the workflow to reference the
pinned tag/digest instead of ':latest' while keeping the secret override via
DASHBOARD_CONTAINER_IMAGE.

In `@infra/modules/dashboard_aca/main.tf`:
- Around line 37-43: The dynamic "secret" block that creates the
"state-service-shared-token" currently uses value =
var.state_service_shared_token and may store whitespace-padded tokens; update
the content to use Terraform's trimspace function (value =
trimspace(var.state_service_shared_token)) so the stored secret is normalized;
this change should be made inside the dynamic "secret" block that is conditional
on local.use_shared_token to match the fix in the state_service_aca module.

In `@infra/modules/state_service_aca/main.tf`:
- Around line 46-52: The secret value in the dynamic "secret" block uses the raw
var.state_service_shared_token which may include trailing whitespace/newlines;
update the content.value to pass the variable through trimspace (i.e., use
trimspace(var.state_service_shared_token)) so the stored secret is normalized,
matching the earlier use of trimspace() in local.use_shared_token and avoiding
auth mismatches.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cfbc39a6-c9cd-4b1a-819a-6b28a9ebaa94

📥 Commits

Reviewing files that changed from the base of the PR and between 0e4fff2 and 19759df.

📒 Files selected for processing (6)
  • .github/workflows/deploy-environment.yaml
  • docs/architecture/02-container-architecture.md
  • docs/architecture/04-observability-telemetry.md
  • docs/architecture/reference/matrix-gateway.md
  • infra/modules/dashboard_aca/main.tf
  • infra/modules/state_service_aca/main.tf
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/architecture/reference/matrix-gateway.md
  • docs/architecture/02-container-architecture.md

Add comprehensive documentation for AI coding agents working in this repository, including:
- Project overview and tech stack
- Build/lint/test commands for each component
- Code style guidelines for Python, JavaScript, and Terraform
- Architecture overview and key files
- Prerequisites and pre-commit checks
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@AGENTS.md`:
- Around line 186-197: The fenced code block containing the directory tree
starting with "docs/architecture/" is missing a language identifier and will
fail MD040; edit the block in AGENTS.md that begins with the triple backticks
before "docs/architecture/" and add a language tag (e.g., text) so the fence
reads ```text, leaving the directory content unchanged, then save the file to
satisfy the linter.
- Around line 78-79: The guidance contradicts itself: the sentence "Use absolute
imports within packages" conflicts with the example that uses a relative import
(from .routes import router). Update AGENTS.md so the rule and example
match—either change the rule to recommend relative imports within packages and
keep the example referencing .routes and router, or replace the example with an
absolute import form that references the package-level routes module and router
symbol; ensure the grouping guideline (stdlib → third-party → local) remains
consistent with the chosen import style.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d30d3a92-61be-43e3-bc4b-f5538a5e6c10

📥 Commits

Reviewing files that changed from the base of the PR and between 19759df and 9be91e5.

📒 Files selected for processing (1)
  • AGENTS.md

Comment on lines +78 to +79
- Use absolute imports within packages: `from .routes import router`
- Group imports: stdlib → third-party → local
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix contradictory Python import guidance.

The text says to use absolute imports, but the example is a relative import (from .routes import router). Please align the rule and example to the same convention.

Suggested doc fix
-- Use absolute imports within packages: `from .routes import router`
+- Use relative imports within packages: `from .routes import router`
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- Use absolute imports within packages: `from .routes import router`
- Group imports: stdlib → third-party → local
- Use relative imports within packages: `from .routes import router`
- Group imports: stdlib → third-party → local
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@AGENTS.md` around lines 78 - 79, The guidance contradicts itself: the
sentence "Use absolute imports within packages" conflicts with the example that
uses a relative import (from .routes import router). Update AGENTS.md so the
rule and example match—either change the rule to recommend relative imports
within packages and keep the example referencing .routes and router, or replace
the example with an absolute import form that references the package-level
routes module and router symbol; ensure the grouping guideline (stdlib →
third-party → local) remains consistent with the chosen import style.

Comment on lines +186 to +197
```
docs/architecture/
├── systems/ # Individual system documentation
├── reference/ # Reference and planning docs
│ └── strategic/ # Strategic guidance
├── 01-*-*.md # ADR-style documents

dashboard/ # Admin UI (Node.js/pnpm)
infra/ # Terraform IaC
scripts/ # Deployment automation
state-service/ # FastAPI state service
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add a language identifier to the architecture code fence.

This fenced block is missing a language tag and will fail MD040 lint checks.

Suggested doc fix
-```
+```text
 docs/architecture/
 ├── systems/          # Individual system documentation
 ├── reference/        # Reference and planning docs
 │   └── strategic/   # Strategic guidance
 ├── 01-*-*.md       # ADR-style documents
@@
 state-service/      # FastAPI state service
</details>

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 markdownlint-cli2 (0.21.0)</summary>

[warning] 186-186: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @AGENTS.md around lines 186 - 197, The fenced code block containing the
directory tree starting with "docs/architecture/" is missing a language
identifier and will fail MD040; edit the block in AGENTS.md that begins with the
triple backticks before "docs/architecture/" and add a language tag (e.g., text)
so the fence reads ```text, leaving the directory content unchanged, then save
the file to satisfy the linter.


</details>

<!-- fingerprinting:phantom:triton:hawk -->

<!-- This is an auto-generated comment by CodeRabbit -->

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
infra/env/dev/terraform.tfvars (1)

28-28: Avoid using :latest tag for container images in infrastructure code.

The :latest tag is non-deterministic and can lead to unexpected behavior when the image is updated. Consider pinning to a specific version or SHA digest for reproducible deployments.

♻️ Suggested improvement
-state_service_container_image   = "ghcr.io/phoenixvc/ai-gateway-state-service:latest"
+state_service_container_image   = "ghcr.io/phoenixvc/ai-gateway-state-service:v1.0.0"  # Pin to specific version
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/env/dev/terraform.tfvars` at line 28, The image pinning uses a
non-deterministic :latest tag for state_service_container_image; replace the
"ghcr.io/phoenixvc/ai-gateway-state-service:latest" value in terraform.tfvars
with a specific immutable reference (either a versioned tag like v1.2.3 or a SHA
digest e.g. `@sha256`:...) to ensure reproducible deployments and update any
deployment automation that consumes state_service_container_image to accept the
pinned tag; reference the state_service_container_image variable when making the
change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@infra/env/dev/terraform.tfvars`:
- Around line 29-30: Remove the hardcoded credentials by deleting the
state_service_registry_username and state_service_registry_password assignments
from terraform.tfvars (they are currently set to "phoenixvc" and "ghp_xxx");
rely on the module's conditional auth flag use_registry_auth and supply
credentials at runtime via environment variables (e.g., export
TF_VAR_state_service_registry_username and
TF_VAR_state_service_registry_password in CI/CD) so no secrets are committed to
the repo.

---

Nitpick comments:
In `@infra/env/dev/terraform.tfvars`:
- Line 28: The image pinning uses a non-deterministic :latest tag for
state_service_container_image; replace the
"ghcr.io/phoenixvc/ai-gateway-state-service:latest" value in terraform.tfvars
with a specific immutable reference (either a versioned tag like v1.2.3 or a SHA
digest e.g. `@sha256`:...) to ensure reproducible deployments and update any
deployment automation that consumes state_service_container_image to accept the
pinned tag; reference the state_service_container_image variable when making the
change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d83fd7b1-d5b5-4046-87d4-65b3a4222721

📥 Commits

Reviewing files that changed from the base of the PR and between 9be91e5 and 72c004c.

📒 Files selected for processing (1)
  • infra/env/dev/terraform.tfvars

Comment on lines +29 to +30
state_service_registry_username = "phoenixvc"
state_service_registry_password = "ghp_xxx"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Search for potential hardcoded credentials in tfvars files
# Expected: No matches for credential patterns like ghp_, password =, token =, secret =

echo "=== Checking for GitHub PAT patterns ==="
rg -n 'ghp_[a-zA-Z0-9_]+' --glob '*.tfvars'

echo "=== Checking for password/token assignments with non-empty values ==="
rg -nP '(password|token|secret)\s*=\s*"[^"]+[^"]+"' --glob '*.tfvars' | grep -v '""'

Repository: phoenixvc/ai-gateway

Length of output: 330


🏁 Script executed:

#!/bin/bash

# Find all .tf files in infra directory
echo "=== Finding all .tf files in infra directory ==="
find infra -name "*.tf" -type f

echo ""
echo "=== Searching for state_service_registry_password variable definition ==="
rg -n 'state_service_registry_password' --type tf -B 2 -A 2

echo ""
echo "=== Searching for state_service_registry_username variable definition ==="
rg -n 'state_service_registry_username' --type tf -B 2 -A 2

Repository: phoenixvc/ai-gateway

Length of output: 5517


🏁 Script executed:

#!/bin/bash

echo "=== Reading dev/variables.tf for state_service_registry variables ==="
sed -n '189,210p' infra/env/dev/variables.tf

echo ""
echo "=== Checking if there are any defaults in these variable blocks ==="
rg -A 10 'variable "state_service_registry_(username|password)"' infra/env/dev/variables.tf

Repository: phoenixvc/ai-gateway

Length of output: 1330


🏁 Script executed:

#!/bin/bash

echo "=== Reading state_service_aca module main.tf to see credential usage ==="
cat -n infra/modules/state_service_aca/main.tf

echo ""
echo "=== Checking state_service_aca variables.tf ==="
cat -n infra/modules/state_service_aca/variables.tf

Repository: phoenixvc/ai-gateway

Length of output: 6408


Remove hardcoded credentials from terraform.tfvars.

The state_service_registry_password value (ghp_xxx) appears to be a GitHub Personal Access Token placeholder. Hardcoding credentials in tfvars files committed to version control is a security risk:

  1. The pattern may be mistaken for a real (redacted) token
  2. Developers may replace it with real credentials and accidentally commit
  3. These variables have defaults of "" and the module gracefully handles empty credentials via conditional logic (use_registry_auth), so they can be safely omitted

Remove these values from the tfvars file and provide them at runtime via CI/CD environment variables (e.g., TF_VAR_state_service_registry_password).

Proposed fix
 # State Service
 state_service_container_image   = "ghcr.io/phoenixvc/ai-gateway-state-service:latest"
-state_service_registry_username = "phoenixvc"
-state_service_registry_password = "ghp_xxx"
+# Registry credentials should be provided via TF_VAR_* environment variables
+# state_service_registry_username = ""  # Set via TF_VAR_state_service_registry_username
+# state_service_registry_password = ""  # Set via TF_VAR_state_service_registry_password (sensitive)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
state_service_registry_username = "phoenixvc"
state_service_registry_password = "ghp_xxx"
# State Service
state_service_container_image = "ghcr.io/phoenixvc/ai-gateway-state-service:latest"
# Registry credentials should be provided via TF_VAR_* environment variables
# state_service_registry_username = "" # Set via TF_VAR_state_service_registry_username
# state_service_registry_password = "" # Set via TF_VAR_state_service_registry_password (sensitive)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/env/dev/terraform.tfvars` around lines 29 - 30, Remove the hardcoded
credentials by deleting the state_service_registry_username and
state_service_registry_password assignments from terraform.tfvars (they are
currently set to "phoenixvc" and "ghp_xxx"); rely on the module's conditional
auth flag use_registry_auth and supply credentials at runtime via environment
variables (e.g., export TF_VAR_state_service_registry_username and
TF_VAR_state_service_registry_password in CI/CD) so no secrets are committed to
the repo.

@JustAGhosT
Copy link
Contributor Author

@claude help here Verify each finding against the current code and only fix it if needed.

Inline comments:
In @infra/env/dev/terraform.tfvars:

  • Around line 29-30: Remove the hardcoded credentials by deleting the
    state_service_registry_username and state_service_registry_password assignments
    from terraform.tfvars (they are currently set to "phoenixvc" and "ghp_xxx");
    rely on the module's conditional auth flag use_registry_auth and supply
    credentials at runtime via environment variables (e.g., export
    TF_VAR_state_service_registry_username and
    TF_VAR_state_service_registry_password in CI/CD) so no secrets are committed to
    the repo.

Nitpick comments:
In @infra/env/dev/terraform.tfvars:

  • Line 28: The image pinning uses a non-deterministic :latest tag for
    state_service_container_image; replace the
    "ghcr.io/phoenixvc/ai-gateway-state-service:latest" value in terraform.tfvars
    with a specific immutable reference (either a versioned tag like v1.2.3 or a SHA
    digest e.g. @sha256:...) to ensure reproducible deployments and update any
    deployment automation that consumes state_service_container_image to accept the
    pinned tag; reference the state_service_container_image variable when making the
    change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant