Eai 5821 evaluate envoy gateway #705

Draft
johnl-amd wants to merge 22 commits into main from EAI_5821_evaluate_envoy_gateway

@johnl-amd

Validated locally (local Kind cluster, envoy-gateway branch):

  • Login flow via Keycloak completes correctly through envoy-gateway
  • AIRM and AIWB UIs accessible and functional
  • Model deployment via AIWB creates an HTTPRoute in the workbench namespace — aim-engine clusterRuntimeConfig
    is working correctly
  • Gateway accepts the HTTPRoute once the predictor pod becomes ready

Change added in this branch:

  • Enabled clusterRuntimeConfig in aim-engine values (sources/aim-engine/0.2.2/values.yaml) pointing at
    envoy-gateway-system — without this, HTTPRoutes for deployed models were not being created

Still to validate on hosted cluster:

  • HTTPRoute acceptance with real TLS and real DNS
  • Model inference traffic actually reaches the predictor through the gateway
  • Any existing deployed models/InferenceServices pick up routing correctly after the config is applied

woojae-siloai and others added 22 commits April 23, 2026 11:28
- Update all HTTPRoute parentRefs from kgateway-system to envoy-gateway-system
- ArgoCD, Gitea, Openbao, Keycloak, and MinIO HTTPRoutes updated
- Maintains exact functionality while pointing to new envoy-gateway location
- Part of kgateway to envoy-gateway v1.7.1 migration

Files updated:
- sources/argocd-config/http-route.yaml
- sources/gitea-config/templates/gitea-httproute.yaml
- sources/keycloak-config/templates/keycloak-httproute.yaml
- sources/minio-tenant-config/templates/minio-httproute.yaml
- sources/openbao-config/0.1.0/templates/openbao-httproute.yaml
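
For readers unfamiliar with the change, the parentRefs update described in this commit would look roughly like the sketch below. This is an illustration only, not a copy of any file in the repository: the route name, namespace, hostname, and backend service are hypothetical, and the gateway name `https` is an assumption. Only the two gateway namespaces come from the commit message.

```yaml
# Hypothetical HTTPRoute showing the parentRefs namespace change.
# All names except the two gateway namespaces are placeholders.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example-route            # hypothetical
  namespace: example-namespace   # hypothetical
spec:
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: https                      # assumed gateway name
      namespace: envoy-gateway-system  # was: kgateway-system
  hostnames:
    - example.example.com        # hypothetical
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: example-service  # hypothetical
          port: 80
```

Because the route lives in a different namespace than the Gateway, the Gateway listener's `allowedRoutes` must permit routes from that namespace; otherwise the route will not be accepted.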

- Add explicit Namespace resource creation before other resources
- Remove redundant namespace creation logic from script
- Add proper error handling for missing namespace
- Ensures namespace exists before ServiceAccount and Job creation
@johnl-amd
Author

Validated this branch locally on a Kind cluster with envoy-gateway. Everything works end-to-end. Added one commit to enable clusterRuntimeConfig in aim-engine so HTTPRoutes for deployed models point at envoy-gateway-system instead of the default kgateway-system.

A few questions while reviewing the diff:

1. Duplicate SecurityPolicy for ExtAuth
Both sources/envoy-gateway-config/templates/security-policy-extauth.yaml and sources/cluster-auth/0.5.0/templates/security-policy-extauth.yaml create a SecurityPolicy targeting the same https gateway. One has cluster-auth hardcoded, the other uses the Helm template name. Which one is intentional? Having two competing auth policies on the same gateway could cause unexpected behavior.

2. TLS migration jobs are not in ArgoCD
job-cluster-tls-copy.yaml and job-cluster-tls-migration.yaml are raw manifests at the repo root — not part of any Helm chart or ArgoCD application. They won't run automatically on deployment. Is there a manual step required here that should be documented?

3. TLS jobs assume migration from kgateway
Both jobs check if cluster-tls exists in kgateway-system and exit successfully if it doesn't. On a fresh cluster that never had kgateway, envoy-gateway ends up with no TLS certificate. Is there a separate mechanism for fresh installs, or is this branch only intended for migrating existing clusters?
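
To make question 1 concrete, here is a minimal sketch of what a single ExtAuth SecurityPolicy attached to the gateway might look like. It is not taken from either template in the diff: the policy name, the backend port, and the assumption that the auth service is an HTTP backend named `cluster-auth` in the same namespace are all placeholders for illustration.

```yaml
# Hypothetical single SecurityPolicy for ExtAuth on the "https" gateway.
# Name, port, and backend details are assumptions, not values from the diff.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: SecurityPolicy
metadata:
  name: extauth-example             # hypothetical
  namespace: envoy-gateway-system   # policy must live in the target Gateway's namespace
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: https
  extAuth:
    http:
      backendRefs:
        - name: cluster-auth        # assumed auth service name
          port: 8080                # hypothetical port
```

If both templates render a policy like this against the same Gateway, only one of them is expected to take effect, so it is probably worth keeping a single source for it and checking the `status` of each SecurityPolicy on the cluster to see which one was actually accepted.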
