ci(e2e): fix namespace apply order + run a cluster per engine #40

Merged: joy-software merged 3 commits into main from fix/e2e-namespace-order on Jun 7, 2026
@joy-software (Contributor) commented Jun 7, 2026

Problems

  1. Apply step failed: kubectl apply -f deploy/operator orders files alphabetically, so deployment.yaml (namespaced to elpio-system) was applied before rbac.yaml, which defines that Namespace. The apply failed with: namespaces "elpio-system" not found.
  2. Wake test failed: the suite has two engine-specific scale-to-zero tests, but CI ran a single operator (ELPIO_ENGINE=keda). The Knative test wakes via the cluster-local Service, which relies on the Knative activator catching the request; under the keda operator a scaled-to-zero Service has no endpoints, so the curl timed out (exit 28).
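
The ordering problem in item 1 can be reproduced without a cluster. The following is a minimal sketch, assuming the directory holds only the two files named above (the real deploy/operator may contain more). The helper `apply_order` is hypothetical and exists only for illustration; it is not part of the repo.

```python
def apply_order(files, first="rbac.yaml"):
    """Return files with `first` moved to the front and the rest sorted.

    Hypothetical helper for illustration only.
    """
    rest = sorted(f for f in files if f != first)
    return ([first] if first in files else []) + rest


files = ["rbac.yaml", "deployment.yaml"]  # assumed directory contents

# Alphabetical order, which is how the PR describes the directory apply:
print(sorted(files))       # ['deployment.yaml', 'rbac.yaml']

# Namespace-defining manifest first, then everything else:
print(apply_order(files))  # ['rbac.yaml', 'deployment.yaml']
```

In the alphabetical order the Deployment arrives before the Namespace it lives in, which matches the "not found" error quoted above.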

Fixes

  • Apply deploy/operator/rbac.yaml first (creates the namespace), then the rest — the same two-step the CLI installer already does.
  • Make e2e a matrix over [knative, keda]: each leg runs its own kind cluster and operator with that engine, and each test module is skipped (skipif) unless ELPIO_ENGINE matches. Both wake paths (Knative activator and keda-http interceptor) are now exercised, so continue-on-error is dropped and e2e gates the build.
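
The per-engine gating described in the second bullet might look roughly like the sketch below. It assumes the engine is read from the ELPIO_ENGINE environment variable and that an unset variable skips every engine-specific module; `should_skip` is a hypothetical helper, and the actual test modules may be structured differently.

```python
import os

ENGINES = ("knative", "keda")  # the two matrix legs named in the PR


def should_skip(module_engine, env=None):
    """True when the module's engine differs from ELPIO_ENGINE.

    Assumption: an unset ELPIO_ENGINE skips every engine-specific module.
    """
    env = os.environ if env is None else env
    return env.get("ELPIO_ENGINE") != module_engine


# In a pytest module this could gate every test in the file, for example:
#   pytestmark = pytest.mark.skipif(should_skip("knative"),
#                                   reason="requires the knative operator")

# Simulate both matrix legs and list which modules would run in each.
for leg in ENGINES:
    ran = [e for e in ENGINES if not should_skip(e, {"ELPIO_ENGINE": leg})]
    print(leg, "->", ran)
```

Each leg runs exactly one engine's module, so between the two legs both wake paths are covered.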

Note: the earlier integration failure on main was a transient Docker Hub pull timeout, not a code issue.

The e2e job ran a bare `kubectl apply -f deploy/operator`, which orders files
alphabetically and so created deployment.yaml (namespaced to elpio-system)
before rbac.yaml, which is where the elpio-system Namespace is defined. That
failed with 'namespaces "elpio-system" not found'. Apply rbac.yaml first, the
same two-step the CLI installer already does.

The e2e suite has two engine-specific paths that a single operator can't serve
at once: the Knative test wakes via the cluster-local Service (caught by the
Knative activator), while the keda-http test wakes through the add-on
interceptor. Running one operator (keda) made the Knative wake test fail because
a scaled-to-zero keda Service has no endpoints.

Split the job into a matrix over [knative, keda]: each leg runs its own kind
cluster and operator, and each test module skips unless ELPIO_ENGINE matches.
Both paths are now genuinely exercised, so drop continue-on-error and let e2e
gate the build.
@joy-software changed the title from "ci(e2e): apply operator rbac (namespace) before the deployment" to "ci(e2e): fix namespace apply order + run a cluster per engine" on Jun 7, 2026
…onal

The Knative wake test curled the cluster-local host before the KnativeService
route was programmed, so DNS didn't resolve (curl exit 6) and the whole module
finished in ~7s. Wait for the ksvc Ready condition (the route + cluster-local
Service exist) before the wake, and retry the first request while the activator
spins the revision up.
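
The "retry the first request" step can be sketched as a small loop. This is an illustration under stated assumptions, not the code from this PR: `retry` and `fake_wake` are hypothetical names, the attempt count and delay are arbitrary, and the real test issues a curl against the cluster-local host. The readiness wait that precedes it could be done with something like `kubectl wait --for=condition=Ready ksvc/<name>`, though the PR does not show the exact command.

```python
import time


def retry(request, attempts=10, delay=2.0, sleep=time.sleep):
    """Call `request` until it returns without raising, or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except Exception as exc:  # e.g. refused or timed out during cold start
            last_error = exc
            if attempt < attempts:
                sleep(delay)
    raise last_error


# Fake wake request: fails twice (revision still starting), then succeeds.
calls = {"n": 0}


def fake_wake():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("revision not ready")
    return 200


status = retry(fake_wake, delay=0, sleep=lambda seconds: None)
print(status, calls["n"])  # 200 3
```

Injecting `sleep` keeps the sketch testable without real waiting; a real wake test would use the default.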

Keep continue-on-error on the per-engine e2e legs: they bring up a real kind
cluster + Knative/KEDA and are subject to environmental flakiness (in this run
the keda leg died inside helm/kind-action itself), which should not turn the
build red.
@joy-software merged commit b958290 into main on Jun 7, 2026
9 checks passed
@joy-software deleted the fix/e2e-namespace-order branch on June 7, 2026 at 21:38