From 937fd5c4e9434ec6f227351c29b6eaf17a5ddcdb Mon Sep 17 00:00:00 2001
From: Davide Chiarelli
Date: Tue, 3 Mar 2026 10:28:01 +0100
Subject: [PATCH 1/7] feat: add pre-deploy job filtering and related resources

- Introduced a new pre-deploy filter to handle jobs annotated for pre-deploy execution.
- Added multiple test cases to validate the filtering logic for pre-deploy jobs.
- Created example ConfigMap and Deployment resources for testing purposes.
- Implemented various pre-deploy job configurations, including optional and failing jobs.
- Enhanced the test suite with comprehensive tests for pre-deploy job extraction and filtering.
---
 docs/10_overview.md                           |   1 +
 docs/60_pre_deploy.md                         | 121 ++++
 pkg/cmd/deploy/deploy.go                      |  82 ++-
 pkg/cmd/deploy/deploy_test.go                 |  24 +-
 pkg/cmd/deploy/predeploy.go                   | 263 ++++
 pkg/cmd/deploy/predeploy_test.go              | 624 ++++++++++++++++++
 .../deploy/testdata/pre-deploy/configmap.yaml |   6 +
 .../testdata/pre-deploy/deployment.yaml       |  32 +
 .../pre-deploy-job-custom-annotation.yaml     |  18 +
 .../pre-deploy/pre-deploy-job-failed.yaml     |  20 +
 .../pre-deploy-job-optional-failed.yaml       |  20 +
 .../pre-deploy/pre-deploy-job-optional.yaml   |  20 +
 .../testdata/pre-deploy/pre-deploy-job.yaml   |  18 +
 pkg/extensions/predeployfilter.go             | 106 +++
 pkg/extensions/predeployfilter_test.go        | 292 ++++
 .../testdata/filter/pre-deploy-job.yaml       |  18 +
 .../testdata/filter/regular-job.yaml          |  16 +
 17 files changed, 1650 insertions(+), 31 deletions(-)
 create mode 100644 docs/60_pre_deploy.md
 create mode 100644 pkg/cmd/deploy/predeploy.go
 create mode 100644 pkg/cmd/deploy/predeploy_test.go
 create mode 100644 pkg/cmd/deploy/testdata/pre-deploy/configmap.yaml
 create mode 100644 pkg/cmd/deploy/testdata/pre-deploy/deployment.yaml
 create mode 100644 pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-custom-annotation.yaml
 create mode 100644 pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-failed.yaml
 create mode 100644 pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml
 create mode 100644 pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml
 create mode 100644 pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job.yaml
 create mode 100644 pkg/extensions/predeployfilter.go
 create mode 100644 pkg/extensions/predeployfilter_test.go
 create mode 100644 pkg/extensions/testdata/filter/pre-deploy-job.yaml
 create mode 100644 pkg/extensions/testdata/filter/regular-job.yaml

diff --git a/docs/10_overview.md b/docs/10_overview.md
index f6835e0..cf1fc9a 100644
--- a/docs/10_overview.md
+++ b/docs/10_overview.md
@@ -39,3 +39,4 @@ Below, you can find additional documentation for `mlp`:
 - [Generation Configuration](./30_generate.md)
 - [Hydration Logic](./40_hydrate.md)
 - [Interpolatation Template](./50_interpolate.md)
+- [Pre-Deploy Jobs](./60_pre_deploy.md)
diff --git a/docs/60_pre_deploy.md b/docs/60_pre_deploy.md
new file mode 100644
index 0000000..7ca6825
--- /dev/null
+++ b/docs/60_pre_deploy.md
@@ -0,0 +1,121 @@
+# Pre-Deploy Jobs
+
+The `deploy` command supports executing Jobs marked as pre-deploy jobs before the main deployment pipeline runs.
+This feature allows you to run initialization or validation tasks (such as database migrations, secret provisioning,
+or health checks) prior to deploying your main workloads.
+
+## Identifying Pre-Deploy Jobs
+
+A Job is identified as a pre-deploy job by annotating it with the `mia-platform.eu/deploy` annotation and setting
+its value to `pre-deploy`. This annotation value can be customized via the `--pre-deploy-job-annotation` CLI flag.
+
+Example Job manifest:
+
+```yaml
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: db-migration-job
+  annotations:
+    mia-platform.eu/deploy: pre-deploy
+spec:
+  template:
+    spec:
+      containers:
+      - name: migrator
+        image: migrations:latest
+      restartPolicy: Never
+```
+
+## Pre-Deploy Job Execution Flow
+
+When the `deploy` command runs, it performs the following steps:
+
+1. **Extraction**: Pre-deploy jobs are extracted from the resource list based on the configured annotation.
+2. **Execution**: Each pre-deploy job is applied to the cluster in sequence.
+3. **Monitoring**: The command waits, with a configurable timeout, for each job to complete.
+4. **Retry Logic**: If a job fails, the command automatically retries the execution up to a maximum number of attempts.
+5. **Filtering**: Pre-deploy jobs are filtered out and do not participate in the main deployment phase.
+6. **Main Deployment**: After all pre-deploy jobs have completed successfully, the main deployment pipeline proceeds.
+
+## Configuration
+
+The behavior of pre-deploy jobs can be configured via the following CLI flags:
+
+### `--pre-deploy-job-timeout`
+
+The maximum duration to wait for a single pre-deploy job to complete.
+
+- **Type**: Duration
+- **Default**: `30s`
+- **Usage**: `mlp deploy --pre-deploy-job-timeout 2m`
+
+Non-zero values should contain a corresponding time unit (e.g., `1s`, `2m`, `3h`).
+
+### `--pre-deploy-job-max-retries`
+
+The maximum number of attempts for executing a pre-deploy job before aborting the deploy.
+
+- **Type**: Integer
+- **Default**: `3`
+- **Usage**: `mlp deploy --pre-deploy-job-max-retries 5`
+
+### `--pre-deploy-job-annotation`
+
+The value of the `mia-platform.eu/deploy` annotation used to identify pre-deploy jobs.
+This allows you to customize the annotation value used to mark jobs as pre-deploy jobs.
+
+- **Type**: String
+- **Default**: `pre-deploy`
+- **Usage**: `mlp deploy --pre-deploy-job-annotation init-job`
+
+## Optional Pre-Deploy Jobs
+
+Pre-deploy jobs can be marked as optional by setting the `mia-platform.eu/deploy-optional` annotation to `true`.
+Optional jobs that fail will log a warning but will not block the main deployment pipeline.
+
+Example:
+
+```yaml
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: optional-config-job
+  annotations:
+    mia-platform.eu/deploy: pre-deploy
+    mia-platform.eu/deploy-optional: "true"
+spec:
+  template:
+    spec:
+      containers:
+      - name: config
+        image: config-tool:latest
+      restartPolicy: Never
+```
+
+## Error Handling
+
+### Mandatory Pre-Deploy Job Failure
+
+If a mandatory pre-deploy job fails after all retry attempts are exhausted, the deploy process is aborted
+and an error is returned. This ensures that critical initialization tasks are completed before proceeding
+with the main deployment.
+
+### Optional Pre-Deploy Job Failure
+
+If an optional pre-deploy job fails, a warning is logged and the deploy process continues.
+This allows for graceful handling of non-critical pre-deployment tasks.
+
+## Dry-Run Mode
+
+When using the `--dry-run` flag, pre-deploy jobs are applied to the cluster with the dry-run option enabled.
+In this mode, job status polling is skipped, and the command does not wait for job completion.
+
+## Polling Behavior
+
+The command polls the status of each pre-deploy job every 5 seconds until:
+
+- The job completes successfully
+- The job fails or the attempt exceeds the configured timeout (the execution is then retried)
+- The maximum number of retries is reached
+
diff --git a/pkg/cmd/deploy/deploy.go b/pkg/cmd/deploy/deploy.go
index 67ee20d..255fa25 100644
--- a/pkg/cmd/deploy/deploy.go
+++ b/pkg/cmd/deploy/deploy.go
@@ -82,6 +82,17 @@ const (
 	waitFlagDefaultValue = true
 	waitFlagUsage        = "if true, wait for resources to be current before marking them as successfully applied"
 
+	preDeployJobTimeoutFlagName     = "pre-deploy-job-timeout"
+	preDeployJobTimeoutDefaultValue = 30 * time.Second
+	preDeployJobTimeoutFlagUsage    = "the length of time to wait before giving up on a single pre-deploy job execution. Non-zero values should contain a corresponding time unit (e.g.
1s, 2m, 3h)" + + preDeployJobMaxRetriesFlagName = "pre-deploy-job-max-retries" + preDeployJobMaxRetriesDefaultValue = 3 + preDeployJobMaxRetriesFlagUsage = "the maximum number of attempts for a pre-deploy job before aborting the deploy" + + preDeployJobAnnotationFlagName = "pre-deploy-job-annotation" + preDeployJobAnnotationFlagUsage = "the value of the mia-platform.eu/deploy annotation used to identify pre-deploy jobs" + stdinToken = "-" fieldManager = "mlp" inventoryName = "eu.mia-platform.mlp" @@ -97,25 +108,32 @@ var ( // Flags contains all the flags for the `deploy` command. They will be converted to Options // that contains all runtime options for the command. type Flags struct { - ConfigFlags *genericclioptions.ConfigFlags - inputPaths []string - deployType string - forceDeploy bool - ensureNamespace bool - timeout time.Duration - dryRun bool - wait bool + ConfigFlags *genericclioptions.ConfigFlags + inputPaths []string + deployType string + forceDeploy bool + ensureNamespace bool + timeout time.Duration + preDeployJobTimeout time.Duration + preDeployJobMaxRetries int + preDeployJobAnnotation string + dryRun bool + wait bool } // Options have the data required to perform the deploy operation type Options struct { - inputPaths []string - deployType string - forceDeploy bool - ensureNamespace bool - timeout time.Duration - dryRun bool - wait bool + inputPaths []string + deployType string + forceDeploy bool + ensureNamespace bool + timeout time.Duration + preDeployJobTimeout time.Duration + preDeployJobMaxRetries int + preDeployJobAnnotation string + preDeployPollingInterval time.Duration + dryRun bool + wait bool clientFactory util.ClientFactory clock clock.PassiveClock @@ -185,6 +203,9 @@ func (f *Flags) AddFlags(flags *pflag.FlagSet) { flags.DurationVar(&f.timeout, timeoutFlagName, timeoutDefaultValue, timeoutFlagUsage) flags.BoolVar(&f.dryRun, dryRunFlagName, dryRunDefaultValue, dryRunFlagUsage) flags.BoolVar(&f.wait, waitFlagName, waitFlagDefaultValue, 
waitFlagUsage) + flags.DurationVar(&f.preDeployJobTimeout, preDeployJobTimeoutFlagName, preDeployJobTimeoutDefaultValue, preDeployJobTimeoutFlagUsage) + flags.IntVar(&f.preDeployJobMaxRetries, preDeployJobMaxRetriesFlagName, preDeployJobMaxRetriesDefaultValue, preDeployJobMaxRetriesFlagUsage) + flags.StringVar(&f.preDeployJobAnnotation, preDeployJobAnnotationFlagName, extensions.PreDeployFilterDefaultValue, preDeployJobAnnotationFlagUsage) } // ToOptions transform the command flags in command runtime arguments @@ -194,12 +215,15 @@ func (f *Flags) ToOptions(reader io.Reader, writer io.Writer) (*Options, error) } return &Options{ - inputPaths: f.inputPaths, - deployType: f.deployType, - forceDeploy: f.forceDeploy, - ensureNamespace: f.ensureNamespace, - timeout: f.timeout, - wait: f.wait, + inputPaths: f.inputPaths, + deployType: f.deployType, + forceDeploy: f.forceDeploy, + ensureNamespace: f.ensureNamespace, + timeout: f.timeout, + preDeployJobTimeout: f.preDeployJobTimeout, + preDeployJobMaxRetries: f.preDeployJobMaxRetries, + preDeployJobAnnotation: f.preDeployJobAnnotation, + wait: f.wait, clientFactory: util.NewFactory(f.ConfigFlags), reader: reader, @@ -243,6 +267,20 @@ func (o *Options) Run(ctx context.Context) error { return err } + annotationValue := o.preDeployJobAnnotation + if annotationValue == "" { + annotationValue = extensions.PreDeployFilterDefaultValue + } + + preDeployJobs, resources := extensions.ExtractPreDeployJobs(resources, annotationValue) + if len(preDeployJobs) > 0 { + logger.V(3).Info("found pre-deploy jobs", "count", len(preDeployJobs)) + + if err := o.runPreDeployJobs(ctx, preDeployJobs, namespace); err != nil { + return fmt.Errorf("pre-deploy phase failed: %w", err) + } + } + if err := o.ensuringNamespace(ctx, namespace); err != nil { return nil } @@ -260,7 +298,7 @@ func (o *Options) Run(ctx context.Context) error { extensions.NewDeployMutator(o.deployType, o.forceDeploy, extensions.ChecksumFromData(deployIdentifier)), 
extensions.NewExternalSecretsMutator(resources), ). - WithFilters(extensions.NewDeployOnceFilter()). + WithFilters(extensions.NewDeployOnceFilter(), extensions.NewPreDeployFilter(annotationValue)). WithCustomStatusChecker(extensions.ExternalSecretStatusCheckers()). Build() if err != nil { diff --git a/pkg/cmd/deploy/deploy_test.go b/pkg/cmd/deploy/deploy_test.go index 318bacd..e0fa7e1 100644 --- a/pkg/cmd/deploy/deploy_test.go +++ b/pkg/cmd/deploy/deploy_test.go @@ -95,18 +95,24 @@ func TestOptions(t *testing.T) { configFlags := genericclioptions.NewConfigFlags(false) expectedOpts := &Options{ - inputPaths: []string{"input"}, - deployType: "smart_deploy", - reader: reader, - writer: buffer, - clientFactory: util.NewFactory(configFlags), - clock: clock.RealClock{}, - wait: false, + inputPaths: []string{"input"}, + deployType: "smart_deploy", + preDeployJobTimeout: 5 * time.Minute, + preDeployJobMaxRetries: 3, + preDeployJobAnnotation: "pre-deploy", + reader: reader, + writer: buffer, + clientFactory: util.NewFactory(configFlags), + clock: clock.RealClock{}, + wait: false, } flag := &Flags{ - inputPaths: []string{"input"}, - deployType: "smart_deploy", + inputPaths: []string{"input"}, + deployType: "smart_deploy", + preDeployJobTimeout: 5 * time.Minute, + preDeployJobMaxRetries: 3, + preDeployJobAnnotation: "pre-deploy", } _, err := flag.ToOptions(reader, buffer) assert.ErrorContains(t, err, "config flags are required") diff --git a/pkg/cmd/deploy/predeploy.go b/pkg/cmd/deploy/predeploy.go new file mode 100644 index 0000000..8023885 --- /dev/null +++ b/pkg/cmd/deploy/predeploy.go @@ -0,0 +1,263 @@ +// Copyright Mia srl +// SPDX-License-Identifier: Apache-2.0 +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package deploy
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"time"
+
+	"github.com/go-logr/logr"
+	"github.com/mia-platform/mlp/v2/pkg/extensions"
+	batchv1 "k8s.io/api/batch/v1"
+	"k8s.io/apimachinery/pkg/api/errors"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
+	"k8s.io/apimachinery/pkg/runtime"
+	"k8s.io/apimachinery/pkg/types"
+	"k8s.io/client-go/dynamic"
+)
+
+const (
+	preDeployPollingInterval = 5 * time.Second
+)
+
+// runPreDeployJobs applies each pre-deploy job to the cluster and waits for its completion.
+// Each job execution has its own timeout (preDeployJobTimeout). If a job fails, it is retried
+// up to preDeployJobMaxRetries times. If all retries are exhausted and the job is not marked
+// as optional (via the mia-platform.eu/deploy-optional annotation), the deploy process is aborted.
+// Optional jobs that fail are logged as warnings and do not block the deploy pipeline.
+// In dry-run mode, jobs are applied with the dry-run option and polling is skipped.
+func (o *Options) runPreDeployJobs(ctx context.Context, jobs []*unstructured.Unstructured, namespace string) error { + if len(jobs) == 0 { + return nil + } + + logger := logr.FromContextOrDiscard(ctx) + + pollingInterval := preDeployPollingInterval + if o.preDeployPollingInterval > 0 { + pollingInterval = o.preDeployPollingInterval + } + + maxRetries := o.preDeployJobMaxRetries + if maxRetries <= 0 { + maxRetries = 1 + } + + dynamicClient, err := o.clientFactory.DynamicClient() + if err != nil { + return fmt.Errorf("failed to create dynamic client for pre-deploy jobs: %w", err) + } + + mapper, err := o.clientFactory.ToRESTMapper() + if err != nil { + return fmt.Errorf("failed to create REST mapper for pre-deploy jobs: %w", err) + } + + jobGVK := batchv1.SchemeGroupVersion.WithKind("Job") + mapping, err := mapper.RESTMapping(jobGVK.GroupKind(), jobGVK.Version) + if err != nil { + return fmt.Errorf("failed to get REST mapping for Job: %w", err) + } + + jobResource := dynamicClient.Resource(mapping.Resource) + + for _, job := range jobs { + jobName := job.GetName() + jobNamespace := job.GetNamespace() + if jobNamespace == "" { + jobNamespace = namespace + } + + var lastErr error + for attempt := 1; attempt <= maxRetries; attempt++ { + logger.V(3).Info("running pre-deploy job", "name", jobName, "namespace", jobNamespace, "attempt", attempt) + fmt.Fprintf(o.writer, "pre-deploy job %s: starting (attempt %d/%d)\n", jobName, attempt, maxRetries) + + if err := o.applyPreDeployJob(ctx, jobResource, job, jobNamespace); err != nil { + return fmt.Errorf("pre-deploy job %s: failed to apply: %w", jobName, err) + } + + if o.dryRun { + fmt.Fprintf(o.writer, "pre-deploy job %s: applied (dry-run)\n", jobName) + lastErr = nil + break + } + + // Create a per-execution timeout context + execCtx := ctx + if o.preDeployJobTimeout > 0 { + var cancel context.CancelFunc + execCtx, cancel = context.WithTimeout(ctx, o.preDeployJobTimeout) + defer cancel() + } + + lastErr = 
o.waitForJobCompletion(execCtx, jobResource, jobName, jobNamespace, pollingInterval) + if lastErr == nil { + fmt.Fprintf(o.writer, "pre-deploy job %s: completed successfully\n", jobName) + break + } + + fmt.Fprintf(o.writer, "pre-deploy job %s: attempt %d/%d failed: %s\n", jobName, attempt, maxRetries, lastErr) + } + + if lastErr != nil { + if extensions.IsOptionalPreDeployJob(job) { + fmt.Fprintf(o.writer, "pre-deploy job %s: optional job failed after %d attempt(s), continuing: %s\n", jobName, maxRetries, lastErr) + continue + } + return fmt.Errorf("pre-deploy job %s: failed after %d attempt(s): %w", jobName, maxRetries, lastErr) + } + } + + return nil +} + +// applyPreDeployJob removes any existing job with the same name and applies the new one using server-side apply. +// It enforces backoffLimit=0 so the job fails immediately on the first pod failure without Kubernetes-level retries. +// Higher-level retries are handled by runPreDeployJobs. +func (o *Options) applyPreDeployJob(ctx context.Context, jobResource dynamic.NamespaceableResourceInterface, job *unstructured.Unstructured, namespace string) error { + logger := logr.FromContextOrDiscard(ctx) + + // Deep copy to avoid mutating the original object + job = job.DeepCopy() + // Force backoffLimit=0: on failure the job fails immediately and we handle retries ourselves + if err := unstructured.SetNestedField(job.Object, int64(0), "spec", "backoffLimit"); err != nil { + return fmt.Errorf("failed to set backoffLimit: %w", err) + } + + if !o.dryRun { + // Delete any existing job with the same name to ensure a clean run + propagationPolicy := metav1.DeletePropagationBackground + err := jobResource.Namespace(namespace).Delete(ctx, job.GetName(), metav1.DeleteOptions{ + PropagationPolicy: &propagationPolicy, + }) + if err != nil && !errors.IsNotFound(err) { + return fmt.Errorf("failed to delete existing job: %w", err) + } + + if err == nil { + logger.V(5).Info("deleted existing pre-deploy job", "name", job.GetName(), 
"namespace", namespace) + } + } + + data, err := json.Marshal(job.Object) + if err != nil { + return fmt.Errorf("failed to marshal job: %w", err) + } + + patchOpts := metav1.PatchOptions{ + FieldManager: fieldManager, + } + if o.dryRun { + patchOpts.DryRun = []string{metav1.DryRunAll} + } + + _, err = jobResource.Namespace(namespace).Patch(ctx, job.GetName(), types.ApplyPatchType, data, patchOpts) + if err != nil { + return err + } + + return nil +} + +// waitForJobCompletion polls the job status until it completes, fails, or the context times out. +// The status is checked immediately after calling this function, then at each polling interval. +func (o *Options) waitForJobCompletion(ctx context.Context, jobResource dynamic.NamespaceableResourceInterface, name, namespace string, pollingInterval time.Duration) error { + logger := logr.FromContextOrDiscard(ctx) + + ticker := time.NewTicker(pollingInterval) + defer ticker.Stop() + + for { + // Check status immediately on each iteration (including the first one) + logger.V(5).Info("polling pre-deploy job status", "name", name, "namespace", namespace) + + obj, err := jobResource.Namespace(namespace).Get(ctx, name, metav1.GetOptions{}) + if err != nil { + return fmt.Errorf("failed to get job status: %w", err) + } + + completed, errMsg, err := checkJobStatus(obj) + if err != nil { + return err + } + + if errMsg != "" { + return fmt.Errorf("job failed: %s", errMsg) + } + + if completed { + return nil + } + + fmt.Fprintf(o.writer, "pre-deploy job %s: waiting for completion\n", name) + + // Wait for next poll interval or context cancellation + select { + case <-ctx.Done(): + return fmt.Errorf("timed out waiting for job completion") + case <-ticker.C: + // continue to next check + } + } +} + +// checkJobStatus examines the job's status conditions to determine if it has completed or failed. 
+// Returns (completed, failMessage, error)
+func checkJobStatus(obj *unstructured.Unstructured) (bool, string, error) {
+	job := new(batchv1.Job)
+	if err := runtime.DefaultUnstructuredConverter.FromUnstructured(obj.Object, job); err != nil {
+		return false, "", fmt.Errorf("failed to convert job status: %w", err)
+	}
+
+	for _, condition := range job.Status.Conditions {
+		switch condition.Type {
+		case batchv1.JobComplete:
+			if condition.Status == "True" {
+				return true, "", nil
+			}
+		case batchv1.JobFailed:
+			if condition.Status == "True" {
+				msg := condition.Message
+				if msg == "" {
+					msg = condition.Reason
+				}
+				if msg == "" {
+					msg = "job execution failed"
+				}
+				return false, msg, nil
+			}
+		}
+	}
+
+	// Detect failure before the controller sets the Failed condition:
+	// if there are no active pods and the number of failures exceeds the backoff limit,
+	// the job has effectively failed
+	if job.Status.Active == 0 && job.Status.Failed > 0 {
+		backoffLimit := int32(6) // Kubernetes default
+		if job.Spec.BackoffLimit != nil {
+			backoffLimit = *job.Spec.BackoffLimit
+		}
+		if job.Status.Failed > backoffLimit {
+			return false, fmt.Sprintf("job has %d failed pod(s) exceeding backoff limit of %d", job.Status.Failed, backoffLimit), nil
+		}
+	}
+
+	return false, "", nil
+}
diff --git a/pkg/cmd/deploy/predeploy_test.go b/pkg/cmd/deploy/predeploy_test.go
new file mode 100644
index 0000000..c218154
--- /dev/null
+++ b/pkg/cmd/deploy/predeploy_test.go
@@ -0,0 +1,624 @@
+// Copyright Mia srl
+// SPDX-License-Identifier: Apache-2.0
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package deploy + +import ( + "fmt" + "strings" + "testing" + "time" + + jpltesting "github.com/mia-platform/jpl/pkg/testing" + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + batchv1 "k8s.io/api/batch/v1" + "k8s.io/apimachinery/pkg/api/errors" + "k8s.io/apimachinery/pkg/api/meta" + "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" + "k8s.io/apimachinery/pkg/runtime" + "k8s.io/apimachinery/pkg/runtime/schema" + dynamicfake "k8s.io/client-go/dynamic/fake" + k8stesting "k8s.io/client-go/testing" +) + +func TestCheckJobStatus(t *testing.T) { + t.Parallel() + + backoffLimitZero := int32(0) + + tests := map[string]struct { + job *batchv1.Job + expectedDone bool + expectedFailMsg string + expectedError string + }{ + "job without conditions is in progress": { + job: &batchv1.Job{ + Status: batchv1.JobStatus{}, + }, + expectedDone: false, + expectedFailMsg: "", + }, + "job with complete condition is done": { + job: &batchv1.Job{ + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + { + Type: batchv1.JobComplete, + Status: "True", + }, + }, + }, + }, + expectedDone: true, + expectedFailMsg: "", + }, + "job with failed condition returns failure message": { + job: &batchv1.Job{ + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + { + Type: batchv1.JobFailed, + Status: "True", + Message: "BackoffLimitExceeded", + }, + }, + }, + }, + expectedDone: false, + expectedFailMsg: "BackoffLimitExceeded", + }, + "job with failed condition and empty message uses reason": { + job: &batchv1.Job{ + Status: 
batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + { + Type: batchv1.JobFailed, + Status: "True", + Reason: "DeadlineExceeded", + }, + }, + }, + }, + expectedDone: false, + expectedFailMsg: "DeadlineExceeded", + }, + "job with failed condition and empty message and reason uses default": { + job: &batchv1.Job{ + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + { + Type: batchv1.JobFailed, + Status: "True", + }, + }, + }, + }, + expectedDone: false, + expectedFailMsg: "job execution failed", + }, + "job with incomplete condition is in progress": { + job: &batchv1.Job{ + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + { + Type: batchv1.JobComplete, + Status: "False", + }, + }, + }, + }, + expectedDone: false, + expectedFailMsg: "", + }, + "job with failed pods exceeding backoff limit detected early": { + job: &batchv1.Job{ + Spec: batchv1.JobSpec{ + BackoffLimit: &backoffLimitZero, + }, + Status: batchv1.JobStatus{ + Failed: 1, + Active: 0, + }, + }, + expectedDone: false, + expectedFailMsg: "job has 1 failed pod(s) exceeding backoff limit of 0", + }, + "job with failed pods but still active is in progress": { + job: &batchv1.Job{ + Status: batchv1.JobStatus{ + Failed: 1, + Active: 1, + }, + }, + expectedDone: false, + expectedFailMsg: "", + }, + } + + for name, test := range tests { + t.Run(name, func(t *testing.T) { + t.Parallel() + + unstrObj, err := runtime.DefaultUnstructuredConverter.ToUnstructured(test.job) + require.NoError(t, err) + obj := &unstructured.Unstructured{Object: unstrObj} + + done, failMsg, err := checkJobStatus(obj) + if len(test.expectedError) > 0 { + assert.ErrorContains(t, err, test.expectedError) + return + } + + assert.NoError(t, err) + assert.Equal(t, test.expectedDone, done) + assert.Equal(t, test.expectedFailMsg, failMsg) + }) + } +} + +func TestRunPreDeployJobs(t *testing.T) { + t.Parallel() + + namespace := "mlp-predeploy-test" + + tests := map[string]struct { + jobs 
[]*unstructured.Unstructured + dryRun bool + maxRetries int + setupClient func() *dynamicfake.FakeDynamicClient + expectedError string + expectedOutput []string + }{ + "no jobs does nothing": { + jobs: []*unstructured.Unstructured{}, + maxRetries: 3, + setupClient: func() *dynamicfake.FakeDynamicClient { + return dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + }, + }, + "dry-run applies job without polling": { + jobs: []*unstructured.Unstructured{makePreDeployJob("migration")}, + dryRun: true, + maxRetries: 3, + setupClient: func() *dynamicfake.FakeDynamicClient { + client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, makePreDeployJob("migration"), nil + }) + return client + }, + expectedOutput: []string{ + "pre-deploy job migration: starting (attempt 1/3)", + "pre-deploy job migration: applied (dry-run)", + }, + }, + "successful job completion on first attempt": { + jobs: []*unstructured.Unstructured{makePreDeployJob("migration")}, + maxRetries: 3, + setupClient: func() *dynamicfake.FakeDynamicClient { + client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + }) + client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, makePreDeployJob("migration"), nil + }) + client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, completedJobUnstructured("migration", namespace), nil + }) + return client + }, + expectedOutput: []string{ + "pre-deploy job migration: starting (attempt 1/3)", + "pre-deploy job migration: completed successfully", + }, + }, + "failed job exhausts all retries": { + jobs: 
[]*unstructured.Unstructured{makePreDeployJob("migration")}, + maxRetries: 3, + setupClient: func() *dynamicfake.FakeDynamicClient { + client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + }) + client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, makePreDeployJob("migration"), nil + }) + client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, failedJobUnstructured("migration", namespace, "BackoffLimitExceeded"), nil + }) + return client + }, + expectedError: "pre-deploy job migration: failed after 3 attempt(s): job failed: BackoffLimitExceeded", + expectedOutput: []string{ + "pre-deploy job migration: starting (attempt 1/3)", + "pre-deploy job migration: attempt 1/3 failed", + "pre-deploy job migration: starting (attempt 2/3)", + "pre-deploy job migration: attempt 2/3 failed", + "pre-deploy job migration: starting (attempt 3/3)", + "pre-deploy job migration: attempt 3/3 failed", + }, + }, + "failed job succeeds on retry": { + jobs: []*unstructured.Unstructured{makePreDeployJob("migration")}, + maxRetries: 3, + setupClient: func() *dynamicfake.FakeDynamicClient { + client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + }) + attemptCount := 0 + client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + attemptCount++ + return true, makePreDeployJob("migration"), nil + }) + client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, 
error) { + // First attempt fails, second succeeds + if attemptCount <= 1 { + return true, failedJobUnstructured("migration", namespace, "BackoffLimitExceeded"), nil + } + return true, completedJobUnstructured("migration", namespace), nil + }) + return client + }, + expectedOutput: []string{ + "pre-deploy job migration: starting (attempt 1/3)", + "pre-deploy job migration: attempt 1/3 failed", + "pre-deploy job migration: starting (attempt 2/3)", + "pre-deploy job migration: completed successfully", + }, + }, + "single retry allowed fails immediately": { + jobs: []*unstructured.Unstructured{makePreDeployJob("migration")}, + maxRetries: 1, + setupClient: func() *dynamicfake.FakeDynamicClient { + client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + }) + client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, makePreDeployJob("migration"), nil + }) + client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, failedJobUnstructured("migration", namespace, "BackoffLimitExceeded"), nil + }) + return client + }, + expectedError: "pre-deploy job migration: failed after 1 attempt(s): job failed: BackoffLimitExceeded", + expectedOutput: []string{ + "pre-deploy job migration: starting (attempt 1/1)", + "pre-deploy job migration: attempt 1/1 failed", + }, + }, + "timeout on single execution retries": { + jobs: []*unstructured.Unstructured{makePreDeployJob("migration")}, + maxRetries: 2, + setupClient: func() *dynamicfake.FakeDynamicClient { + client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, 
errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + }) + attemptCount := 0 + client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + attemptCount++ + return true, makePreDeployJob("migration"), nil + }) + client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + // First attempt always returns in-progress to force timeout, + // second attempt succeeds immediately + if attemptCount <= 1 { + return true, inProgressJobUnstructured("migration", namespace), nil + } + return true, completedJobUnstructured("migration", namespace), nil + }) + return client + }, + expectedOutput: []string{ + "pre-deploy job migration: starting (attempt 1/2)", + "pre-deploy job migration: attempt 1/2 failed: timed out waiting for job completion", + "pre-deploy job migration: starting (attempt 2/2)", + "pre-deploy job migration: completed successfully", + }, + }, + "apply failure returns error without retry": { + jobs: []*unstructured.Unstructured{makePreDeployJob("migration")}, + maxRetries: 3, + setupClient: func() *dynamicfake.FakeDynamicClient { + client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + }) + client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, fmt.Errorf("forbidden") + }) + return client + }, + expectedError: "pre-deploy job migration: failed to apply", + }, + "delete existing job before apply": { + jobs: []*unstructured.Unstructured{makePreDeployJob("migration")}, + maxRetries: 3, + setupClient: func() *dynamicfake.FakeDynamicClient { + client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + client.PrependReactor("delete", "jobs", func(action 
k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, nil + }) + client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, makePreDeployJob("migration"), nil + }) + client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, completedJobUnstructured("migration", namespace), nil + }) + return client + }, + expectedOutput: []string{ + "pre-deploy job migration: starting (attempt 1/3)", + "pre-deploy job migration: completed successfully", + }, + }, + "optional job failure does not block deploy": { + jobs: []*unstructured.Unstructured{makeOptionalPreDeployJob("optional-migration")}, + maxRetries: 2, + setupClient: func() *dynamicfake.FakeDynamicClient { + client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "optional-migration") + }) + client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, makeOptionalPreDeployJob("optional-migration"), nil + }) + client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, failedJobUnstructured("optional-migration", namespace, "BackoffLimitExceeded"), nil + }) + return client + }, + expectedOutput: []string{ + "pre-deploy job optional-migration: starting (attempt 1/2)", + "pre-deploy job optional-migration: attempt 1/2 failed", + "pre-deploy job optional-migration: starting (attempt 2/2)", + "pre-deploy job optional-migration: attempt 2/2 failed", + "pre-deploy job optional-migration: optional job failed after 2 attempt(s), continuing", + }, + }, + "optional job followed by required job continues on optional failure": { + jobs: []*unstructured.Unstructured{ + 
makeOptionalPreDeployJob("optional-migration"), + makePreDeployJob("required-migration"), + }, + maxRetries: 1, + setupClient: func() *dynamicfake.FakeDynamicClient { + client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "job") + }) + client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + patchAction := action.(k8stesting.PatchAction) + name := patchAction.GetName() + if name == "optional-migration" { + return true, makeOptionalPreDeployJob(name), nil + } + return true, makePreDeployJob(name), nil + }) + client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + getAction := action.(k8stesting.GetAction) + name := getAction.GetName() + if name == "optional-migration" { + return true, failedJobUnstructured(name, namespace, "BackoffLimitExceeded"), nil + } + return true, completedJobUnstructured(name, namespace), nil + }) + return client + }, + expectedOutput: []string{ + "pre-deploy job optional-migration: starting (attempt 1/1)", + "pre-deploy job optional-migration: optional job failed after 1 attempt(s), continuing", + "pre-deploy job required-migration: starting (attempt 1/1)", + "pre-deploy job required-migration: completed successfully", + }, + }, + "optional job success proceeds normally": { + jobs: []*unstructured.Unstructured{makeOptionalPreDeployJob("optional-migration")}, + maxRetries: 3, + setupClient: func() *dynamicfake.FakeDynamicClient { + client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) + client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "optional-migration") + }) + 
client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, makeOptionalPreDeployJob("optional-migration"), nil + }) + client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, completedJobUnstructured("optional-migration", namespace), nil + }) + return client + }, + expectedOutput: []string{ + "pre-deploy job optional-migration: starting (attempt 1/3)", + "pre-deploy job optional-migration: completed successfully", + }, + }, + } + + for name, test := range tests { + t.Run(name, func(t *testing.T) { + t.Parallel() + + stringBuilder := new(strings.Builder) + + tf := jpltesting.NewTestClientFactory(). + WithNamespace(namespace) + + if test.setupClient != nil { + tf.FakeDynamicClient = test.setupClient() + } + + // Add batch/v1 Job mapping to the REST mapper + batchGV := batchv1.SchemeGroupVersion + batchMapper := meta.NewDefaultRESTMapper([]schema.GroupVersion{batchGV}) + batchMapper.AddSpecific( + batchGV.WithKind("Job"), + batchGV.WithResource("jobs"), + batchGV.WithResource("job"), + meta.RESTScopeNamespace, + ) + tf.RESTMapper = batchMapper + + // Use a short timeout for tests + ctx := t.Context() + + options := &Options{ + dryRun: test.dryRun, + writer: stringBuilder, + clientFactory: tf, + preDeployPollingInterval: 100 * time.Millisecond, + preDeployJobTimeout: 500 * time.Millisecond, + preDeployJobMaxRetries: test.maxRetries, + } + + err := options.runPreDeployJobs(ctx, test.jobs, namespace) + output := stringBuilder.String() + t.Log(output) + + switch len(test.expectedError) { + case 0: + require.NoError(t, err) + default: + assert.ErrorContains(t, err, test.expectedError) + } + + for _, expected := range test.expectedOutput { + assert.Contains(t, output, expected) + } + }) + } +} + +func makePreDeployJob(name string) *unstructured.Unstructured { + return &unstructured.Unstructured{ + Object: map[string]interface{}{ + "apiVersion": "batch/v1", + 
"kind": "Job", + "metadata": map[string]interface{}{ + "name": name, + "annotations": map[string]interface{}{ + "mia-platform.eu/deploy": "pre-deploy", + }, + }, + "spec": map[string]interface{}{ + "template": map[string]interface{}{ + "spec": map[string]interface{}{ + "containers": []interface{}{ + map[string]interface{}{ + "name": "migration", + "image": "busybox", + "args": []interface{}{"/bin/sh", "-c", "echo running"}, + }, + }, + "restartPolicy": "Never", + }, + }, + }, + }, + } +} + +func makeOptionalPreDeployJob(name string) *unstructured.Unstructured { + return &unstructured.Unstructured{ + Object: map[string]interface{}{ + "apiVersion": "batch/v1", + "kind": "Job", + "metadata": map[string]interface{}{ + "name": name, + "annotations": map[string]interface{}{ + "mia-platform.eu/deploy": "pre-deploy", + "mia-platform.eu/deploy-optional": "true", + }, + }, + "spec": map[string]interface{}{ + "template": map[string]interface{}{ + "spec": map[string]interface{}{ + "containers": []interface{}{ + map[string]interface{}{ + "name": "migration", + "image": "busybox", + "args": []interface{}{"/bin/sh", "-c", "echo running"}, + }, + }, + "restartPolicy": "Never", + }, + }, + }, + }, + } +} + +func completedJobUnstructured(name, namespace string) *unstructured.Unstructured { + return &unstructured.Unstructured{ + Object: map[string]interface{}{ + "apiVersion": "batch/v1", + "kind": "Job", + "metadata": map[string]interface{}{ + "name": name, + "namespace": namespace, + }, + "status": map[string]interface{}{ + "conditions": []interface{}{ + map[string]interface{}{ + "type": string(batchv1.JobComplete), + "status": "True", + }, + }, + }, + }, + } +} + +func failedJobUnstructured(name, namespace, message string) *unstructured.Unstructured { + return &unstructured.Unstructured{ + Object: map[string]interface{}{ + "apiVersion": "batch/v1", + "kind": "Job", + "metadata": map[string]interface{}{ + "name": name, + "namespace": namespace, + }, + "status": 
map[string]interface{}{ + "conditions": []interface{}{ + map[string]interface{}{ + "type": string(batchv1.JobFailed), + "status": "True", + "message": message, + }, + }, + }, + }, + } +} + +func inProgressJobUnstructured(name, namespace string) *unstructured.Unstructured { + return &unstructured.Unstructured{ + Object: map[string]interface{}{ + "apiVersion": "batch/v1", + "kind": "Job", + "metadata": map[string]interface{}{ + "name": name, + "namespace": namespace, + }, + "status": map[string]interface{}{}, + }, + } +} diff --git a/pkg/cmd/deploy/testdata/pre-deploy/configmap.yaml b/pkg/cmd/deploy/testdata/pre-deploy/configmap.yaml new file mode 100644 index 0000000..f06e98f --- /dev/null +++ b/pkg/cmd/deploy/testdata/pre-deploy/configmap.yaml @@ -0,0 +1,6 @@ +apiVersion: v1 +kind: ConfigMap +metadata: + name: example +data: + key: value diff --git a/pkg/cmd/deploy/testdata/pre-deploy/deployment.yaml b/pkg/cmd/deploy/testdata/pre-deploy/deployment.yaml new file mode 100644 index 0000000..a71a886 --- /dev/null +++ b/pkg/cmd/deploy/testdata/pre-deploy/deployment.yaml @@ -0,0 +1,32 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: example +spec: + selector: + matchLabels: + app: example + template: + metadata: + annotations: + mia-platform.eu/dependencies-checksum: predefined-value + labels: + app: example + spec: + containers: + - name: example + image: nginx:latest + resources: + limits: + memory: "128Mi" + cpu: "500m" + env: + - name: ENV + valueFrom: + configMapKeyRef: + key: key + name: example + volumes: + - name: example + configMap: + name: example diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-custom-annotation.yaml b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-custom-annotation.yaml new file mode 100644 index 0000000..2727fff --- /dev/null +++ b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-custom-annotation.yaml @@ -0,0 +1,18 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: pre-deploy-custom-migration + 
annotations: + mia-platform.eu/deploy: custom-pre-deploy +spec: + template: + metadata: {} + spec: + containers: + - name: migration + image: busybox + args: + - /bin/sh + - -c + - echo "running custom annotation migration" + restartPolicy: Never diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-failed.yaml b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-failed.yaml new file mode 100644 index 0000000..c4e9e08 --- /dev/null +++ b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-failed.yaml @@ -0,0 +1,20 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: pre-deploy-migration-fail + annotations: + mia-platform.eu/deploy: pre-deploy +spec: + # backoffLimit is also enforced to 0 by mlp at apply time + backoffLimit: 0 + template: + metadata: {} + spec: + containers: + - name: migration + image: busybox + args: + - /bin/sh + - -c + - echo "migration failed!" && exit 1 + restartPolicy: Never diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml new file mode 100644 index 0000000..e4a6411 --- /dev/null +++ b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml @@ -0,0 +1,20 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: pre-deploy-optional-seed-fail + annotations: + mia-platform.eu/deploy: pre-deploy + mia-platform.eu/deploy-optional: "true" +spec: + backoffLimit: 0 + template: + metadata: {} + spec: + containers: + - name: seed + image: busybox + args: + - /bin/sh + - -c + - echo "optional seed failed!" 
&& exit 1 + restartPolicy: Never diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml new file mode 100644 index 0000000..d080170 --- /dev/null +++ b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml @@ -0,0 +1,20 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: pre-deploy-optional-seed + annotations: + mia-platform.eu/deploy: pre-deploy + mia-platform.eu/deploy-optional: "true" +spec: + backoffLimit: 0 + template: + metadata: {} + spec: + containers: + - name: seed + image: busybox + args: + - /bin/sh + - -c + - echo "optional seed completed" + restartPolicy: Never diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job.yaml b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job.yaml new file mode 100644 index 0000000..ad8035f --- /dev/null +++ b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job.yaml @@ -0,0 +1,18 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: pre-deploy-migration + annotations: + mia-platform.eu/deploy: pre-deploy +spec: + template: + metadata: {} + spec: + containers: + - name: migration + image: busybox + args: + - /bin/sh + - -c + - echo "running migration" + restartPolicy: Never diff --git a/pkg/extensions/predeployfilter.go b/pkg/extensions/predeployfilter.go new file mode 100644 index 0000000..e4a0d0f --- /dev/null +++ b/pkg/extensions/predeployfilter.go @@ -0,0 +1,106 @@ +// Copyright Mia srl +// SPDX-License-Identifier: Apache-2.0 +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+// See the License for the specific language governing permissions and +// limitations under the License. + +package extensions + +import ( + "reflect" + + "github.com/mia-platform/jpl/pkg/client/cache" + "github.com/mia-platform/jpl/pkg/filter" + batchv1 "k8s.io/api/batch/v1" + "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" + "k8s.io/apimachinery/pkg/runtime/schema" +) + +const ( + // PreDeployFilterDefaultValue is the default value for the deploy annotation that identifies + // a Job as a pre-deploy job. It can be overridden via the --pre-deploy-job-annotation CLI flag. + PreDeployFilterDefaultValue = "pre-deploy" + + // preDeployOptionalAnnotation is the annotation used to mark a pre-deploy job as optional. + // If set to "true", a failure of this job will not block the deploy pipeline. + preDeployOptionalAnnotation = miaPlatformPrefix + "deploy-optional" +) + +var ( + jobGK = schema.GroupKind{Group: batchv1.SchemeGroupVersion.Group, Kind: reflect.TypeOf(batchv1.Job{}).Name()} +) + +// preDeployFilter implements a filter that removes a Job if it has a matching +// value in the deployFilterAnnotation. These jobs are meant to be executed before +// the main deploy and are handled separately. +type preDeployFilter struct { + annotationValue string +} + +// NewPreDeployFilter returns a new filter for intercepting pre-deploy jobs and removing them +// from the main deploy pipeline. The annotationValue parameter specifies the value that the +// deploy annotation must match to identify a pre-deploy job. +func NewPreDeployFilter(annotationValue string) filter.Interface { + return &preDeployFilter{annotationValue: annotationValue} +} + +// Filter implements the filter.Interface interface +func (f *preDeployFilter) Filter(obj *unstructured.Unstructured, _ cache.RemoteResourceGetter) (bool, error) { + return IsPreDeployJob(obj, f.annotationValue), nil +} + +// IsPreDeployJob returns true if the object is a Job annotated for pre-deploy execution. 
+// The annotationValue parameter specifies the value that the deploy annotation must match. +func IsPreDeployJob(obj *unstructured.Unstructured, annotationValue string) bool { + if obj.GroupVersionKind().GroupKind() != jobGK { + return false + } + + annotations := obj.GetAnnotations() + if annotations == nil { + return false + } + + return annotations[deployFilterAnnotation] == annotationValue +} + +// IsOptionalPreDeployJob returns true if the object has the optional annotation set to "true". +// Optional pre-deploy jobs do not block the deploy pipeline if they fail. +func IsOptionalPreDeployJob(obj *unstructured.Unstructured) bool { + annotations := obj.GetAnnotations() + if annotations == nil { + return false + } + + return annotations[preDeployOptionalAnnotation] == "true" +} + +// ExtractPreDeployJobs separates pre-deploy jobs from the resources list, returning +// the pre-deploy jobs and the remaining resources separately. The annotationValue parameter +// specifies the value that the deploy annotation must match to identify a pre-deploy job. 
+func ExtractPreDeployJobs(resources []*unstructured.Unstructured, annotationValue string) ([]*unstructured.Unstructured, []*unstructured.Unstructured) { + preDeployJobs := make([]*unstructured.Unstructured, 0) + remaining := make([]*unstructured.Unstructured, 0, len(resources)) + + for _, res := range resources { + if IsPreDeployJob(res, annotationValue) { + preDeployJobs = append(preDeployJobs, res) + } else { + remaining = append(remaining, res) + } + } + + return preDeployJobs, remaining +} + +// compile-time check that preDeployFilter correctly implements the filter.Interface interface +var _ filter.Interface = &preDeployFilter{} diff --git a/pkg/extensions/predeployfilter_test.go b/pkg/extensions/predeployfilter_test.go new file mode 100644 index 0000000..4149332 --- /dev/null +++ b/pkg/extensions/predeployfilter_test.go @@ -0,0 +1,292 @@ +// Copyright Mia srl +// SPDX-License-Identifier: Apache-2.0 +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package extensions + +import ( + "path/filepath" + "testing" + + jpltesting "github.com/mia-platform/jpl/pkg/testing" + "github.com/stretchr/testify/assert" + "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" +) + +func TestPreDeployFilter(t *testing.T) { + t.Parallel() + + testdata := filepath.Join("testdata", "filter") + preDeployJob := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "pre-deploy-job.yaml")) + regularJob := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "regular-job.yaml")) + + tests := map[string]struct { + object *unstructured.Unstructured + annotationValue string + expected bool + }{ + "no filtering for deployment": { + object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "deployment.yaml")), + annotationValue: PreDeployFilterDefaultValue, + expected: false, + }, + "no filtering for configmap": { + object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "configmap.yaml")), + annotationValue: PreDeployFilterDefaultValue, + expected: false, + }, + "no filtering for secret": { + object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "secret.yaml")), + annotationValue: PreDeployFilterDefaultValue, + expected: false, + }, + "no filtering for job without pre-deploy annotation": { + object: regularJob, + annotationValue: PreDeployFilterDefaultValue, + expected: false, + }, + "filtering for job with pre-deploy annotation": { + object: preDeployJob, + annotationValue: PreDeployFilterDefaultValue, + expected: true, + }, + "no filtering when annotation value does not match": { + object: preDeployJob, + annotationValue: "custom-value", + expected: false, + }, + } + + for name, test := range tests { + t.Run(name, func(t *testing.T) { + t.Parallel() + + filter := NewPreDeployFilter(test.annotationValue) + filtered, err := filter.Filter(test.object, nil) + assert.NoError(t, err) + assert.Equal(t, test.expected, filtered) + }) + } +} + +func TestIsPreDeployJob(t *testing.T) { + t.Parallel() + + 
testdata := filepath.Join("testdata", "filter") + + tests := map[string]struct { + object *unstructured.Unstructured + annotationValue string + expected bool + }{ + "deployment is not a pre-deploy job": { + object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "deployment.yaml")), + annotationValue: PreDeployFilterDefaultValue, + expected: false, + }, + "regular job is not a pre-deploy job": { + object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "regular-job.yaml")), + annotationValue: PreDeployFilterDefaultValue, + expected: false, + }, + "job with pre-deploy annotation is a pre-deploy job": { + object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "pre-deploy-job.yaml")), + annotationValue: PreDeployFilterDefaultValue, + expected: true, + }, + "job with custom annotation value matches": { + object: &unstructured.Unstructured{ + Object: map[string]interface{}{ + "apiVersion": "batch/v1", + "kind": "Job", + "metadata": map[string]interface{}{ + "name": "custom-job", + "annotations": map[string]interface{}{ + "mia-platform.eu/deploy": "custom-pre-deploy", + }, + }, + }, + }, + annotationValue: "custom-pre-deploy", + expected: true, + }, + "job with default annotation does not match custom value": { + object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "pre-deploy-job.yaml")), + annotationValue: "custom-pre-deploy", + expected: false, + }, + } + + for name, test := range tests { + t.Run(name, func(t *testing.T) { + t.Parallel() + + assert.Equal(t, test.expected, IsPreDeployJob(test.object, test.annotationValue)) + }) + } +} + +func TestIsOptionalPreDeployJob(t *testing.T) { + t.Parallel() + + tests := map[string]struct { + object *unstructured.Unstructured + expected bool + }{ + "job without optional annotation is not optional": { + object: &unstructured.Unstructured{ + Object: map[string]interface{}{ + "apiVersion": "batch/v1", + "kind": "Job", + "metadata": map[string]interface{}{ + "name": "migration", + 
"annotations": map[string]interface{}{ + "mia-platform.eu/deploy": "pre-deploy", + }, + }, + }, + }, + expected: false, + }, + "job with optional annotation set to true is optional": { + object: &unstructured.Unstructured{ + Object: map[string]interface{}{ + "apiVersion": "batch/v1", + "kind": "Job", + "metadata": map[string]interface{}{ + "name": "migration", + "annotations": map[string]interface{}{ + "mia-platform.eu/deploy": "pre-deploy", + "mia-platform.eu/deploy-optional": "true", + }, + }, + }, + }, + expected: true, + }, + "job with optional annotation set to false is not optional": { + object: &unstructured.Unstructured{ + Object: map[string]interface{}{ + "apiVersion": "batch/v1", + "kind": "Job", + "metadata": map[string]interface{}{ + "name": "migration", + "annotations": map[string]interface{}{ + "mia-platform.eu/deploy": "pre-deploy", + "mia-platform.eu/deploy-optional": "false", + }, + }, + }, + }, + expected: false, + }, + "job without annotations is not optional": { + object: &unstructured.Unstructured{ + Object: map[string]interface{}{ + "apiVersion": "batch/v1", + "kind": "Job", + "metadata": map[string]interface{}{ + "name": "migration", + }, + }, + }, + expected: false, + }, + } + + for name, test := range tests { + t.Run(name, func(t *testing.T) { + t.Parallel() + + assert.Equal(t, test.expected, IsOptionalPreDeployJob(test.object)) + }) + } +} + +func TestExtractPreDeployJobs(t *testing.T) { + t.Parallel() + + testdata := filepath.Join("testdata", "filter") + preDeployJob := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "pre-deploy-job.yaml")) + regularJob := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "regular-job.yaml")) + deployment := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "deployment.yaml")) + configmap := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "configmap.yaml")) + + customJob := &unstructured.Unstructured{ + Object: map[string]interface{}{ + "apiVersion": "batch/v1", + 
"kind": "Job", + "metadata": map[string]interface{}{ + "name": "custom-job", + "annotations": map[string]interface{}{ + "mia-platform.eu/deploy": "custom-value", + }, + }, + }, + } + + tests := map[string]struct { + resources []*unstructured.Unstructured + annotationValue string + expectedPreDeployJobs int + expectedRemaining int + }{ + "empty resources": { + resources: []*unstructured.Unstructured{}, + annotationValue: PreDeployFilterDefaultValue, + expectedPreDeployJobs: 0, + expectedRemaining: 0, + }, + "no pre-deploy jobs": { + resources: []*unstructured.Unstructured{deployment, configmap, regularJob}, + annotationValue: PreDeployFilterDefaultValue, + expectedPreDeployJobs: 0, + expectedRemaining: 3, + }, + "only pre-deploy jobs": { + resources: []*unstructured.Unstructured{preDeployJob}, + annotationValue: PreDeployFilterDefaultValue, + expectedPreDeployJobs: 1, + expectedRemaining: 0, + }, + "mixed resources with pre-deploy jobs": { + resources: []*unstructured.Unstructured{deployment, preDeployJob, configmap, regularJob}, + annotationValue: PreDeployFilterDefaultValue, + expectedPreDeployJobs: 1, + expectedRemaining: 3, + }, + "custom annotation value extracts correct jobs": { + resources: []*unstructured.Unstructured{deployment, preDeployJob, customJob, regularJob}, + annotationValue: "custom-value", + expectedPreDeployJobs: 1, + expectedRemaining: 3, + }, + "default annotation value does not match custom jobs": { + resources: []*unstructured.Unstructured{customJob, regularJob}, + annotationValue: PreDeployFilterDefaultValue, + expectedPreDeployJobs: 0, + expectedRemaining: 2, + }, + } + + for name, test := range tests { + t.Run(name, func(t *testing.T) { + t.Parallel() + + preDeployJobs, remaining := ExtractPreDeployJobs(test.resources, test.annotationValue) + assert.Len(t, preDeployJobs, test.expectedPreDeployJobs) + assert.Len(t, remaining, test.expectedRemaining) + }) + } +} diff --git a/pkg/extensions/testdata/filter/pre-deploy-job.yaml 
b/pkg/extensions/testdata/filter/pre-deploy-job.yaml new file mode 100644 index 0000000..f7fdc5f --- /dev/null +++ b/pkg/extensions/testdata/filter/pre-deploy-job.yaml @@ -0,0 +1,18 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: pre-deploy-migration + annotations: + mia-platform.eu/deploy: pre-deploy +spec: + template: + metadata: {} + spec: + containers: + - name: migration + image: busybox + args: + - /bin/sh + - -c + - echo "running failing migration" && exit 1 + restartPolicy: Never diff --git a/pkg/extensions/testdata/filter/regular-job.yaml b/pkg/extensions/testdata/filter/regular-job.yaml new file mode 100644 index 0000000..cf95f55 --- /dev/null +++ b/pkg/extensions/testdata/filter/regular-job.yaml @@ -0,0 +1,16 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: regular-job +spec: + template: + metadata: {} + spec: + containers: + - name: worker + image: busybox + args: + - /bin/sh + - -c + - echo "working" + restartPolicy: Never From 36b0a58efdffaa376ef6162228196dd59790ae57 Mon Sep 17 00:00:00 2001 From: Davide Chiarelli Date: Tue, 3 Mar 2026 11:01:09 +0100 Subject: [PATCH 2/7] fix(pre-commit): solve issues raised by pre-commit checks --- pkg/cmd/deploy/deploy.go | 11 +++-- pkg/cmd/deploy/predeploy.go | 79 +++++++++++++++++--------------- pkg/cmd/deploy/predeploy_test.go | 38 +++++++-------- 3 files changed, 69 insertions(+), 59 deletions(-) diff --git a/pkg/cmd/deploy/deploy.go b/pkg/cmd/deploy/deploy.go index 255fa25..7b3cae5 100644 --- a/pkg/cmd/deploy/deploy.go +++ b/pkg/cmd/deploy/deploy.go @@ -337,13 +337,16 @@ loop: return nil } + return errors.New(formatApplyErrors(errorsDuringApplying)) +} + +func formatApplyErrors(errs []error) string { builder := new(strings.Builder) - fmt.Fprintf(builder, "applying process has encountered %d error(s):\n", len(errorsDuringApplying)) - for _, err := range errorsDuringApplying { + fmt.Fprintf(builder, "applying process has encountered %d error(s):\n", len(errs)) + for _, err := range errs { 
fmt.Fprintf(builder, "\t- %s\n", err) } - - return errors.New(builder.String()) + return builder.String() } func deployTypeFlagCompletionfunc(*cobra.Command, []string, string) ([]string, cobra.ShellCompDirective) { diff --git a/pkg/cmd/deploy/predeploy.go b/pkg/cmd/deploy/predeploy.go index 8023885..fb67b62 100644 --- a/pkg/cmd/deploy/predeploy.go +++ b/pkg/cmd/deploy/predeploy.go @@ -18,18 +18,20 @@ package deploy import ( "context" "encoding/json" + "errors" "fmt" "time" "github.com/go-logr/logr" - "github.com/mia-platform/mlp/v2/pkg/extensions" batchv1 "k8s.io/api/batch/v1" - "k8s.io/apimachinery/pkg/api/errors" + k8serrors "k8s.io/apimachinery/pkg/api/errors" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" "k8s.io/apimachinery/pkg/runtime" "k8s.io/apimachinery/pkg/types" "k8s.io/client-go/dynamic" + + "github.com/mia-platform/mlp/v2/pkg/extensions" ) const ( @@ -84,38 +86,7 @@ func (o *Options) runPreDeployJobs(ctx context.Context, jobs []*unstructured.Uns jobNamespace = namespace } - var lastErr error - for attempt := 1; attempt <= maxRetries; attempt++ { - logger.V(3).Info("running pre-deploy job", "name", jobName, "namespace", jobNamespace, "attempt", attempt) - fmt.Fprintf(o.writer, "pre-deploy job %s: starting (attempt %d/%d)\n", jobName, attempt, maxRetries) - - if err := o.applyPreDeployJob(ctx, jobResource, job, jobNamespace); err != nil { - return fmt.Errorf("pre-deploy job %s: failed to apply: %w", jobName, err) - } - - if o.dryRun { - fmt.Fprintf(o.writer, "pre-deploy job %s: applied (dry-run)\n", jobName) - lastErr = nil - break - } - - // Create a per-execution timeout context - execCtx := ctx - if o.preDeployJobTimeout > 0 { - var cancel context.CancelFunc - execCtx, cancel = context.WithTimeout(ctx, o.preDeployJobTimeout) - defer cancel() - } - - lastErr = o.waitForJobCompletion(execCtx, jobResource, jobName, jobNamespace, pollingInterval) - if lastErr == nil { - fmt.Fprintf(o.writer, "pre-deploy 
job %s: completed successfully\n", jobName) - break - } - - fmt.Fprintf(o.writer, "pre-deploy job %s: attempt %d/%d failed: %s\n", jobName, attempt, maxRetries, lastErr) - } - + lastErr := o.executePreDeployJobWithRetries(ctx, jobResource, job, jobName, jobNamespace, maxRetries, pollingInterval, logger) if lastErr != nil { if extensions.IsOptionalPreDeployJob(job) { fmt.Fprintf(o.writer, "pre-deploy job %s: optional job failed after %d attempt(s), continuing: %s\n", jobName, maxRetries, lastErr) @@ -128,6 +99,42 @@ func (o *Options) runPreDeployJobs(ctx context.Context, jobs []*unstructured.Uns return nil } +// executePreDeployJobWithRetries runs a single pre-deploy job with the retry logic. +// Returns the last error if all retries failed, or nil on success. +func (o *Options) executePreDeployJobWithRetries(ctx context.Context, jobResource dynamic.NamespaceableResourceInterface, job *unstructured.Unstructured, jobName, jobNamespace string, maxRetries int, pollingInterval time.Duration, logger logr.Logger) error { + var lastErr error + for attempt := 1; attempt <= maxRetries; attempt++ { + logger.V(3).Info("running pre-deploy job", "name", jobName, "namespace", jobNamespace, "attempt", attempt) + fmt.Fprintf(o.writer, "pre-deploy job %s: starting (attempt %d/%d)\n", jobName, attempt, maxRetries) + + if err := o.applyPreDeployJob(ctx, jobResource, job, jobNamespace); err != nil { + return fmt.Errorf("pre-deploy job %s: failed to apply: %w", jobName, err) + } + + if o.dryRun { + fmt.Fprintf(o.writer, "pre-deploy job %s: applied (dry-run)\n", jobName) + return nil + } + + // Create a per-execution timeout context + execCtx := ctx + if o.preDeployJobTimeout > 0 { + var cancel context.CancelFunc + execCtx, cancel = context.WithTimeout(ctx, o.preDeployJobTimeout) + defer cancel() + } + + lastErr = o.waitForJobCompletion(execCtx, jobResource, jobName, jobNamespace, pollingInterval) + if lastErr == nil { + fmt.Fprintf(o.writer, "pre-deploy job %s: completed 
successfully\n", jobName) + return nil + } + + fmt.Fprintf(o.writer, "pre-deploy job %s: attempt %d/%d failed: %s\n", jobName, attempt, maxRetries, lastErr) + } + return lastErr +} + // applyPreDeployJob removes any existing job with the same name and applies the new one using server-side apply. // It enforces backoffLimit=0 so the job fails immediately on the first pod failure without Kubernetes-level retries. // Higher-level retries are handled by runPreDeployJobs. @@ -147,7 +154,7 @@ func (o *Options) applyPreDeployJob(ctx context.Context, jobResource dynamic.Nam err := jobResource.Namespace(namespace).Delete(ctx, job.GetName(), metav1.DeleteOptions{ PropagationPolicy: &propagationPolicy, }) - if err != nil && !errors.IsNotFound(err) { + if err != nil && !k8serrors.IsNotFound(err) { return fmt.Errorf("failed to delete existing job: %w", err) } @@ -211,7 +218,7 @@ func (o *Options) waitForJobCompletion(ctx context.Context, jobResource dynamic. // Wait for next poll interval or context cancellation select { case <-ctx.Done(): - return fmt.Errorf("timed out waiting for job completion") + return errors.New("timed out waiting for job completion") case <-ticker.C: // continue to next check } diff --git a/pkg/cmd/deploy/predeploy_test.go b/pkg/cmd/deploy/predeploy_test.go index c218154..196b283 100644 --- a/pkg/cmd/deploy/predeploy_test.go +++ b/pkg/cmd/deploy/predeploy_test.go @@ -16,7 +16,7 @@ package deploy import ( - "fmt" + "errors" "strings" "testing" "time" @@ -25,7 +25,7 @@ import ( "github.com/stretchr/testify/assert" "github.com/stretchr/testify/require" batchv1 "k8s.io/api/batch/v1" - "k8s.io/apimachinery/pkg/api/errors" + k8serrors "k8s.io/apimachinery/pkg/api/errors" "k8s.io/apimachinery/pkg/api/meta" "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" "k8s.io/apimachinery/pkg/runtime" @@ -212,7 +212,7 @@ func TestRunPreDeployJobs(t *testing.T) { setupClient: func() *dynamicfake.FakeDynamicClient { client := 
dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") }) client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { return true, makePreDeployJob("migration"), nil @@ -233,13 +233,13 @@ func TestRunPreDeployJobs(t *testing.T) { setupClient: func() *dynamicfake.FakeDynamicClient { client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") }) client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { return true, makePreDeployJob("migration"), nil }) client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, failedJobUnstructured("migration", namespace, "BackoffLimitExceeded"), nil + return true, failedJobUnstructured("migration", namespace), nil }) return client }, @@ -259,7 +259,7 @@ func TestRunPreDeployJobs(t *testing.T) { setupClient: func() *dynamicfake.FakeDynamicClient { client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") }) attemptCount := 0 
client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { @@ -269,7 +269,7 @@ func TestRunPreDeployJobs(t *testing.T) { client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { // First attempt fails, second succeeds if attemptCount <= 1 { - return true, failedJobUnstructured("migration", namespace, "BackoffLimitExceeded"), nil + return true, failedJobUnstructured("migration", namespace), nil } return true, completedJobUnstructured("migration", namespace), nil }) @@ -288,13 +288,13 @@ func TestRunPreDeployJobs(t *testing.T) { setupClient: func() *dynamicfake.FakeDynamicClient { client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") }) client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { return true, makePreDeployJob("migration"), nil }) client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, failedJobUnstructured("migration", namespace, "BackoffLimitExceeded"), nil + return true, failedJobUnstructured("migration", namespace), nil }) return client }, @@ -310,7 +310,7 @@ func TestRunPreDeployJobs(t *testing.T) { setupClient: func() *dynamicfake.FakeDynamicClient { client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") }) 
attemptCount := 0 client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { @@ -340,10 +340,10 @@ func TestRunPreDeployJobs(t *testing.T) { setupClient: func() *dynamicfake.FakeDynamicClient { client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") + return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration") }) client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, nil, fmt.Errorf("forbidden") + return true, nil, errors.New("forbidden") }) return client }, @@ -376,13 +376,13 @@ func TestRunPreDeployJobs(t *testing.T) { setupClient: func() *dynamicfake.FakeDynamicClient { client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "optional-migration") + return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "optional-migration") }) client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { return true, makeOptionalPreDeployJob("optional-migration"), nil }) client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, failedJobUnstructured("optional-migration", namespace, "BackoffLimitExceeded"), nil + return true, failedJobUnstructured("optional-migration", namespace), nil }) return client }, @@ -403,7 +403,7 @@ func TestRunPreDeployJobs(t *testing.T) { setupClient: func() *dynamicfake.FakeDynamicClient { client := 
dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "job") + return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "job") }) client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { patchAction := action.(k8stesting.PatchAction) @@ -417,7 +417,7 @@ func TestRunPreDeployJobs(t *testing.T) { getAction := action.(k8stesting.GetAction) name := getAction.GetName() if name == "optional-migration" { - return true, failedJobUnstructured(name, namespace, "BackoffLimitExceeded"), nil + return true, failedJobUnstructured(name, namespace), nil } return true, completedJobUnstructured(name, namespace), nil }) @@ -436,7 +436,7 @@ func TestRunPreDeployJobs(t *testing.T) { setupClient: func() *dynamicfake.FakeDynamicClient { client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme) client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { - return true, nil, errors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "optional-migration") + return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "optional-migration") }) client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { return true, makeOptionalPreDeployJob("optional-migration"), nil @@ -587,7 +587,7 @@ func completedJobUnstructured(name, namespace string) *unstructured.Unstructured } } -func failedJobUnstructured(name, namespace, message string) *unstructured.Unstructured { +func failedJobUnstructured(name, namespace string) *unstructured.Unstructured { return &unstructured.Unstructured{ Object: map[string]interface{}{ "apiVersion": "batch/v1", @@ -601,7 +601,7 @@ func 
failedJobUnstructured(name, namespace, message string) *unstructured.Unstru map[string]interface{}{ "type": string(batchv1.JobFailed), "status": "True", - "message": message, + "message": "BackoffLimitExceeded", }, }, }, From 2844f77f15cd7ddb7e0156dff0241052a95a4ba0 Mon Sep 17 00:00:00 2001 From: Davide Chiarelli Date: Thu, 5 Mar 2026 18:09:29 +0100 Subject: [PATCH 3/7] refactor(pre-deploy): filter pre-deploy resources via annotations instead of an extension filter - Deleted the predeployfilter.go and predeployfilter_test.go files, which contained logic for filtering pre-deploy jobs based on annotations. - Removed associated test data files for pre-deploy jobs, including optional and custom annotation jobs. - Added comprehensive tests for filtering and executing pre-deploy jobs in predeployjob_test.go, ensuring correct behavior for various scenarios. --- pkg/cmd/deploy/deploy.go | 98 ++- pkg/cmd/deploy/predeploy.go | 270 ------- pkg/cmd/deploy/predeploy_test.go | 624 ---------------- pkg/cmd/deploy/predeployjob.go | 246 +++++++ pkg/cmd/deploy/predeployjob_test.go | 665 ++++++++++++++++++ .../pre-deploy-job-custom-annotation.yaml | 18 - .../pre-deploy-job-optional-failed.yaml | 20 - .../pre-deploy/pre-deploy-job-optional.yaml | 20 - pkg/extensions/predeployfilter.go | 106 --- pkg/extensions/predeployfilter_test.go | 292 -------- 10 files changed, 971 insertions(+), 1388 deletions(-) delete mode 100644 pkg/cmd/deploy/predeploy.go delete mode 100644 pkg/cmd/deploy/predeploy_test.go create mode 100644 pkg/cmd/deploy/predeployjob.go create mode 100644 pkg/cmd/deploy/predeployjob_test.go delete mode 100644 pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-custom-annotation.yaml delete mode 100644 pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml delete mode 100644 pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml delete mode 100644 pkg/extensions/predeployfilter.go delete mode 100644 pkg/extensions/predeployfilter_test.go diff --git
a/pkg/cmd/deploy/deploy.go b/pkg/cmd/deploy/deploy.go index 7b3cae5..b78cb26 100644 --- a/pkg/cmd/deploy/deploy.go +++ b/pkg/cmd/deploy/deploy.go @@ -82,16 +82,16 @@ const ( waitFlagDefaultValue = true waitFlagUsage = "if true, wait for resources to be current before marking them as successfully applied" - preDeployJobTimeoutFlagName = "pre-deploy-job-timeout" - preDeployJobTimeoutDefaultValue = 30 * time.Second - preDeployJobTimeoutFlagUsage = "the length of time to wait before giving up on a single pre-deploy job execution. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h)" + preDeployJobAnnotationFlagName = "pre-deploy-job-annotation" + preDeployJobAnnotationFlagUsage = "the annotation value for mia-platform.eu/deploy to identify pre-deploy jobs" preDeployJobMaxRetriesFlagName = "pre-deploy-job-max-retries" preDeployJobMaxRetriesDefaultValue = 3 - preDeployJobMaxRetriesFlagUsage = "the maximum number of attempts for a pre-deploy job before aborting the deploy" + preDeployJobMaxRetriesFlagUsage = "the maximum number of retries for a failed pre-deploy job" - preDeployJobAnnotationFlagName = "pre-deploy-job-annotation" - preDeployJobAnnotationFlagUsage = "the value of the mia-platform.eu/deploy annotation used to identify pre-deploy jobs" + preDeployJobTimeoutFlagName = "pre-deploy-job-timeout" + preDeployJobTimeoutDefaultValue = 30 * time.Second + preDeployJobTimeoutFlagUsage = "the timeout for a single pre-deploy job execution" stdinToken = "-" fieldManager = "mlp" @@ -114,26 +114,25 @@ type Flags struct { forceDeploy bool ensureNamespace bool timeout time.Duration - preDeployJobTimeout time.Duration - preDeployJobMaxRetries int - preDeployJobAnnotation string dryRun bool wait bool + preDeployJobAnnotation string + preDeployJobMaxRetries int + preDeployJobTimeout time.Duration } // Options have the data required to perform the deploy operation type Options struct { - inputPaths []string - deployType string - forceDeploy bool - 
ensureNamespace bool - timeout time.Duration - preDeployJobTimeout time.Duration - preDeployJobMaxRetries int - preDeployJobAnnotation string - preDeployPollingInterval time.Duration - dryRun bool - wait bool + inputPaths []string + deployType string + forceDeploy bool + ensureNamespace bool + timeout time.Duration + dryRun bool + wait bool + preDeployJobAnnotation string + preDeployJobMaxRetries int + preDeployJobTimeout time.Duration clientFactory util.ClientFactory clock clock.PassiveClock @@ -203,9 +202,9 @@ func (f *Flags) AddFlags(flags *pflag.FlagSet) { flags.DurationVar(&f.timeout, timeoutFlagName, timeoutDefaultValue, timeoutFlagUsage) flags.BoolVar(&f.dryRun, dryRunFlagName, dryRunDefaultValue, dryRunFlagUsage) flags.BoolVar(&f.wait, waitFlagName, waitFlagDefaultValue, waitFlagUsage) - flags.DurationVar(&f.preDeployJobTimeout, preDeployJobTimeoutFlagName, preDeployJobTimeoutDefaultValue, preDeployJobTimeoutFlagUsage) + flags.StringVar(&f.preDeployJobAnnotation, preDeployJobAnnotationFlagName, "", preDeployJobAnnotationFlagUsage) flags.IntVar(&f.preDeployJobMaxRetries, preDeployJobMaxRetriesFlagName, preDeployJobMaxRetriesDefaultValue, preDeployJobMaxRetriesFlagUsage) - flags.StringVar(&f.preDeployJobAnnotation, preDeployJobAnnotationFlagName, extensions.PreDeployFilterDefaultValue, preDeployJobAnnotationFlagUsage) + flags.DurationVar(&f.preDeployJobTimeout, preDeployJobTimeoutFlagName, preDeployJobTimeoutDefaultValue, preDeployJobTimeoutFlagUsage) } // ToOptions transform the command flags in command runtime arguments @@ -220,10 +219,10 @@ func (f *Flags) ToOptions(reader io.Reader, writer io.Writer) (*Options, error) forceDeploy: f.forceDeploy, ensureNamespace: f.ensureNamespace, timeout: f.timeout, - preDeployJobTimeout: f.preDeployJobTimeout, - preDeployJobMaxRetries: f.preDeployJobMaxRetries, - preDeployJobAnnotation: f.preDeployJobAnnotation, wait: f.wait, + preDeployJobAnnotation: f.preDeployJobAnnotation, + preDeployJobMaxRetries: 
f.preDeployJobMaxRetries, + preDeployJobTimeout: f.preDeployJobTimeout, clientFactory: util.NewFactory(f.ConfigFlags), reader: reader, @@ -267,21 +266,17 @@ func (o *Options) Run(ctx context.Context) error { return err } - annotationValue := o.preDeployJobAnnotation - if annotationValue == "" { - annotationValue = extensions.PreDeployFilterDefaultValue + if err := o.ensuringNamespace(ctx, namespace); err != nil { + return err } - preDeployJobs, resources := extensions.ExtractPreDeployJobs(resources, annotationValue) - if len(preDeployJobs) > 0 { - logger.V(3).Info("found pre-deploy jobs", "count", len(preDeployJobs)) - - if err := o.runPreDeployJobs(ctx, preDeployJobs, namespace); err != nil { - return fmt.Errorf("pre-deploy phase failed: %w", err) - } + var stop bool + resources, stop, err = o.runPreDeployPhase(ctx, namespace, resources) + if err != nil { + return err } - - if err := o.ensuringNamespace(ctx, namespace); err != nil { + if stop { + logger.V(3).Info("pre-deploy jobs executed, skipping remaining resources apply") return nil } @@ -298,7 +293,7 @@ func (o *Options) Run(ctx context.Context) error { extensions.NewDeployMutator(o.deployType, o.forceDeploy, extensions.ChecksumFromData(deployIdentifier)), extensions.NewExternalSecretsMutator(resources), ). - WithFilters(extensions.NewDeployOnceFilter(), extensions.NewPreDeployFilter(annotationValue)). + WithFilters(extensions.NewDeployOnceFilter()). WithCustomStatusChecker(extensions.ExternalSecretStatusCheckers()). Build() if err != nil { @@ -349,6 +344,33 @@ func formatApplyErrors(errs []error) string { return builder.String() } +// runPreDeployPhase handles pre-deploy job filtering and execution. When the pre-deploy +// annotation flag is set, matching jobs are extracted and executed; if any are found the +// caller should stop the normal apply phase (stop=true). When the flag is absent, annotated +// jobs are stripped from the resource list so they are never applied as regular resources.
+func (o *Options) runPreDeployPhase(ctx context.Context, namespace string, resources []*unstructured.Unstructured) ([]*unstructured.Unstructured, bool, error) { + if o.preDeployJobAnnotation == "" { + return StripAnnotatedJobs(resources), false, nil + } + + preDeployJobs, remaining := FilterPreDeployJobs(resources, o.preDeployJobAnnotation) + if len(preDeployJobs) == 0 { + return resources, false, nil + } + + clientSet, err := o.clientFactory.KubernetesClientSet() + if err != nil { + return nil, false, err + } + + runner := NewPreDeployJobRunner(clientSet, namespace, o.preDeployJobMaxRetries, o.preDeployJobTimeout, o.writer, o.dryRun) + if err := runner.Run(ctx, preDeployJobs); err != nil { + return nil, false, err + } + + return remaining, true, nil +} + func deployTypeFlagCompletionfunc(*cobra.Command, []string, string) ([]string, cobra.ShellCompDirective) { return validDeployTypeValues, cobra.ShellCompDirectiveDefault } diff --git a/pkg/cmd/deploy/predeploy.go b/pkg/cmd/deploy/predeploy.go deleted file mode 100644 index fb67b62..0000000 --- a/pkg/cmd/deploy/predeploy.go +++ /dev/null @@ -1,270 +0,0 @@ -// Copyright Mia srl -// SPDX-License-Identifier: Apache-2.0 -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. 
- -package deploy - -import ( - "context" - "encoding/json" - "errors" - "fmt" - "time" - - "github.com/go-logr/logr" - batchv1 "k8s.io/api/batch/v1" - k8serrors "k8s.io/apimachinery/pkg/api/errors" - metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" - "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" - "k8s.io/apimachinery/pkg/runtime" - "k8s.io/apimachinery/pkg/types" - "k8s.io/client-go/dynamic" - - "github.com/mia-platform/mlp/v2/pkg/extensions" -) - -const ( - preDeployPollingInterval = 5 * time.Second -) - -// runPreDeployJobs applies each pre-deploy job to the cluster and waits for its completion. -// Each job execution has its own timeout (preDeployJobTimeout). If a job fails, it is retried -// up to preDeployJobMaxRetries times. If all retries are exhausted and the job is not marked -// as optional (via the mia-platform.eu/deploy-optional annotation), the deploy process is aborted. -// Optional jobs that fail are logged as warnings and do not block the deploy pipeline. -// In dry-run mode, jobs are applied with the dry-run option and polling is skipped. 
-func (o *Options) runPreDeployJobs(ctx context.Context, jobs []*unstructured.Unstructured, namespace string) error { - if len(jobs) == 0 { - return nil - } - - logger := logr.FromContextOrDiscard(ctx) - - pollingInterval := preDeployPollingInterval - if o.preDeployPollingInterval > 0 { - pollingInterval = o.preDeployPollingInterval - } - - maxRetries := o.preDeployJobMaxRetries - if maxRetries <= 0 { - maxRetries = 1 - } - - dynamicClient, err := o.clientFactory.DynamicClient() - if err != nil { - return fmt.Errorf("failed to create dynamic client for pre-deploy jobs: %w", err) - } - - mapper, err := o.clientFactory.ToRESTMapper() - if err != nil { - return fmt.Errorf("failed to create REST mapper for pre-deploy jobs: %w", err) - } - - jobGVK := batchv1.SchemeGroupVersion.WithKind("Job") - mapping, err := mapper.RESTMapping(jobGVK.GroupKind(), jobGVK.Version) - if err != nil { - return fmt.Errorf("failed to get REST mapping for Job: %w", err) - } - - jobResource := dynamicClient.Resource(mapping.Resource) - - for _, job := range jobs { - jobName := job.GetName() - jobNamespace := job.GetNamespace() - if jobNamespace == "" { - jobNamespace = namespace - } - - lastErr := o.executePreDeployJobWithRetries(ctx, jobResource, job, jobName, jobNamespace, maxRetries, pollingInterval, logger) - if lastErr != nil { - if extensions.IsOptionalPreDeployJob(job) { - fmt.Fprintf(o.writer, "pre-deploy job %s: optional job failed after %d attempt(s), continuing: %s\n", jobName, maxRetries, lastErr) - continue - } - return fmt.Errorf("pre-deploy job %s: failed after %d attempt(s): %w", jobName, maxRetries, lastErr) - } - } - - return nil -} - -// executePreDeployJobWithRetries runs a single pre-deploy job with the retry logic. -// Returns the last error if all retries failed, or nil on success. 
-func (o *Options) executePreDeployJobWithRetries(ctx context.Context, jobResource dynamic.NamespaceableResourceInterface, job *unstructured.Unstructured, jobName, jobNamespace string, maxRetries int, pollingInterval time.Duration, logger logr.Logger) error { - var lastErr error - for attempt := 1; attempt <= maxRetries; attempt++ { - logger.V(3).Info("running pre-deploy job", "name", jobName, "namespace", jobNamespace, "attempt", attempt) - fmt.Fprintf(o.writer, "pre-deploy job %s: starting (attempt %d/%d)\n", jobName, attempt, maxRetries) - - if err := o.applyPreDeployJob(ctx, jobResource, job, jobNamespace); err != nil { - return fmt.Errorf("pre-deploy job %s: failed to apply: %w", jobName, err) - } - - if o.dryRun { - fmt.Fprintf(o.writer, "pre-deploy job %s: applied (dry-run)\n", jobName) - return nil - } - - // Create a per-execution timeout context - execCtx := ctx - if o.preDeployJobTimeout > 0 { - var cancel context.CancelFunc - execCtx, cancel = context.WithTimeout(ctx, o.preDeployJobTimeout) - defer cancel() - } - - lastErr = o.waitForJobCompletion(execCtx, jobResource, jobName, jobNamespace, pollingInterval) - if lastErr == nil { - fmt.Fprintf(o.writer, "pre-deploy job %s: completed successfully\n", jobName) - return nil - } - - fmt.Fprintf(o.writer, "pre-deploy job %s: attempt %d/%d failed: %s\n", jobName, attempt, maxRetries, lastErr) - } - return lastErr -} - -// applyPreDeployJob removes any existing job with the same name and applies the new one using server-side apply. -// It enforces backoffLimit=0 so the job fails immediately on the first pod failure without Kubernetes-level retries. -// Higher-level retries are handled by runPreDeployJobs. 
-func (o *Options) applyPreDeployJob(ctx context.Context, jobResource dynamic.NamespaceableResourceInterface, job *unstructured.Unstructured, namespace string) error { - logger := logr.FromContextOrDiscard(ctx) - - // Deep copy to avoid mutating the original object - job = job.DeepCopy() - // Force backoffLimit=0: on failure the job fails immediately and we handle retries ourselves - if err := unstructured.SetNestedField(job.Object, int64(0), "spec", "backoffLimit"); err != nil { - return fmt.Errorf("failed to set backoffLimit: %w", err) - } - - if !o.dryRun { - // Delete any existing job with the same name to ensure a clean run - propagationPolicy := metav1.DeletePropagationBackground - err := jobResource.Namespace(namespace).Delete(ctx, job.GetName(), metav1.DeleteOptions{ - PropagationPolicy: &propagationPolicy, - }) - if err != nil && !k8serrors.IsNotFound(err) { - return fmt.Errorf("failed to delete existing job: %w", err) - } - - if err == nil { - logger.V(5).Info("deleted existing pre-deploy job", "name", job.GetName(), "namespace", namespace) - } - } - - data, err := json.Marshal(job.Object) - if err != nil { - return fmt.Errorf("failed to marshal job: %w", err) - } - - patchOpts := metav1.PatchOptions{ - FieldManager: fieldManager, - } - if o.dryRun { - patchOpts.DryRun = []string{metav1.DryRunAll} - } - - _, err = jobResource.Namespace(namespace).Patch(ctx, job.GetName(), types.ApplyPatchType, data, patchOpts) - if err != nil { - return err - } - - return nil -} - -// waitForJobCompletion polls the job status until it completes, fails, or the context times out. -// The status is checked immediately after calling this function, then at each polling interval. 
-func (o *Options) waitForJobCompletion(ctx context.Context, jobResource dynamic.NamespaceableResourceInterface, name, namespace string, pollingInterval time.Duration) error { - logger := logr.FromContextOrDiscard(ctx) - - ticker := time.NewTicker(pollingInterval) - defer ticker.Stop() - - for { - // Check status immediately on each iteration (including the first one) - logger.V(5).Info("polling pre-deploy job status", "name", name, "namespace", namespace) - - obj, err := jobResource.Namespace(namespace).Get(ctx, name, metav1.GetOptions{}) - if err != nil { - return fmt.Errorf("failed to get job status: %w", err) - } - - completed, errMsg, err := checkJobStatus(obj) - if err != nil { - return err - } - - if errMsg != "" { - return fmt.Errorf("job failed: %s", errMsg) - } - - if completed { - return nil - } - - fmt.Fprintf(o.writer, "pre-deploy job %s: waiting for completion\n", name) - - // Wait for next poll interval or context cancellation - select { - case <-ctx.Done(): - return errors.New("timed out waiting for job completion") - case <-ticker.C: - // continue to next check - } - } -} - -// checkJobStatus examines the job's status conditions to determine if it has completed or failed. 
-// Returns (completed, failMessage, error) -func checkJobStatus(obj *unstructured.Unstructured) (bool, string, error) { - job := new(batchv1.Job) - if err := runtime.DefaultUnstructuredConverter.FromUnstructured(obj.Object, job); err != nil { - return false, "", fmt.Errorf("failed to convert job status: %w", err) - } - - for _, condition := range job.Status.Conditions { - switch condition.Type { - case batchv1.JobComplete: - if condition.Status == "True" { - return true, "", nil - } - case batchv1.JobFailed: - if condition.Status == "True" { - msg := condition.Message - if msg == "" { - msg = condition.Reason - } - if msg == "" { - msg = "job execution failed" - } - return false, msg, nil - } - } - } - - // Detect failure before the controller sets the Failed condition: - // if there are no active pods and the number of failures exceeds the backoff limit, - // the job has effectively failed - if job.Status.Active == 0 && job.Status.Failed > 0 { - backoffLimit := int32(6) // Kubernetes default - if job.Spec.BackoffLimit != nil { - backoffLimit = *job.Spec.BackoffLimit - } - if job.Status.Failed > backoffLimit { - return false, fmt.Sprintf("job has %d failed pod(s) exceeding backoff limit of %d", job.Status.Failed, backoffLimit), nil - } - } - - return false, "", nil -} diff --git a/pkg/cmd/deploy/predeploy_test.go b/pkg/cmd/deploy/predeploy_test.go deleted file mode 100644 index 196b283..0000000 --- a/pkg/cmd/deploy/predeploy_test.go +++ /dev/null @@ -1,624 +0,0 @@ -// Copyright Mia srl -// SPDX-License-Identifier: Apache-2.0 -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. 
-// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. - -package deploy - -import ( - "errors" - "strings" - "testing" - "time" - - jpltesting "github.com/mia-platform/jpl/pkg/testing" - "github.com/stretchr/testify/assert" - "github.com/stretchr/testify/require" - batchv1 "k8s.io/api/batch/v1" - k8serrors "k8s.io/apimachinery/pkg/api/errors" - "k8s.io/apimachinery/pkg/api/meta" - "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" - "k8s.io/apimachinery/pkg/runtime" - "k8s.io/apimachinery/pkg/runtime/schema" - dynamicfake "k8s.io/client-go/dynamic/fake" - k8stesting "k8s.io/client-go/testing" -) - -func TestCheckJobStatus(t *testing.T) { - t.Parallel() - - backoffLimitZero := int32(0) - - tests := map[string]struct { - job *batchv1.Job - expectedDone bool - expectedFailMsg string - expectedError string - }{ - "job without conditions is in progress": { - job: &batchv1.Job{ - Status: batchv1.JobStatus{}, - }, - expectedDone: false, - expectedFailMsg: "", - }, - "job with complete condition is done": { - job: &batchv1.Job{ - Status: batchv1.JobStatus{ - Conditions: []batchv1.JobCondition{ - { - Type: batchv1.JobComplete, - Status: "True", - }, - }, - }, - }, - expectedDone: true, - expectedFailMsg: "", - }, - "job with failed condition returns failure message": { - job: &batchv1.Job{ - Status: batchv1.JobStatus{ - Conditions: []batchv1.JobCondition{ - { - Type: batchv1.JobFailed, - Status: "True", - Message: "BackoffLimitExceeded", - }, - }, - }, - }, - expectedDone: false, - expectedFailMsg: "BackoffLimitExceeded", - }, - "job with failed condition and empty message uses reason": { - job: &batchv1.Job{ 
-                Status: batchv1.JobStatus{
-                    Conditions: []batchv1.JobCondition{
-                        {
-                            Type:   batchv1.JobFailed,
-                            Status: "True",
-                            Reason: "DeadlineExceeded",
-                        },
-                    },
-                },
-            },
-            expectedDone:    false,
-            expectedFailMsg: "DeadlineExceeded",
-        },
-        "job with failed condition and empty message and reason uses default": {
-            job: &batchv1.Job{
-                Status: batchv1.JobStatus{
-                    Conditions: []batchv1.JobCondition{
-                        {
-                            Type:   batchv1.JobFailed,
-                            Status: "True",
-                        },
-                    },
-                },
-            },
-            expectedDone:    false,
-            expectedFailMsg: "job execution failed",
-        },
-        "job with incomplete condition is in progress": {
-            job: &batchv1.Job{
-                Status: batchv1.JobStatus{
-                    Conditions: []batchv1.JobCondition{
-                        {
-                            Type:   batchv1.JobComplete,
-                            Status: "False",
-                        },
-                    },
-                },
-            },
-            expectedDone:    false,
-            expectedFailMsg: "",
-        },
-        "job with failed pods exceeding backoff limit detected early": {
-            job: &batchv1.Job{
-                Spec: batchv1.JobSpec{
-                    BackoffLimit: &backoffLimitZero,
-                },
-                Status: batchv1.JobStatus{
-                    Failed: 1,
-                    Active: 0,
-                },
-            },
-            expectedDone:    false,
-            expectedFailMsg: "job has 1 failed pod(s) exceeding backoff limit of 0",
-        },
-        "job with failed pods but still active is in progress": {
-            job: &batchv1.Job{
-                Status: batchv1.JobStatus{
-                    Failed: 1,
-                    Active: 1,
-                },
-            },
-            expectedDone:    false,
-            expectedFailMsg: "",
-        },
-    }
-
-    for name, test := range tests {
-        t.Run(name, func(t *testing.T) {
-            t.Parallel()
-
-            unstrObj, err := runtime.DefaultUnstructuredConverter.ToUnstructured(test.job)
-            require.NoError(t, err)
-            obj := &unstructured.Unstructured{Object: unstrObj}
-
-            done, failMsg, err := checkJobStatus(obj)
-            if len(test.expectedError) > 0 {
-                assert.ErrorContains(t, err, test.expectedError)
-                return
-            }
-
-            assert.NoError(t, err)
-            assert.Equal(t, test.expectedDone, done)
-            assert.Equal(t, test.expectedFailMsg, failMsg)
-        })
-    }
-}
-
-func TestRunPreDeployJobs(t *testing.T) {
-    t.Parallel()
-
-    namespace := "mlp-predeploy-test"
-
-    tests := map[string]struct {
-        jobs           []*unstructured.Unstructured
-        dryRun         bool
-        maxRetries     int
-        setupClient    func() *dynamicfake.FakeDynamicClient
-        expectedError  string
-        expectedOutput []string
-    }{
-        "no jobs does nothing": {
-            jobs:       []*unstructured.Unstructured{},
-            maxRetries: 3,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                return dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-            },
-        },
-        "dry-run applies job without polling": {
-            jobs:       []*unstructured.Unstructured{makePreDeployJob("migration")},
-            dryRun:     true,
-            maxRetries: 3,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-                client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, makePreDeployJob("migration"), nil
-                })
-                return client
-            },
-            expectedOutput: []string{
-                "pre-deploy job migration: starting (attempt 1/3)",
-                "pre-deploy job migration: applied (dry-run)",
-            },
-        },
-        "successful job completion on first attempt": {
-            jobs:       []*unstructured.Unstructured{makePreDeployJob("migration")},
-            maxRetries: 3,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-                client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration")
-                })
-                client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, makePreDeployJob("migration"), nil
-                })
-                client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, completedJobUnstructured("migration", namespace), nil
-                })
-                return client
-            },
-            expectedOutput: []string{
-                "pre-deploy job migration: starting (attempt 1/3)",
-                "pre-deploy job migration: completed successfully",
-            },
-        },
-        "failed job exhausts all retries": {
-            jobs:       []*unstructured.Unstructured{makePreDeployJob("migration")},
-            maxRetries: 3,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-                client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration")
-                })
-                client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, makePreDeployJob("migration"), nil
-                })
-                client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, failedJobUnstructured("migration", namespace), nil
-                })
-                return client
-            },
-            expectedError: "pre-deploy job migration: failed after 3 attempt(s): job failed: BackoffLimitExceeded",
-            expectedOutput: []string{
-                "pre-deploy job migration: starting (attempt 1/3)",
-                "pre-deploy job migration: attempt 1/3 failed",
-                "pre-deploy job migration: starting (attempt 2/3)",
-                "pre-deploy job migration: attempt 2/3 failed",
-                "pre-deploy job migration: starting (attempt 3/3)",
-                "pre-deploy job migration: attempt 3/3 failed",
-            },
-        },
-        "failed job succeeds on retry": {
-            jobs:       []*unstructured.Unstructured{makePreDeployJob("migration")},
-            maxRetries: 3,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-                client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration")
-                })
-                attemptCount := 0
-                client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    attemptCount++
-                    return true, makePreDeployJob("migration"), nil
-                })
-                client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    // First attempt fails, second succeeds
-                    if attemptCount <= 1 {
-                        return true, failedJobUnstructured("migration", namespace), nil
-                    }
-                    return true, completedJobUnstructured("migration", namespace), nil
-                })
-                return client
-            },
-            expectedOutput: []string{
-                "pre-deploy job migration: starting (attempt 1/3)",
-                "pre-deploy job migration: attempt 1/3 failed",
-                "pre-deploy job migration: starting (attempt 2/3)",
-                "pre-deploy job migration: completed successfully",
-            },
-        },
-        "single retry allowed fails immediately": {
-            jobs:       []*unstructured.Unstructured{makePreDeployJob("migration")},
-            maxRetries: 1,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-                client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration")
-                })
-                client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, makePreDeployJob("migration"), nil
-                })
-                client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, failedJobUnstructured("migration", namespace), nil
-                })
-                return client
-            },
-            expectedError: "pre-deploy job migration: failed after 1 attempt(s): job failed: BackoffLimitExceeded",
-            expectedOutput: []string{
-                "pre-deploy job migration: starting (attempt 1/1)",
-                "pre-deploy job migration: attempt 1/1 failed",
-            },
-        },
-        "timeout on single execution retries": {
-            jobs:       []*unstructured.Unstructured{makePreDeployJob("migration")},
-            maxRetries: 2,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-                client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration")
-                })
-                attemptCount := 0
-                client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    attemptCount++
-                    return true, makePreDeployJob("migration"), nil
-                })
-                client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    // First attempt always returns in-progress to force timeout,
-                    // second attempt succeeds immediately
-                    if attemptCount <= 1 {
-                        return true, inProgressJobUnstructured("migration", namespace), nil
-                    }
-                    return true, completedJobUnstructured("migration", namespace), nil
-                })
-                return client
-            },
-            expectedOutput: []string{
-                "pre-deploy job migration: starting (attempt 1/2)",
-                "pre-deploy job migration: attempt 1/2 failed: timed out waiting for job completion",
-                "pre-deploy job migration: starting (attempt 2/2)",
-                "pre-deploy job migration: completed successfully",
-            },
-        },
-        "apply failure returns error without retry": {
-            jobs:       []*unstructured.Unstructured{makePreDeployJob("migration")},
-            maxRetries: 3,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-                client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "migration")
-                })
-                client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, nil, errors.New("forbidden")
-                })
-                return client
-            },
-            expectedError: "pre-deploy job migration: failed to apply",
-        },
-        "delete existing job before apply": {
-            jobs:       []*unstructured.Unstructured{makePreDeployJob("migration")},
-            maxRetries: 3,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-                client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, nil, nil
-                })
-                client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, makePreDeployJob("migration"), nil
-                })
-                client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, completedJobUnstructured("migration", namespace), nil
-                })
-                return client
-            },
-            expectedOutput: []string{
-                "pre-deploy job migration: starting (attempt 1/3)",
-                "pre-deploy job migration: completed successfully",
-            },
-        },
-        "optional job failure does not block deploy": {
-            jobs:       []*unstructured.Unstructured{makeOptionalPreDeployJob("optional-migration")},
-            maxRetries: 2,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-                client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "optional-migration")
-                })
-                client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, makeOptionalPreDeployJob("optional-migration"), nil
-                })
-                client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, failedJobUnstructured("optional-migration", namespace), nil
-                })
-                return client
-            },
-            expectedOutput: []string{
-                "pre-deploy job optional-migration: starting (attempt 1/2)",
-                "pre-deploy job optional-migration: attempt 1/2 failed",
-                "pre-deploy job optional-migration: starting (attempt 2/2)",
-                "pre-deploy job optional-migration: attempt 2/2 failed",
-                "pre-deploy job optional-migration: optional job failed after 2 attempt(s), continuing",
-            },
-        },
-        "optional job followed by required job continues on optional failure": {
-            jobs: []*unstructured.Unstructured{
-                makeOptionalPreDeployJob("optional-migration"),
-                makePreDeployJob("required-migration"),
-            },
-            maxRetries: 1,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-                client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "job")
-                })
-                client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    patchAction := action.(k8stesting.PatchAction)
-                    name := patchAction.GetName()
-                    if name == "optional-migration" {
-                        return true, makeOptionalPreDeployJob(name), nil
-                    }
-                    return true, makePreDeployJob(name), nil
-                })
-                client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    getAction := action.(k8stesting.GetAction)
-                    name := getAction.GetName()
-                    if name == "optional-migration" {
-                        return true, failedJobUnstructured(name, namespace), nil
-                    }
-                    return true, completedJobUnstructured(name, namespace), nil
-                })
-                return client
-            },
-            expectedOutput: []string{
-                "pre-deploy job optional-migration: starting (attempt 1/1)",
-                "pre-deploy job optional-migration: optional job failed after 1 attempt(s), continuing",
-                "pre-deploy job required-migration: starting (attempt 1/1)",
-                "pre-deploy job required-migration: completed successfully",
-            },
-        },
-        "optional job success proceeds normally": {
-            jobs:       []*unstructured.Unstructured{makeOptionalPreDeployJob("optional-migration")},
-            maxRetries: 3,
-            setupClient: func() *dynamicfake.FakeDynamicClient {
-                client := dynamicfake.NewSimpleDynamicClient(jpltesting.Scheme)
-                client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, nil, k8serrors.NewNotFound(schema.GroupResource{Group: "batch", Resource: "jobs"}, "optional-migration")
-                })
-                client.PrependReactor("patch", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, makeOptionalPreDeployJob("optional-migration"), nil
-                })
-                client.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
-                    return true, completedJobUnstructured("optional-migration", namespace), nil
-                })
-                return client
-            },
-            expectedOutput: []string{
-                "pre-deploy job optional-migration: starting (attempt 1/3)",
-                "pre-deploy job optional-migration: completed successfully",
-            },
-        },
-    }
-
-    for name, test := range tests {
-        t.Run(name, func(t *testing.T) {
-            t.Parallel()
-
-            stringBuilder := new(strings.Builder)
-
-            tf := jpltesting.NewTestClientFactory().
-                WithNamespace(namespace)
-
-            if test.setupClient != nil {
-                tf.FakeDynamicClient = test.setupClient()
-            }
-
-            // Add batch/v1 Job mapping to the REST mapper
-            batchGV := batchv1.SchemeGroupVersion
-            batchMapper := meta.NewDefaultRESTMapper([]schema.GroupVersion{batchGV})
-            batchMapper.AddSpecific(
-                batchGV.WithKind("Job"),
-                batchGV.WithResource("jobs"),
-                batchGV.WithResource("job"),
-                meta.RESTScopeNamespace,
-            )
-            tf.RESTMapper = batchMapper
-
-            // Use a short timeout for tests
-            ctx := t.Context()
-
-            options := &Options{
-                dryRun:                   test.dryRun,
-                writer:                   stringBuilder,
-                clientFactory:            tf,
-                preDeployPollingInterval: 100 * time.Millisecond,
-                preDeployJobTimeout:      500 * time.Millisecond,
-                preDeployJobMaxRetries:   test.maxRetries,
-            }
-
-            err := options.runPreDeployJobs(ctx, test.jobs, namespace)
-            output := stringBuilder.String()
-            t.Log(output)
-
-            switch len(test.expectedError) {
-            case 0:
-                require.NoError(t, err)
-            default:
-                assert.ErrorContains(t, err, test.expectedError)
-            }
-
-            for _, expected := range test.expectedOutput {
-                assert.Contains(t, output, expected)
-            }
-        })
-    }
-}
-
-func makePreDeployJob(name string) *unstructured.Unstructured {
-    return &unstructured.Unstructured{
-        Object: map[string]interface{}{
-            "apiVersion": "batch/v1",
-            "kind":       "Job",
-            "metadata": map[string]interface{}{
-                "name": name,
-                "annotations": map[string]interface{}{
-                    "mia-platform.eu/deploy": "pre-deploy",
-                },
-            },
-            "spec": map[string]interface{}{
-                "template": map[string]interface{}{
-                    "spec": map[string]interface{}{
-                        "containers": []interface{}{
-                            map[string]interface{}{
-                                "name":  "migration",
-                                "image": "busybox",
-                                "args":  []interface{}{"/bin/sh", "-c", "echo running"},
-                            },
-                        },
-                        "restartPolicy": "Never",
-                    },
-                },
-            },
-        },
-    }
-}
-
-func makeOptionalPreDeployJob(name string) *unstructured.Unstructured {
-    return &unstructured.Unstructured{
-        Object: map[string]interface{}{
-            "apiVersion": "batch/v1",
-            "kind":       "Job",
-            "metadata": map[string]interface{}{
-                "name": name,
-                "annotations": map[string]interface{}{
-                    "mia-platform.eu/deploy":          "pre-deploy",
-                    "mia-platform.eu/deploy-optional": "true",
-                },
-            },
-            "spec": map[string]interface{}{
-                "template": map[string]interface{}{
-                    "spec": map[string]interface{}{
-                        "containers": []interface{}{
-                            map[string]interface{}{
-                                "name":  "migration",
-                                "image": "busybox",
-                                "args":  []interface{}{"/bin/sh", "-c", "echo running"},
-                            },
-                        },
-                        "restartPolicy": "Never",
-                    },
-                },
-            },
-        },
-    }
-}
-
-func completedJobUnstructured(name, namespace string) *unstructured.Unstructured {
-    return &unstructured.Unstructured{
-        Object: map[string]interface{}{
-            "apiVersion": "batch/v1",
-            "kind":       "Job",
-            "metadata": map[string]interface{}{
-                "name":      name,
-                "namespace": namespace,
-            },
-            "status": map[string]interface{}{
-                "conditions": []interface{}{
-                    map[string]interface{}{
-                        "type":   string(batchv1.JobComplete),
-                        "status": "True",
-                    },
-                },
-            },
-        },
-    }
-}
-
-func failedJobUnstructured(name, namespace string) *unstructured.Unstructured {
-    return &unstructured.Unstructured{
-        Object: map[string]interface{}{
-            "apiVersion": "batch/v1",
-            "kind":       "Job",
-            "metadata": map[string]interface{}{
-                "name":      name,
-                "namespace": namespace,
-            },
-            "status": map[string]interface{}{
-                "conditions": []interface{}{
-                    map[string]interface{}{
-                        "type":    string(batchv1.JobFailed),
-                        "status":  "True",
-                        "message": "BackoffLimitExceeded",
-                    },
-                },
-            },
-        },
-    }
-}
-
-func inProgressJobUnstructured(name, namespace string) *unstructured.Unstructured {
-    return &unstructured.Unstructured{
-        Object: map[string]interface{}{
-            "apiVersion": "batch/v1",
-            "kind":       "Job",
-            "metadata": map[string]interface{}{
-                "name":      name,
-                "namespace": namespace,
-            },
-            "status": map[string]interface{}{},
-        },
-    }
-}
diff --git a/pkg/cmd/deploy/predeployjob.go b/pkg/cmd/deploy/predeployjob.go
new file mode 100644
index 0000000..7f650c0
--- /dev/null
+++ b/pkg/cmd/deploy/predeployjob.go
@@ -0,0 +1,246 @@
+// Copyright Mia srl
+// SPDX-License-Identifier: Apache-2.0
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package deploy
+
+import (
+    "context"
+    "fmt"
+    "io"
+    "strings"
+    "time"
+
+    "github.com/go-logr/logr"
+    batchv1 "k8s.io/api/batch/v1"
+    corev1 "k8s.io/api/core/v1"
+    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
+    "k8s.io/apimachinery/pkg/runtime"
+    "k8s.io/client-go/kubernetes"
+)
+
+const (
+    preDeployAnnotation = "mia-platform.eu/deploy"
+    jobKind             = "Job"
+)
+
+// PreDeployJobRunner handles the creation, execution and monitoring of pre-deploy jobs
+// with configurable retry and timeout support.
+type PreDeployJobRunner struct {
+    clientSet    kubernetes.Interface
+    namespace    string
+    maxRetries   int
+    timeout      time.Duration
+    pollInterval time.Duration
+    writer       io.Writer
+    dryRun       bool
+}
+
+// NewPreDeployJobRunner creates a new PreDeployJobRunner configured with the specified parameters
+// for running pre-deploy jobs against the target cluster.
+func NewPreDeployJobRunner(clientSet kubernetes.Interface, namespace string, maxRetries int, timeout time.Duration, writer io.Writer, dryRun bool) *PreDeployJobRunner {
+    return &PreDeployJobRunner{
+        clientSet:    clientSet,
+        namespace:    namespace,
+        maxRetries:   maxRetries,
+        timeout:      timeout,
+        pollInterval: 1 * time.Second,
+        writer:       writer,
+        dryRun:       dryRun,
+    }
+}
+
+// FilterPreDeployJobs separates pre-deploy jobs from the remaining resources based on the
+// mia-platform.eu/deploy annotation matching the provided annotation value.
+// It returns two slices: the matching pre-deploy jobs and the remaining resources.
+func FilterPreDeployJobs(resources []*unstructured.Unstructured, annotationValue string) ([]*unstructured.Unstructured, []*unstructured.Unstructured) {
+    var preDeployJobs []*unstructured.Unstructured
+    var remainingResources []*unstructured.Unstructured
+
+    for _, res := range resources {
+        if res.GetKind() == jobKind {
+            annotations := res.GetAnnotations()
+            if val, ok := annotations[preDeployAnnotation]; ok && val == annotationValue {
+                preDeployJobs = append(preDeployJobs, res)
+                continue
+            }
+        }
+        remainingResources = append(remainingResources, res)
+    }
+
+    return preDeployJobs, remainingResources
+}
+
+// StripAnnotatedJobs removes all Job resources that carry the mia-platform.eu/deploy
+// annotation, regardless of its value. This is used to exclude pre-deploy jobs from the
+// normal apply flow when the pre-deploy-job-annotation flag is not provided.
+func StripAnnotatedJobs(resources []*unstructured.Unstructured) []*unstructured.Unstructured {
+    var remaining []*unstructured.Unstructured
+
+    for _, res := range resources {
+        if res.GetKind() == jobKind {
+            if _, ok := res.GetAnnotations()[preDeployAnnotation]; ok {
+                continue
+            }
+        }
+        remaining = append(remaining, res)
+    }
+
+    return remaining
+}
+
+// Run executes all pre-deploy jobs with retry and timeout support. Each job is retried
+// up to maxRetries times upon failure. An error is returned only if ALL annotated jobs fail;
+// if at least one succeeds, it returns nil and the deploy process can continue.
+func (r *PreDeployJobRunner) Run(ctx context.Context, jobs []*unstructured.Unstructured) error {
+    logger := logr.FromContextOrDiscard(ctx)
+
+    if len(jobs) == 0 {
+        logger.V(3).Info("no pre-deploy jobs to run")
+        return nil
+    }
+
+    if r.dryRun {
+        for _, job := range jobs {
+            fmt.Fprintf(r.writer, "pre-deploy job %q would be executed (dry-run)\n", job.GetName())
+        }
+        return nil
+    }
+
+    logger.V(3).Info("starting pre-deploy jobs", "count", len(jobs))
+
+    var failedJobs []string
+    successCount := 0
+
+    for _, job := range jobs {
+        jobName := job.GetName()
+        err := r.runJobWithRetries(ctx, job)
+        if err != nil {
+            fmt.Fprintf(r.writer, "pre-deploy job %q failed after %d retries: %s\n", jobName, r.maxRetries, err)
+            failedJobs = append(failedJobs, jobName)
+        } else {
+            fmt.Fprintf(r.writer, "pre-deploy job %q completed successfully\n", jobName)
+            successCount++
+        }
+    }
+
+    if successCount == 0 {
+        builder := new(strings.Builder)
+        fmt.Fprintf(builder, "all %d pre-deploy job(s) failed:\n", len(failedJobs))
+        for _, name := range failedJobs {
+            fmt.Fprintf(builder, "\t- %s\n", name)
+        }
+        return fmt.Errorf("%s", builder.String())
+    }
+
+    logger.V(3).Info("pre-deploy jobs completed", "succeeded", successCount, "failed", len(failedJobs))
+    return nil
+}
+
+// runJobWithRetries attempts to run a single pre-deploy job, retrying up to maxRetries times
+// on failure. The failed job is deleted before each retry attempt.
+func (r *PreDeployJobRunner) runJobWithRetries(ctx context.Context, jobUnstr *unstructured.Unstructured) error {
+    logger := logr.FromContextOrDiscard(ctx)
+    var lastErr error
+
+    for attempt := 0; attempt <= r.maxRetries; attempt++ {
+        if attempt > 0 {
+            logger.V(3).Info("retrying pre-deploy job", "name", jobUnstr.GetName(), "attempt", attempt)
+            fmt.Fprintf(r.writer, "retrying pre-deploy job %q (attempt %d/%d)\n", jobUnstr.GetName(), attempt, r.maxRetries)
+        }
+
+        lastErr = r.createAndWaitForJob(ctx, jobUnstr)
+        if lastErr == nil {
+            return nil
+        }
+
+        logger.V(5).Info("pre-deploy job attempt failed", "name", jobUnstr.GetName(), "attempt", attempt, "error", lastErr)
+
+        // Clean up the failed job before retrying
+        if cleanErr := r.deleteJob(ctx, jobUnstr.GetName()); cleanErr != nil {
+            logger.V(5).Info("failed to clean up job", "name", jobUnstr.GetName(), "error", cleanErr)
+        }
+    }
+
+    return lastErr
+}
+
+// createAndWaitForJob creates a single job in the cluster and waits for its completion.
+// It converts the unstructured resource to a typed Job, sets the namespace, and monitors
+// the job status until completion, failure, or timeout.
+func (r *PreDeployJobRunner) createAndWaitForJob(ctx context.Context, jobUnstr *unstructured.Unstructured) error {
+    logger := logr.FromContextOrDiscard(ctx)
+
+    var job batchv1.Job
+    if err := runtime.DefaultUnstructuredConverter.FromUnstructured(jobUnstr.Object, &job); err != nil {
+        return fmt.Errorf("failed to convert job %q: %w", jobUnstr.GetName(), err)
+    }
+
+    job.Namespace = r.namespace
+    // Clear resource version and status for creation
+    job.ResourceVersion = ""
+    job.Status = batchv1.JobStatus{}
+
+    logger.V(5).Info("creating pre-deploy job", "name", job.Name, "namespace", r.namespace)
+
+    if _, err := r.clientSet.BatchV1().Jobs(r.namespace).Create(ctx, &job, metav1.CreateOptions{}); err != nil {
+        return fmt.Errorf("failed to create job %q: %w", job.Name, err)
+    }
+
+    return r.waitForJobCompletion(ctx, job.Name)
+}
+
+// waitForJobCompletion polls the job status at regular intervals until the job completes,
+// fails, or the configured timeout expires.
+func (r *PreDeployJobRunner) waitForJobCompletion(ctx context.Context, name string) error {
+    logger := logr.FromContextOrDiscard(ctx)
+
+    timeoutCtx, cancel := context.WithTimeout(ctx, r.timeout)
+    defer cancel()
+
+    ticker := time.NewTicker(r.pollInterval)
+    defer ticker.Stop()
+
+    for {
+        select {
+        case <-timeoutCtx.Done():
+            return fmt.Errorf("job %q timed out after %s", name, r.timeout)
+        case <-ticker.C:
+            job, err := r.clientSet.BatchV1().Jobs(r.namespace).Get(timeoutCtx, name, metav1.GetOptions{})
+            if err != nil {
+                return fmt.Errorf("failed to get job %q status: %w", name, err)
+            }
+
+            logger.V(10).Info("polling job status", "name", name, "active", job.Status.Active, "succeeded", job.Status.Succeeded, "failed", job.Status.Failed)
+
+            for _, condition := range job.Status.Conditions {
+                if condition.Type == batchv1.JobComplete && condition.Status == corev1.ConditionTrue {
+                    return nil
+                }
+                if condition.Type == batchv1.JobFailed && condition.Status == corev1.ConditionTrue {
+                    return fmt.Errorf("job %q failed: %s", name, condition.Message)
+                }
+            }
+        }
+    }
+}
+
+// deleteJob removes a job and its associated pods from the cluster using background propagation
+func (r *PreDeployJobRunner) deleteJob(ctx context.Context, name string) error {
+    propagation := metav1.DeletePropagationBackground
+    return r.clientSet.BatchV1().Jobs(r.namespace).Delete(ctx, name, metav1.DeleteOptions{
+        PropagationPolicy: &propagation,
+    })
+}
diff --git a/pkg/cmd/deploy/predeployjob_test.go b/pkg/cmd/deploy/predeployjob_test.go
new file mode 100644
index 0000000..7a829a6
--- /dev/null
+++ b/pkg/cmd/deploy/predeployjob_test.go
@@ -0,0 +1,665 @@
+// Copyright Mia srl
+// SPDX-License-Identifier: Apache-2.0
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package deploy
+
+import (
+    "errors"
+    "strings"
+    "testing"
+    "time"
+
+    "github.com/stretchr/testify/assert"
+    "github.com/stretchr/testify/require"
+    batchv1 "k8s.io/api/batch/v1"
+    corev1 "k8s.io/api/core/v1"
+    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
+    "k8s.io/apimachinery/pkg/runtime"
+    kubefake "k8s.io/client-go/kubernetes/fake"
+    k8stesting "k8s.io/client-go/testing"
+)
+
+// newTestJob creates an unstructured Job resource for testing with the given name and optional
+// pre-deploy annotation value. If annotationValue is empty, no annotation is added.
+func newTestJob(name, annotationValue string) *unstructured.Unstructured {
+    obj := &unstructured.Unstructured{
+        Object: map[string]interface{}{
+            "apiVersion": "batch/v1",
+            "kind":       "Job",
+            "metadata": map[string]interface{}{
+                "name": name,
+            },
+            "spec": map[string]interface{}{
+                "template": map[string]interface{}{
+                    "spec": map[string]interface{}{
+                        "containers": []interface{}{
+                            map[string]interface{}{
+                                "name":  "test",
+                                "image": "busybox",
+                            },
+                        },
+                        "restartPolicy": "Never",
+                    },
+                },
+            },
+        },
+    }
+
+    if annotationValue != "" {
+        obj.SetAnnotations(map[string]string{
+            preDeployAnnotation: annotationValue,
+        })
+    }
+
+    return obj
+}
+
+func TestFilterPreDeployJobs(t *testing.T) {
+    t.Parallel()
+
+    tests := map[string]struct {
+        resources         []*unstructured.Unstructured
+        annotationValue   string
+        expectedJobs      int
+        expectedRemaining int
+    }{
+        "no resources": {
+            resources:         nil,
+            annotationValue:   "pre-deploy",
+            expectedJobs:      0,
+            expectedRemaining: 0,
+        },
+        "no matching jobs": {
+            resources: []*unstructured.Unstructured{
+                newTestJob("job1", ""),
+                {
+                    Object: map[string]interface{}{
+                        "apiVersion": "v1",
+                        "kind":       "ConfigMap",
+                        "metadata":   map[string]interface{}{"name": "cm1"},
+                    },
+                },
+            },
+            annotationValue:   "pre-deploy",
+            expectedJobs:      0,
+            expectedRemaining: 2,
+        },
+        "matching jobs separated from remaining": {
+            resources: []*unstructured.Unstructured{
+                newTestJob("pre-job", "pre-deploy"),
+                newTestJob("normal-job", ""),
+                {
+                    Object: map[string]interface{}{
+                        "apiVersion": "apps/v1",
+                        "kind":       "Deployment",
+                        "metadata":   map[string]interface{}{"name": "deploy1"},
+                    },
+                },
+            },
+            annotationValue:   "pre-deploy",
+            expectedJobs:      1,
+            expectedRemaining: 2,
+        },
+        "non-job resource with annotation not filtered": {
+            resources: []*unstructured.Unstructured{
+                {
+                    Object: map[string]interface{}{
+                        "apiVersion": "v1",
+                        "kind":       "ConfigMap",
+                        "metadata": map[string]interface{}{
+                            "name":        "cm-with-annotation",
+                            "annotations": map[string]interface{}{preDeployAnnotation: "pre-deploy"},
+                        },
+                    },
+                },
+            },
+            annotationValue:   "pre-deploy",
+            expectedJobs:      0,
+            expectedRemaining: 1,
+        },
+        "job with different annotation value not filtered": {
+            resources: []*unstructured.Unstructured{
+                newTestJob("job-once", "once"),
+            },
+            annotationValue:   "pre-deploy",
+            expectedJobs:      0,
+            expectedRemaining: 1,
+        },
+        "multiple matching jobs": {
+            resources: []*unstructured.Unstructured{
+                newTestJob("pre-job-1", "pre-deploy"),
+                newTestJob("pre-job-2", "pre-deploy"),
+                newTestJob("normal-job", ""),
+            },
+            annotationValue:   "pre-deploy",
+            expectedJobs:      2,
+            expectedRemaining: 1,
+        },
+        "custom annotation value": {
+            resources: []*unstructured.Unstructured{
+                newTestJob("custom-job", "custom-value"),
+                newTestJob("pre-job", "pre-deploy"),
+            },
+            annotationValue:   "custom-value",
+            expectedJobs:      1,
+            expectedRemaining: 1,
+        },
+    }
+
+    for name, test := range tests {
+        t.Run(name, func(t *testing.T) {
+            t.Parallel()
+
+            jobs, remaining := FilterPreDeployJobs(test.resources, test.annotationValue)
+            assert.Len(t, jobs, test.expectedJobs)
+            assert.Len(t, remaining, test.expectedRemaining)
+        })
+    }
+}
+
+func TestStripAnnotatedJobs(t *testing.T) {
+    t.Parallel()
+
+    tests := map[string]struct {
+        resources         []*unstructured.Unstructured
+        expectedRemaining int
+    }{
+        "no resources": {
+            resources:         nil,
+            expectedRemaining: 0,
+        },
+        "no annotated jobs": {
+            resources: []*unstructured.Unstructured{
+                newTestJob("normal-job", ""),
+                {
+                    Object: map[string]interface{}{
+                        "apiVersion": "v1",
+                        "kind":       "ConfigMap",
+                        "metadata":   map[string]interface{}{"name": "cm1"},
+                    },
+                },
+            },
+            expectedRemaining: 2,
+        },
+        "strips jobs with any annotation value": {
+            resources: []*unstructured.Unstructured{
+                newTestJob("pre-job", "pre-deploy"),
+                newTestJob("custom-job", "custom-value"),
+                newTestJob("normal-job", ""),
+                {
+                    Object: map[string]interface{}{
+                        "apiVersion": "apps/v1",
+                        "kind":       "Deployment",
+                        "metadata":   map[string]interface{}{"name": "deploy1"},
+                    },
+                },
+            },
+            expectedRemaining: 2,
+        },
+        "non-job resource with annotation is kept": {
+            resources: []*unstructured.Unstructured{
+                {
+                    Object: map[string]interface{}{
+                        "apiVersion": "v1",
+                        "kind":       "ConfigMap",
+                        "metadata": map[string]interface{}{
+                            "name":        "cm-annotated",
+                            "annotations": map[string]interface{}{preDeployAnnotation: "pre-deploy"},
+                        },
+                    },
+                },
+            },
+            expectedRemaining: 1,
+        },
+        "all annotated jobs stripped": {
+            resources: []*unstructured.Unstructured{
+                newTestJob("job-1", "pre-deploy"),
+                newTestJob("job-2", "other"),
+            },
+            expectedRemaining: 0,
+        },
+    }
+
+    for name, test := range tests {
+        t.Run(name, func(t *testing.T) {
+            t.Parallel()
+
+            remaining := StripAnnotatedJobs(test.resources)
+            assert.Len(t, remaining, test.expectedRemaining)
+        })
+    }
+}
+
+func TestPreDeployJobRunnerRun(t *testing.T) {
+    t.Parallel()
+
+    namespace := "test-ns"
+
+    tests := map[string]struct {
+        jobs          []*unstructured.Unstructured
+        setupReactor  func(*kubefake.Clientset)
+        dryRun        bool
+        maxRetries    int
+        expectedError string
+        expectedOut   string
+    }{
+        "no jobs returns nil": {
+            jobs:       nil,
+            maxRetries: 3,
+        },
+        "dry run skips execution": {
+            jobs:        []*unstructured.Unstructured{newTestJob("pre-job", "pre-deploy")},
+            dryRun:      true,
+            maxRetries:  3,
+            expectedOut: "would be executed (dry-run)",
+        },
+        "successful job completion": {
+            jobs: []*unstructured.Unstructured{newTestJob("pre-job", "pre-deploy")},
+            setupReactor: func(cs *kubefake.Clientset) {
+                cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
+                    return true, &batchv1.Job{
+                        ObjectMeta: metav1.ObjectMeta{
+                            Name:      action.(k8stesting.GetAction).GetName(),
+                            Namespace: namespace,
+                        },
+                        Status: batchv1.JobStatus{
+                            Conditions: []batchv1.JobCondition{
+                                {
+                                    Type:   batchv1.JobComplete,
+                                    Status: corev1.ConditionTrue,
+                                },
+                            },
+                        },
+                    }, nil
+                })
+            },
+            maxRetries:  3,
+            expectedOut: "completed successfully",
+        },
+        "all jobs fail returns error": {
+            jobs: []*unstructured.Unstructured{newTestJob("fail-job", "pre-deploy")},
+            setupReactor: func(cs *kubefake.Clientset) {
+                cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
+                    return true, &batchv1.Job{
+                        ObjectMeta: metav1.ObjectMeta{
+                            Name:      action.(k8stesting.GetAction).GetName(),
+                            Namespace: namespace,
+                        },
+                        Status: batchv1.JobStatus{
+                            Conditions: []batchv1.JobCondition{
+                                {
+                                    Type:    batchv1.JobFailed,
+                                    Status:  corev1.ConditionTrue,
+                                    Message: "BackoffLimitExceeded",
+                                },
+                            },
+                        },
+                    }, nil
+                })
+            },
+            maxRetries:    1,
+            expectedError: "all 1 pre-deploy job(s) failed",
+        },
+        "partial success continues deploy": {
+            jobs: []*unstructured.Unstructured{
+                newTestJob("success-job", "pre-deploy"),
+                newTestJob("fail-job", "pre-deploy"),
+            },
+            setupReactor: func(cs *kubefake.Clientset) {
+                cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
+                    getAction := action.(k8stesting.GetAction)
+                    name := getAction.GetName()
+                    if name == "success-job" {
+                        return true, &batchv1.Job{
+                            ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace},
+                            Status: batchv1.JobStatus{
+                                Conditions: []batchv1.JobCondition{
+                                    {Type: batchv1.JobComplete, Status: corev1.ConditionTrue},
+                                },
+                            },
+                        }, nil
+                    }
+                    return true, &batchv1.Job{
+                        ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace},
+                        Status: batchv1.JobStatus{
+                            Conditions: []batchv1.JobCondition{
+                                {Type: batchv1.JobFailed, Status: corev1.ConditionTrue, Message: "failed"},
+                            },
+                        },
+                    }, nil
+                })
+            },
+            maxRetries:  0,
+            expectedOut: "completed successfully",
+        },
+        "multiple jobs all fail": {
+            jobs: []*unstructured.Unstructured{
+                newTestJob("fail-job-1", "pre-deploy"),
+                newTestJob("fail-job-2", "pre-deploy"),
+            },
+            setupReactor: func(cs *kubefake.Clientset) {
+                cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
+                    return true, &batchv1.Job{
+                        ObjectMeta: metav1.ObjectMeta{
+                            Name:      action.(k8stesting.GetAction).GetName(),
+                            Namespace: namespace,
+                        },
+                        Status: batchv1.JobStatus{
+                            Conditions: []batchv1.JobCondition{
+                                {Type: batchv1.JobFailed, Status: corev1.ConditionTrue, Message: "error"},
+                            },
+                        },
+                    }, nil
+                })
+            },
+            maxRetries:    0,
+            expectedError: "all 2 pre-deploy job(s) failed",
+        },
+    }
+
+    for name, test := range tests {
+        t.Run(name, func(t *testing.T) {
+            t.Parallel()
+
+            fakeClientset := kubefake.NewSimpleClientset()
+            if test.setupReactor != nil {
+                test.setupReactor(fakeClientset)
+            }
+
+            writer := new(strings.Builder)
+            runner := NewPreDeployJobRunner(fakeClientset, namespace, test.maxRetries, 5*time.Second, writer, test.dryRun)
+            runner.pollInterval = 10 * time.Millisecond
+
+            err := runner.Run(t.Context(), test.jobs)
+
+            switch len(test.expectedError) {
+            case 0:
+                require.NoError(t, err)
+            default:
+                assert.ErrorContains(t, err, test.expectedError)
+            }
+
+            if test.expectedOut != "" {
+                assert.Contains(t, writer.String(), test.expectedOut)
+            }
+
+            t.Log(writer.String())
+        })
+    }
+}
+
+func TestRunJobWithRetries(t *testing.T) {
+    t.Parallel()
+
+    namespace := "test-ns"
+
+    tests := map[string]struct {
+        maxRetries    int
+        setupReactor  func(*kubefake.Clientset)
+        expectedError string
+    }{
+        "success on first attempt": {
+            maxRetries: 3,
+            setupReactor: func(cs *kubefake.Clientset) {
+                cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
+                    return true, &batchv1.Job{
+                        ObjectMeta: metav1.ObjectMeta{Name: "test-job", Namespace: namespace},
+                        Status: batchv1.JobStatus{
+                            Conditions: []batchv1.JobCondition{
+                                {Type: batchv1.JobComplete, Status: corev1.ConditionTrue},
+                            },
+                        },
+                    }, nil
+                })
+            },
+        },
+        "success after retry": {
+            maxRetries: 3,
+            setupReactor: func(cs *kubefake.Clientset) {
+                callCount := 0
+                cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) {
+                    callCount++
+                    if callCount <= 1 {
+                        return true,
&batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: "test-job", Namespace: namespace}, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobFailed, Status: corev1.ConditionTrue, Message: "failed"}, + }, + }, + }, nil + } + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: "test-job", Namespace: namespace}, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobComplete, Status: corev1.ConditionTrue}, + }, + }, + }, nil + }) + }, + }, + "all retries exhausted": { + maxRetries: 2, + setupReactor: func(cs *kubefake.Clientset) { + cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: "test-job", Namespace: namespace}, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobFailed, Status: corev1.ConditionTrue, Message: "BackoffLimitExceeded"}, + }, + }, + }, nil + }) + }, + expectedError: `job "test-job" failed: BackoffLimitExceeded`, + }, + } + + for name, test := range tests { + t.Run(name, func(t *testing.T) { + t.Parallel() + + fakeClientset := kubefake.NewSimpleClientset() + test.setupReactor(fakeClientset) + + writer := new(strings.Builder) + runner := NewPreDeployJobRunner(fakeClientset, namespace, test.maxRetries, 5*time.Second, writer, false) + runner.pollInterval = 10 * time.Millisecond + + job := newTestJob("test-job", "pre-deploy") + err := runner.runJobWithRetries(t.Context(), job) + + switch len(test.expectedError) { + case 0: + require.NoError(t, err) + default: + assert.ErrorContains(t, err, test.expectedError) + } + + t.Log(writer.String()) + }) + } +} + +func TestWaitForJobCompletion(t *testing.T) { + t.Parallel() + + namespace := "test-ns" + + tests := map[string]struct { + setupReactor func(*kubefake.Clientset) + timeout time.Duration + expectedError string + }{ + "job completes successfully": { + setupReactor: func(cs 
*kubefake.Clientset) { + cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: "test-job", Namespace: namespace}, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobComplete, Status: corev1.ConditionTrue}, + }, + }, + }, nil + }) + }, + timeout: 5 * time.Second, + }, + "job fails": { + setupReactor: func(cs *kubefake.Clientset) { + cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: "test-job", Namespace: namespace}, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobFailed, Status: corev1.ConditionTrue, Message: "container crashed"}, + }, + }, + }, nil + }) + }, + timeout: 5 * time.Second, + expectedError: `job "test-job" failed: container crashed`, + }, + "job times out": { + setupReactor: func(cs *kubefake.Clientset) { + cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: "test-job", Namespace: namespace}, + }, nil + }) + }, + timeout: 100 * time.Millisecond, + expectedError: `job "test-job" timed out`, + }, + "get job returns error": { + setupReactor: func(cs *kubefake.Clientset) { + cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, errors.New("connection refused") + }) + }, + timeout: 5 * time.Second, + expectedError: `failed to get job "test-job" status`, + }, + } + + for name, test := range tests { + t.Run(name, func(t *testing.T) { + t.Parallel() + + fakeClientset := kubefake.NewSimpleClientset() + test.setupReactor(fakeClientset) + + writer := new(strings.Builder) + runner := NewPreDeployJobRunner(fakeClientset, namespace, 3, test.timeout, writer, false) + runner.pollInterval = 10 * 
time.Millisecond + + err := runner.waitForJobCompletion(t.Context(), "test-job") + + switch len(test.expectedError) { + case 0: + require.NoError(t, err) + default: + assert.ErrorContains(t, err, test.expectedError) + } + }) + } +} + +func TestCreateAndWaitForJob(t *testing.T) { + t.Parallel() + + namespace := "test-ns" + + tests := map[string]struct { + job *unstructured.Unstructured + setupReactor func(*kubefake.Clientset) + expectedError string + }{ + "successful creation and completion": { + job: newTestJob("test-job", "pre-deploy"), + setupReactor: func(cs *kubefake.Clientset) { + cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: "test-job", Namespace: namespace}, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobComplete, Status: corev1.ConditionTrue}, + }, + }, + }, nil + }) + }, + }, + "creation failure": { + job: newTestJob("test-job", "pre-deploy"), + setupReactor: func(cs *kubefake.Clientset) { + cs.PrependReactor("create", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, nil, errors.New("forbidden") + }) + }, + expectedError: `failed to create job "test-job"`, + }, + } + + for name, test := range tests { + t.Run(name, func(t *testing.T) { + t.Parallel() + + fakeClientset := kubefake.NewSimpleClientset() + test.setupReactor(fakeClientset) + + writer := new(strings.Builder) + runner := NewPreDeployJobRunner(fakeClientset, namespace, 3, 5*time.Second, writer, false) + runner.pollInterval = 10 * time.Millisecond + + err := runner.createAndWaitForJob(t.Context(), test.job) + + switch len(test.expectedError) { + case 0: + require.NoError(t, err) + default: + assert.ErrorContains(t, err, test.expectedError) + } + }) + } +} + +func TestDeleteJob(t *testing.T) { + t.Parallel() + + namespace := "test-ns" + fakeClientset := kubefake.NewSimpleClientset(&batchv1.Job{ + ObjectMeta: 
metav1.ObjectMeta{ + Name: "test-job", + Namespace: namespace, + }, + }) + + writer := new(strings.Builder) + runner := NewPreDeployJobRunner(fakeClientset, namespace, 3, 30*time.Second, writer, false) + + err := runner.deleteJob(t.Context(), "test-job") + require.NoError(t, err) + + // Verify job is deleted + _, err = fakeClientset.BatchV1().Jobs(namespace).Get(t.Context(), "test-job", metav1.GetOptions{}) + assert.Error(t, err) +} diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-custom-annotation.yaml b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-custom-annotation.yaml deleted file mode 100644 index 2727fff..0000000 --- a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-custom-annotation.yaml +++ /dev/null @@ -1,18 +0,0 @@ -apiVersion: batch/v1 -kind: Job -metadata: - name: pre-deploy-custom-migration - annotations: - mia-platform.eu/deploy: custom-pre-deploy -spec: - template: - metadata: {} - spec: - containers: - - name: migration - image: busybox - args: - - /bin/sh - - -c - - echo "running custom annotation migration" - restartPolicy: Never diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml deleted file mode 100644 index e4a6411..0000000 --- a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml +++ /dev/null @@ -1,20 +0,0 @@ -apiVersion: batch/v1 -kind: Job -metadata: - name: pre-deploy-optional-seed-fail - annotations: - mia-platform.eu/deploy: pre-deploy - mia-platform.eu/deploy-optional: "true" -spec: - backoffLimit: 0 - template: - metadata: {} - spec: - containers: - - name: seed - image: busybox - args: - - /bin/sh - - -c - - echo "optional seed failed!" 
&& exit 1 - restartPolicy: Never diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml deleted file mode 100644 index d080170..0000000 --- a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml +++ /dev/null @@ -1,20 +0,0 @@ -apiVersion: batch/v1 -kind: Job -metadata: - name: pre-deploy-optional-seed - annotations: - mia-platform.eu/deploy: pre-deploy - mia-platform.eu/deploy-optional: "true" -spec: - backoffLimit: 0 - template: - metadata: {} - spec: - containers: - - name: seed - image: busybox - args: - - /bin/sh - - -c - - echo "optional seed completed" - restartPolicy: Never diff --git a/pkg/extensions/predeployfilter.go b/pkg/extensions/predeployfilter.go deleted file mode 100644 index e4a0d0f..0000000 --- a/pkg/extensions/predeployfilter.go +++ /dev/null @@ -1,106 +0,0 @@ -// Copyright Mia srl -// SPDX-License-Identifier: Apache-2.0 -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. - -package extensions - -import ( - "reflect" - - "github.com/mia-platform/jpl/pkg/client/cache" - "github.com/mia-platform/jpl/pkg/filter" - batchv1 "k8s.io/api/batch/v1" - "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" - "k8s.io/apimachinery/pkg/runtime/schema" -) - -const ( - // PreDeployFilterDefaultValue is the default value for the deploy annotation that identifies - // a Job as a pre-deploy job. It can be overridden via the --pre-deploy-job-annotation CLI flag. 
- PreDeployFilterDefaultValue = "pre-deploy" - - // preDeployOptionalAnnotation is the annotation used to mark a pre-deploy job as optional. - // If set to "true", a failure of this job will not block the deploy pipeline. - preDeployOptionalAnnotation = miaPlatformPrefix + "deploy-optional" -) - -var ( - jobGK = schema.GroupKind{Group: batchv1.SchemeGroupVersion.Group, Kind: reflect.TypeOf(batchv1.Job{}).Name()} -) - -// preDeployFilter will implement a filter that will remove a Job if it has a matching -// value in the deployFilterAnnotation. These jobs are meant to be executed before -// the main deploy and are handled separately. -type preDeployFilter struct { - annotationValue string -} - -// NewPreDeployFilter return a new filter for intercepting pre-deploy jobs and removing them -// from the main deploy pipeline. The annotationValue parameter specifies the value that the -// deploy annotation must match to identify a pre-deploy job. -func NewPreDeployFilter(annotationValue string) filter.Interface { - return &preDeployFilter{annotationValue: annotationValue} -} - -// Filter implement filter.Interface interface -func (f *preDeployFilter) Filter(obj *unstructured.Unstructured, _ cache.RemoteResourceGetter) (bool, error) { - return IsPreDeployJob(obj, f.annotationValue), nil -} - -// IsPreDeployJob return true if the object is a Job annotated for pre-deploy execution. -// The annotationValue parameter specifies the value that the deploy annotation must match. -func IsPreDeployJob(obj *unstructured.Unstructured, annotationValue string) bool { - if obj.GroupVersionKind().GroupKind() != jobGK { - return false - } - - annotations := obj.GetAnnotations() - if annotations == nil { - return false - } - - return annotations[deployFilterAnnotation] == annotationValue -} - -// IsOptionalPreDeployJob return true if the object has the optional annotation set to "true". -// Optional pre-deploy jobs do not block the deploy pipeline if they fail. 
-func IsOptionalPreDeployJob(obj *unstructured.Unstructured) bool { - annotations := obj.GetAnnotations() - if annotations == nil { - return false - } - - return annotations[preDeployOptionalAnnotation] == "true" -} - -// ExtractPreDeployJobs separates pre-deploy jobs from the resources list, returning -// the pre-deploy jobs and the remaining resources separately. The annotationValue parameter -// specifies the value that the deploy annotation must match to identify a pre-deploy job. -func ExtractPreDeployJobs(resources []*unstructured.Unstructured, annotationValue string) ([]*unstructured.Unstructured, []*unstructured.Unstructured) { - preDeployJobs := make([]*unstructured.Unstructured, 0) - remaining := make([]*unstructured.Unstructured, 0, len(resources)) - - for _, res := range resources { - if IsPreDeployJob(res, annotationValue) { - preDeployJobs = append(preDeployJobs, res) - } else { - remaining = append(remaining, res) - } - } - - return preDeployJobs, remaining -} - -// keep it to always check if preDeployFilter implement correctly the filter.Interface interface -var _ filter.Interface = &preDeployFilter{} diff --git a/pkg/extensions/predeployfilter_test.go b/pkg/extensions/predeployfilter_test.go deleted file mode 100644 index 4149332..0000000 --- a/pkg/extensions/predeployfilter_test.go +++ /dev/null @@ -1,292 +0,0 @@ -// Copyright Mia srl -// SPDX-License-Identifier: Apache-2.0 -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. 
- -package extensions - -import ( - "path/filepath" - "testing" - - jpltesting "github.com/mia-platform/jpl/pkg/testing" - "github.com/stretchr/testify/assert" - "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" -) - -func TestPreDeployFilter(t *testing.T) { - t.Parallel() - - testdata := filepath.Join("testdata", "filter") - preDeployJob := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "pre-deploy-job.yaml")) - regularJob := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "regular-job.yaml")) - - tests := map[string]struct { - object *unstructured.Unstructured - annotationValue string - expected bool - }{ - "no filtering for deployment": { - object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "deployment.yaml")), - annotationValue: PreDeployFilterDefaultValue, - expected: false, - }, - "no filtering for configmap": { - object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "configmap.yaml")), - annotationValue: PreDeployFilterDefaultValue, - expected: false, - }, - "no filtering for secret": { - object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "secret.yaml")), - annotationValue: PreDeployFilterDefaultValue, - expected: false, - }, - "no filtering for job without pre-deploy annotation": { - object: regularJob, - annotationValue: PreDeployFilterDefaultValue, - expected: false, - }, - "filtering for job with pre-deploy annotation": { - object: preDeployJob, - annotationValue: PreDeployFilterDefaultValue, - expected: true, - }, - "no filtering when annotation value does not match": { - object: preDeployJob, - annotationValue: "custom-value", - expected: false, - }, - } - - for name, test := range tests { - t.Run(name, func(t *testing.T) { - t.Parallel() - - filter := NewPreDeployFilter(test.annotationValue) - filtered, err := filter.Filter(test.object, nil) - assert.NoError(t, err) - assert.Equal(t, test.expected, filtered) - }) - } -} - -func TestIsPreDeployJob(t *testing.T) { - t.Parallel() - - 
testdata := filepath.Join("testdata", "filter") - - tests := map[string]struct { - object *unstructured.Unstructured - annotationValue string - expected bool - }{ - "deployment is not a pre-deploy job": { - object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "deployment.yaml")), - annotationValue: PreDeployFilterDefaultValue, - expected: false, - }, - "regular job is not a pre-deploy job": { - object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "regular-job.yaml")), - annotationValue: PreDeployFilterDefaultValue, - expected: false, - }, - "job with pre-deploy annotation is a pre-deploy job": { - object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "pre-deploy-job.yaml")), - annotationValue: PreDeployFilterDefaultValue, - expected: true, - }, - "job with custom annotation value matches": { - object: &unstructured.Unstructured{ - Object: map[string]interface{}{ - "apiVersion": "batch/v1", - "kind": "Job", - "metadata": map[string]interface{}{ - "name": "custom-job", - "annotations": map[string]interface{}{ - "mia-platform.eu/deploy": "custom-pre-deploy", - }, - }, - }, - }, - annotationValue: "custom-pre-deploy", - expected: true, - }, - "job with default annotation does not match custom value": { - object: jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "pre-deploy-job.yaml")), - annotationValue: "custom-pre-deploy", - expected: false, - }, - } - - for name, test := range tests { - t.Run(name, func(t *testing.T) { - t.Parallel() - - assert.Equal(t, test.expected, IsPreDeployJob(test.object, test.annotationValue)) - }) - } -} - -func TestIsOptionalPreDeployJob(t *testing.T) { - t.Parallel() - - tests := map[string]struct { - object *unstructured.Unstructured - expected bool - }{ - "job without optional annotation is not optional": { - object: &unstructured.Unstructured{ - Object: map[string]interface{}{ - "apiVersion": "batch/v1", - "kind": "Job", - "metadata": map[string]interface{}{ - "name": "migration", - 
"annotations": map[string]interface{}{ - "mia-platform.eu/deploy": "pre-deploy", - }, - }, - }, - }, - expected: false, - }, - "job with optional annotation set to true is optional": { - object: &unstructured.Unstructured{ - Object: map[string]interface{}{ - "apiVersion": "batch/v1", - "kind": "Job", - "metadata": map[string]interface{}{ - "name": "migration", - "annotations": map[string]interface{}{ - "mia-platform.eu/deploy": "pre-deploy", - "mia-platform.eu/deploy-optional": "true", - }, - }, - }, - }, - expected: true, - }, - "job with optional annotation set to false is not optional": { - object: &unstructured.Unstructured{ - Object: map[string]interface{}{ - "apiVersion": "batch/v1", - "kind": "Job", - "metadata": map[string]interface{}{ - "name": "migration", - "annotations": map[string]interface{}{ - "mia-platform.eu/deploy": "pre-deploy", - "mia-platform.eu/deploy-optional": "false", - }, - }, - }, - }, - expected: false, - }, - "job without annotations is not optional": { - object: &unstructured.Unstructured{ - Object: map[string]interface{}{ - "apiVersion": "batch/v1", - "kind": "Job", - "metadata": map[string]interface{}{ - "name": "migration", - }, - }, - }, - expected: false, - }, - } - - for name, test := range tests { - t.Run(name, func(t *testing.T) { - t.Parallel() - - assert.Equal(t, test.expected, IsOptionalPreDeployJob(test.object)) - }) - } -} - -func TestExtractPreDeployJobs(t *testing.T) { - t.Parallel() - - testdata := filepath.Join("testdata", "filter") - preDeployJob := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "pre-deploy-job.yaml")) - regularJob := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "regular-job.yaml")) - deployment := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "deployment.yaml")) - configmap := jpltesting.UnstructuredFromFile(t, filepath.Join(testdata, "configmap.yaml")) - - customJob := &unstructured.Unstructured{ - Object: map[string]interface{}{ - "apiVersion": "batch/v1", - 
"kind": "Job", - "metadata": map[string]interface{}{ - "name": "custom-job", - "annotations": map[string]interface{}{ - "mia-platform.eu/deploy": "custom-value", - }, - }, - }, - } - - tests := map[string]struct { - resources []*unstructured.Unstructured - annotationValue string - expectedPreDeployJobs int - expectedRemaining int - }{ - "empty resources": { - resources: []*unstructured.Unstructured{}, - annotationValue: PreDeployFilterDefaultValue, - expectedPreDeployJobs: 0, - expectedRemaining: 0, - }, - "no pre-deploy jobs": { - resources: []*unstructured.Unstructured{deployment, configmap, regularJob}, - annotationValue: PreDeployFilterDefaultValue, - expectedPreDeployJobs: 0, - expectedRemaining: 3, - }, - "only pre-deploy jobs": { - resources: []*unstructured.Unstructured{preDeployJob}, - annotationValue: PreDeployFilterDefaultValue, - expectedPreDeployJobs: 1, - expectedRemaining: 0, - }, - "mixed resources with pre-deploy jobs": { - resources: []*unstructured.Unstructured{deployment, preDeployJob, configmap, regularJob}, - annotationValue: PreDeployFilterDefaultValue, - expectedPreDeployJobs: 1, - expectedRemaining: 3, - }, - "custom annotation value extracts correct jobs": { - resources: []*unstructured.Unstructured{deployment, preDeployJob, customJob, regularJob}, - annotationValue: "custom-value", - expectedPreDeployJobs: 1, - expectedRemaining: 3, - }, - "default annotation value does not match custom jobs": { - resources: []*unstructured.Unstructured{customJob, regularJob}, - annotationValue: PreDeployFilterDefaultValue, - expectedPreDeployJobs: 0, - expectedRemaining: 2, - }, - } - - for name, test := range tests { - t.Run(name, func(t *testing.T) { - t.Parallel() - - preDeployJobs, remaining := ExtractPreDeployJobs(test.resources, test.annotationValue) - assert.Len(t, preDeployJobs, test.expectedPreDeployJobs) - assert.Len(t, remaining, test.expectedRemaining) - }) - } -} From 3f65949fe1b83cb71a0431f402ce3ab0eeac5037 Mon Sep 17 00:00:00 2001 From: 
Davide Chiarelli Date: Fri, 6 Mar 2026 12:56:50 +0100 Subject: [PATCH 4/7] feat(pre-deploy): enhance pre-deploy job handling with optional jobs and improved error reporting --- docs/60_pre_deploy.md | 127 +++++++----------- pkg/cmd/deploy/predeployjob.go | 43 +++--- pkg/cmd/deploy/predeployjob_test.go | 127 +++++++++++++++++- .../pre-deploy-job-optional-failed.yaml | 21 +++ .../pre-deploy/pre-deploy-job-optional.yaml | 19 +++ 5 files changed, 229 insertions(+), 108 deletions(-) create mode 100644 pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml create mode 100644 pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml diff --git a/docs/60_pre_deploy.md b/docs/60_pre_deploy.md index 7ca6825..662ccee 100644 --- a/docs/60_pre_deploy.md +++ b/docs/60_pre_deploy.md @@ -1,121 +1,86 @@ # Pre-Deploy Jobs -The `deploy` command now supports executing Jobs marked as pre-deploy jobs before the main deployment pipeline. -This feature allows you to run initialization or validation tasks (such as database migrations, secret provisioning, -or health checks) prior to deploying your main workloads. +`mlp` supports the execution of Kubernetes `Job` resources before the main deploy phase. These pre-deploy jobs +can be used to run database migrations, data transformations, or any other task that must complete successfully +before the rest of the resources are applied to the cluster. -## Identifying Pre-Deploy Jobs +## How It Works -A Job is identified as a pre-deploy job by annotating it with the `mia-platform.eu/deploy` annotation and setting -its value to `pre-deploy`. This annotation value can be customized via the `--pre-deploy-job-annotation` CLI flag. +Pre-deploy jobs are identified by the annotation `mia-platform.eu/deploy` on a `Job` resource. When the +`--pre-deploy-job-annotation` flag is provided to the `deploy` command, `mlp` will scan the resources and +separate all `Job` resources whose annotation value matches the one provided. 
-Example Job manifest: +These jobs are executed before the remaining resources are applied. If the flag is not provided, any `Job` +resource carrying the `mia-platform.eu/deploy` annotation will be stripped from the resource list and not +applied at all. + +## Annotating a Job + +To mark a `Job` as a pre-deploy job, add the `mia-platform.eu/deploy` annotation with the desired value: ```yaml apiVersion: batch/v1 kind: Job metadata: - name: db-migration-job + name: db-migration annotations: mia-platform.eu/deploy: pre-deploy spec: template: spec: - containers: - - name: migrator - image: migrations:latest restartPolicy: Never + containers: + - name: migrate + image: my-app:latest + command: ["./migrate"] ``` -## Pre-Deploy Job Execution Flow - -When the `deploy` command runs, it performs the following steps: - -1. **Extraction**: Pre-deploy jobs are extracted from the resource list based on the configured annotation. -2. **Execution**: Each pre-deploy job is applied to the cluster in sequence. -3. **Monitoring**: The command waits for each job to complete with a configurable timeout. -4. **Retry Logic**: If a job fails, the command automatically retries the execution up to a maximum number of attempts. -5. **Main Deployment**: After all pre-deploy jobs successfully complete, the main deployment pipeline proceeds. -6. **Filtering**: Pre-deploy jobs are filtered out and do not participate in the main deployment phase. - -## Configuration - -The behavior of pre-deploy jobs can be configured via the following CLI flags: - -### `--pre-deploy-job-timeout` - -The maximum duration to wait for a single pre-deploy job to complete. - -- **Type**: Duration -- **Default**: `30s` -- **Usage**: `mlp deploy --pre-deploy-job-timeout 2m` - -Non-zero values should contain a corresponding time unit (e.g., `1s`, `2m`, `3h`). 
- -### `--pre-deploy-job-max-retries` +Then pass the matching value to the deploy command: -The maximum number of attempts for executing a pre-deploy job before aborting the deploy. - -- **Type**: Integer -- **Default**: `3` -- **Usage**: `mlp deploy --pre-deploy-job-max-retries 5` - -### `--pre-deploy-job-annotation` - -The value of the `mia-platform.eu/deploy` annotation used to identify pre-deploy jobs. -This allows you to customize the annotation value used to mark jobs as pre-deploy jobs. - -- **Type**: String -- **Default**: `pre-deploy` -- **Usage**: `mlp deploy --pre-deploy-job-annotation init-job` - -## Optional Pre-Deploy Jobs +```sh +mlp deploy --pre-deploy-job-annotation pre-deploy ... +``` -Pre-deploy jobs can be marked as optional using the `mia-platform.eu/deploy-optional` annotation set to `true`. -Optional jobs that fail will log a warning but will not block the main deployment pipeline. +## Optional Jobs -Example: +A pre-deploy job can be marked as optional by adding the annotation `mia-platform.eu/deploy-optional: "true"`. +Optional jobs are non-blocking: if they fail, the failure is logged as a warning and the deploy process +continues normally. Mandatory jobs (those without the optional annotation) will block and fail the deploy if +they cannot complete successfully. ```yaml apiVersion: batch/v1 kind: Job metadata: - name: optional-config-job + name: optional-cleanup annotations: mia-platform.eu/deploy: pre-deploy mia-platform.eu/deploy-optional: "true" spec: template: spec: - containers: - - name: config - image: config-tool:latest restartPolicy: Never + containers: + - name: cleanup + image: my-app:latest + command: ["./cleanup"] ``` -## Error Handling - -### Mandatory Pre-Deploy Job Failure - -If a mandatory pre-deploy job fails after all retry attempts are exhausted, the deploy process is aborted -and an error is returned. This ensures that critical initialization tasks are completed before proceeding -with the main deployment. 
- -### Optional Pre-Deploy Job Failure - -If an optional pre-deploy job fails, a warning is logged and the deploy process continues. -This allows for graceful handling of non-critical pre-deployment tasks. - -## Dry-Run Mode +## Retry and Timeout -When using the `--dry-run` flag, pre-deploy jobs are applied to the cluster with the dry-run option enabled. -In this mode, job status polling is skipped, and the command does not wait for job completion. +Each pre-deploy job is retried automatically on failure. Before each retry the failed job is deleted from +the cluster so that a fresh instance can be created. The number of retries and the per-execution timeout +can be controlled via dedicated flags: -## Polling Behavior +| Flag | Default | Description | +|---|---|---| +| `--pre-deploy-job-annotation` | _(empty)_ | Annotation value used to identify pre-deploy jobs | +| `--pre-deploy-job-max-retries` | `3` | Maximum number of retry attempts for a failed job | +| `--pre-deploy-job-timeout` | `30s` | Timeout for a single job execution attempt | -The command polls the status of each pre-deploy job every 5 seconds until: +If a job exceeds the configured timeout it is considered failed and the retry logic applies as normal. -- The job completes successfully -- The job exceeds the configured timeout -- The maximum number of retries is reached +## Dry Run +When the `--dry-run` flag is active, no jobs are created on the cluster. Instead, `mlp` will print a message +for each job that would have been executed, allowing you to verify the configuration without side effects. 
diff --git a/pkg/cmd/deploy/predeployjob.go b/pkg/cmd/deploy/predeployjob.go index 7f650c0..d102b41 100644 --- a/pkg/cmd/deploy/predeployjob.go +++ b/pkg/cmd/deploy/predeployjob.go @@ -19,7 +19,6 @@ import ( "context" "fmt" "io" - "strings" "time" "github.com/go-logr/logr" @@ -32,8 +31,9 @@ import ( ) const ( - preDeployAnnotation = "mia-platform.eu/deploy" - jobKind = "Job" + preDeployAnnotation = "mia-platform.eu/deploy" + deployOptionalAnnotation = "mia-platform.eu/deploy-optional" + jobKind = "Job" ) // PreDeployJobRunner handles the creation, execution and monitoring of pre-deploy jobs @@ -101,9 +101,16 @@ func StripAnnotatedJobs(resources []*unstructured.Unstructured) []*unstructured. return remaining } +// isOptionalPreDeployJob reports whether the job carries the deploy-optional annotation set to "true". +func isOptionalPreDeployJob(job *unstructured.Unstructured) bool { + return job.GetAnnotations()[deployOptionalAnnotation] == "true" +} + // Run executes all pre-deploy jobs with retry and timeout support. Each job is retried -// up to maxRetries times upon failure. An error is returned only if ALL annotated jobs fail; -// if at least one succeeds, it returns nil and the deploy process can continue. +// up to maxRetries times upon failure. Jobs annotated with mia-platform.eu/deploy-optional=true +// are treated as non-blocking: their failure is logged as a warning and the deploy continues. +// A mandatory job failure aborts execution immediately: an error is returned and any +// remaining jobs are not run.
func (r *PreDeployJobRunner) Run(ctx context.Context, jobs []*unstructured.Unstructured) error { logger := logr.FromContextOrDiscard(ctx) @@ -121,31 +128,25 @@ func (r *PreDeployJobRunner) Run(ctx context.Context, jobs []*unstructured.Unstr logger.V(3).Info("starting pre-deploy jobs", "count", len(jobs)) - var failedJobs []string - successCount := 0 - for _, job := range jobs { jobName := job.GetName() + optional := isOptionalPreDeployJob(job) + err := r.runJobWithRetries(ctx, job) if err != nil { - fmt.Fprintf(r.writer, "pre-deploy job %q failed after %d retries: %s\n", jobName, r.maxRetries, err) - failedJobs = append(failedJobs, jobName) + if optional { + logger.V(3).Info("optional pre-deploy job failed, continuing", "name", jobName, "error", err) + fmt.Fprintf(r.writer, "optional pre-deploy job %q failed, continuing\n", jobName) + } else { + logger.V(3).Info("pre-deploy job failed", "name", jobName, "error", err) + return fmt.Errorf("pre-deploy job %q failed after %d retries: %w", jobName, r.maxRetries, err) + } } else { fmt.Fprintf(r.writer, "pre-deploy job %q completed successfully\n", jobName) - successCount++ - } - } - - if successCount == 0 { - builder := new(strings.Builder) - fmt.Fprintf(builder, "all %d pre-deploy job(s) failed:\n", len(failedJobs)) - for _, name := range failedJobs { - fmt.Fprintf(builder, "\t- %s\n", name) } - return fmt.Errorf("%s", builder.String()) } - logger.V(3).Info("pre-deploy jobs completed", "succeeded", successCount, "failed", len(failedJobs)) + logger.V(3).Info("pre-deploy jobs completed") return nil } diff --git a/pkg/cmd/deploy/predeployjob_test.go b/pkg/cmd/deploy/predeployjob_test.go index 7a829a6..ebc2fec 100644 --- a/pkg/cmd/deploy/predeployjob_test.go +++ b/pkg/cmd/deploy/predeployjob_test.go @@ -67,6 +67,15 @@ func newTestJob(name, annotationValue string) *unstructured.Unstructured { return obj } +// newOptionalTestJob creates a pre-deploy Job with the deploy-optional annotation set to "true".
+func newOptionalTestJob(name string) *unstructured.Unstructured { + job := newTestJob(name, "pre-deploy") + annotations := job.GetAnnotations() + annotations[deployOptionalAnnotation] = "true" + job.SetAnnotations(annotations) + return job +} + func TestFilterPreDeployJobs(t *testing.T) { t.Parallel() @@ -311,9 +320,9 @@ func TestPreDeployJobRunnerRun(t *testing.T) { }) }, maxRetries: 1, - expectedError: "all 1 pre-deploy job(s) failed", + expectedError: "pre-deploy job \"fail-job\" failed after 1 retries:", }, - "partial success continues deploy": { + "partial success stops at first failure": { jobs: []*unstructured.Unstructured{ newTestJob("success-job", "pre-deploy"), newTestJob("fail-job", "pre-deploy"), }, @@ -342,10 +351,11 @@ }, nil }) }, - maxRetries: 0, - expectedOut: "completed successfully", + maxRetries: 0, + expectedOut: "completed successfully", + expectedError: "pre-deploy job \"fail-job\" failed after 0 retries:", - "multiple jobs all fail": { + "multiple jobs stop at first failure": { jobs: []*unstructured.Unstructured{ newTestJob("fail-job-1", "pre-deploy"), newTestJob("fail-job-2", "pre-deploy"), }, @@ -366,7 +376,112 @@ }) }, maxRetries: 0, - expectedError: "all 2 pre-deploy job(s) failed", + expectedError: "pre-deploy job \"fail-job-1\" failed after 0 retries:", + }, + "optional job failure does not block deploy": { + jobs: []*unstructured.Unstructured{newOptionalTestJob("optional-job")}, + setupReactor: func(cs *kubefake.Clientset) { + cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{ + Name: action.(k8stesting.GetAction).GetName(), + Namespace: namespace, + }, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobFailed, Status: corev1.ConditionTrue, Message: "migration failed"}, + }, + }, + }, nil + }) + }, + 
maxRetries: 0, + expectedOut: "optional pre-deploy job", + }, + "optional job failure with mandatory success": { + jobs: []*unstructured.Unstructured{ + newTestJob("mandatory-job", "pre-deploy"), + newOptionalTestJob("optional-job"), + }, + setupReactor: func(cs *kubefake.Clientset) { + cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + name := action.(k8stesting.GetAction).GetName() + if name == "mandatory-job" { + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace}, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobComplete, Status: corev1.ConditionTrue}, + }, + }, + }, nil + } + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace}, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobFailed, Status: corev1.ConditionTrue, Message: "optional failed"}, + }, + }, + }, nil + }) + }, + maxRetries: 0, + expectedOut: "completed successfully", + }, + "mandatory job fails while optional succeeds": { + jobs: []*unstructured.Unstructured{ + newTestJob("mandatory-job", "pre-deploy"), + newOptionalTestJob("optional-job"), + }, + setupReactor: func(cs *kubefake.Clientset) { + cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + name := action.(k8stesting.GetAction).GetName() + if name == "optional-job" { + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace}, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobComplete, Status: corev1.ConditionTrue}, + }, + }, + }, nil + } + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace}, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobFailed, Status: corev1.ConditionTrue, Message: "mandatory failed"}, + }, + }, + }, nil + }) + }, + maxRetries: 0, 
+ expectedError: "pre-deploy job \"mandatory-job\" failed after 0 retries:", + }, + "all optional jobs fail": { + jobs: []*unstructured.Unstructured{ + newOptionalTestJob("optional-job-1"), + newOptionalTestJob("optional-job-2"), + }, + setupReactor: func(cs *kubefake.Clientset) { + cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{ + Name: action.(k8stesting.GetAction).GetName(), + Namespace: namespace, + }, + Status: batchv1.JobStatus{ + Conditions: []batchv1.JobCondition{ + {Type: batchv1.JobFailed, Status: corev1.ConditionTrue, Message: "failed"}, + }, + }, + }, nil + }) + }, + maxRetries: 0, + expectedOut: "optional pre-deploy job", + }, } diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml new file mode 100644 index 0000000..f5f271d --- /dev/null +++ b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml @@ -0,0 +1,21 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: pre-deploy-migration-fail + annotations: + mia-platform.eu/deploy: pre-deploy + mia-platform.eu/deploy-optional: 'true' +spec: + # backoffLimit is also enforced to 0 by mlp at apply time + backoffLimit: 0 + template: + metadata: {} + spec: + containers: + - name: migration + image: busybox + args: + - /bin/sh + - -c + - echo "migration failed!" 
&& exit 1 + restartPolicy: Never diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml new file mode 100644 index 0000000..379d965 --- /dev/null +++ b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml @@ -0,0 +1,19 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: pre-deploy-optional + annotations: + mia-platform.eu/deploy: pre-deploy + mia-platform.eu/deploy-optional: 'true' +spec: + template: + metadata: {} + spec: + containers: + - name: migration + image: busybox + args: + - /bin/sh + - '-c' + - echo "running migration" && exit $(( RANDOM % 2 )) + restartPolicy: Never From da34c9e1d8d4d0d8f5563a3cbd0f0b9e3009adf3 Mon Sep 17 00:00:00 2001 From: Davide Chiarelli Date: Fri, 6 Mar 2026 16:34:15 +0100 Subject: [PATCH 5/7] fix(pre-deploy): update runPreDeployPhase logic to handle empty job matches correctly (skip all resources) --- pkg/cmd/deploy/deploy.go | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/pkg/cmd/deploy/deploy.go b/pkg/cmd/deploy/deploy.go index b78cb26..fe5f4c2 100644 --- a/pkg/cmd/deploy/deploy.go +++ b/pkg/cmd/deploy/deploy.go @@ -346,8 +346,10 @@ func formatApplyErrors(errs []error) string { // runPreDeployPhase handles pre-deploy job filtering and execution. When the pre-deploy // annotation flag is set, matching jobs are extracted and executed; if any are found the -// caller should stop the normal apply phase (stop=true). When the flag is absent, annotated -// jobs are stripped from the resource list so they are never applied as regular resources. +// caller should stop the normal apply phase (stop=true). If the flag is set but no jobs +// match the filter, nothing is applied (stop=true with empty resources). When the flag is +// absent, annotated jobs are stripped from the resource list so they are never applied as +// regular resources. 
func (o *Options) runPreDeployPhase(ctx context.Context, namespace string, resources []*unstructured.Unstructured) ([]*unstructured.Unstructured, bool, error) { if o.preDeployJobAnnotation == "" { return StripAnnotatedJobs(resources), false, nil @@ -355,7 +357,7 @@ func (o *Options) runPreDeployPhase(ctx context.Context, namespace string, resou preDeployJobs, remaining := FilterPreDeployJobs(resources, o.preDeployJobAnnotation) if len(preDeployJobs) == 0 { - return resources, false, nil + return nil, true, nil } clientSet, err := o.clientFactory.KubernetesClientSet() From 3549a0f175f8f1e9654ff7170dee2dec6c81f639 Mon Sep 17 00:00:00 2001 From: Davide Chiarelli Date: Fri, 6 Mar 2026 18:15:29 +0100 Subject: [PATCH 6/7] fix(pre-deploy): improve job completion error handling and add regression test for timeout scenarios --- pkg/cmd/deploy/predeployjob.go | 5 ++++- pkg/cmd/deploy/predeployjob_test.go | 18 ++++++++++++++++++ .../pre-deploy/pre-deploy-job-sleep.yaml | 18 ++++++++++++++++++ 3 files changed, 40 insertions(+), 1 deletion(-) create mode 100644 pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-sleep.yaml diff --git a/pkg/cmd/deploy/predeployjob.go b/pkg/cmd/deploy/predeployjob.go index d102b41..cbcac9b 100644 --- a/pkg/cmd/deploy/predeployjob.go +++ b/pkg/cmd/deploy/predeployjob.go @@ -219,7 +219,10 @@ func (r *PreDeployJobRunner) waitForJobCompletion(ctx context.Context, name stri case <-timeoutCtx.Done(): return fmt.Errorf("job %q timed out after %s", name, r.timeout) case <-ticker.C: - job, err := r.clientSet.BatchV1().Jobs(r.namespace).Get(timeoutCtx, name, metav1.GetOptions{}) + if timeoutCtx.Err() != nil { + return fmt.Errorf("job %q timed out after %s", name, r.timeout) + } + job, err := r.clientSet.BatchV1().Jobs(r.namespace).Get(ctx, name, metav1.GetOptions{}) if err != nil { return fmt.Errorf("failed to get job %q status: %w", name, err) } diff --git a/pkg/cmd/deploy/predeployjob_test.go b/pkg/cmd/deploy/predeployjob_test.go index ebc2fec..e3e16cc 
100644 --- a/pkg/cmd/deploy/predeployjob_test.go +++ b/pkg/cmd/deploy/predeployjob_test.go @@ -673,6 +673,24 @@ func TestWaitForJobCompletion(t *testing.T) { timeout: 5 * time.Second, expectedError: `failed to get job "test-job" status`, }, + // Regression test: when ticker.C and timeoutCtx.Done() fire simultaneously, Go's + // non-deterministic select may pick the ticker case. The reactor used here sleeps + // well past the timeout to guarantee the context is already cancelled by the time + // Get is called. Without the fix the error would be "failed to get job status: + // context deadline exceeded" rather than the canonical "timed out" message. + "timeout error is reported correctly when ticker races with timeout": { + setupReactor: func(cs *kubefake.Clientset) { + cs.PrependReactor("get", "jobs", func(action k8stesting.Action) (bool, runtime.Object, error) { + // sleep long enough to ensure the timeout fires before or during Get + time.Sleep(200 * time.Millisecond) + return true, &batchv1.Job{ + ObjectMeta: metav1.ObjectMeta{Name: "test-job", Namespace: namespace}, + }, nil + }) + }, + timeout: 10 * time.Millisecond, + expectedError: `job "test-job" timed out`, + }, } for name, test := range tests { diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-sleep.yaml b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-sleep.yaml new file mode 100644 index 0000000..d617225 --- /dev/null +++ b/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-sleep.yaml @@ -0,0 +1,18 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: pre-deploy-sleep + annotations: + mia-platform.eu/deploy: pre-deploy +spec: + template: + metadata: {} + spec: + containers: + - name: migration + image: busybox + args: + - /bin/sh + - -c + - echo 'Starting...' && sleep 2 && echo 'Done waiting!' 
+ restartPolicy: Never From 839e1b67acd7be69b90906a27b78e7cd9992deed Mon Sep 17 00:00:00 2001 From: Davide Chiarelli Date: Mon, 9 Mar 2026 17:09:52 +0100 Subject: [PATCH 7/7] refactor(pre-deploy): renamed all references from pre-deploy job to filtered job to make this feature more extensible --- docs/10_overview.md | 2 +- docs/{60_pre_deploy.md => 60_filtered_job.md} | 22 ++-- pkg/cmd/deploy/deploy.go | 102 +++++++++--------- pkg/cmd/deploy/deploy_test.go | 30 +++--- .../{predeployjob.go => filteredjob.go} | 80 +++++++------- ...edeployjob_test.go => filteredjob_test.go} | 38 +++---- .../configmap.yaml | 0 .../deployment.yaml | 0 .../filtered-job-failed.yaml} | 2 +- .../filtered-job-optional-failed.yaml} | 2 +- .../filtered-job-optional.yaml} | 2 +- .../filtered-job-sleep.yaml} | 2 +- .../filtered-job.yaml} | 2 +- ...{pre-deploy-job.yaml => filtered-job.yaml} | 2 +- 14 files changed, 143 insertions(+), 143 deletions(-) rename docs/{60_pre_deploy.md => 60_filtered_job.md} (69%) rename pkg/cmd/deploy/{predeployjob.go => filteredjob.go} (65%) rename pkg/cmd/deploy/{predeployjob_test.go => filteredjob_test.go} (94%) rename pkg/cmd/deploy/testdata/{pre-deploy => filtered-job}/configmap.yaml (100%) rename pkg/cmd/deploy/testdata/{pre-deploy => filtered-job}/deployment.yaml (100%) rename pkg/cmd/deploy/testdata/{pre-deploy/pre-deploy-job-failed.yaml => filtered-job/filtered-job-failed.yaml} (92%) rename pkg/cmd/deploy/testdata/{pre-deploy/pre-deploy-job-optional-failed.yaml => filtered-job/filtered-job-optional-failed.yaml} (92%) rename pkg/cmd/deploy/testdata/{pre-deploy/pre-deploy-job-optional.yaml => filtered-job/filtered-job-optional.yaml} (93%) rename pkg/cmd/deploy/testdata/{pre-deploy/pre-deploy-job-sleep.yaml => filtered-job/filtered-job-sleep.yaml} (93%) rename pkg/cmd/deploy/testdata/{pre-deploy/pre-deploy-job.yaml => filtered-job/filtered-job.yaml} (91%) rename pkg/extensions/testdata/filter/{pre-deploy-job.yaml => filtered-job.yaml} (91%) diff --git 
a/docs/10_overview.md b/docs/10_overview.md index cf1fc9a..589a762 100644 --- a/docs/10_overview.md +++ b/docs/10_overview.md @@ -39,4 +39,4 @@ Below, you can find additional documentation for `mlp`: - [Generation Configuration](./30_generate.md) - [Hydration Logic](./40_hydrate.md) - [Interpolatation Template](./50_interpolate.md) -- [Pre-Deploy Jobs](./60_pre_deploy.md) +- [Filtered Jobs](./60_filtered_job.md) diff --git a/docs/60_pre_deploy.md b/docs/60_filtered_job.md similarity index 69% rename from docs/60_pre_deploy.md rename to docs/60_filtered_job.md index 662ccee..55dc6ad 100644 --- a/docs/60_pre_deploy.md +++ b/docs/60_filtered_job.md @@ -1,13 +1,13 @@ -# Pre-Deploy Jobs +# Filtered Jobs -`mlp` supports the execution of Kubernetes `Job` resources before the main deploy phase. These pre-deploy jobs +`mlp` supports the execution of Kubernetes `Job` resources before the main deploy phase. These filtered jobs can be used to run database migrations, data transformations, or any other task that must complete successfully before the rest of the resources are applied to the cluster. ## How It Works -Pre-deploy jobs are identified by the annotation `mia-platform.eu/deploy` on a `Job` resource. When the -`--pre-deploy-job-annotation` flag is provided to the `deploy` command, `mlp` will scan the resources and +Filtered jobs are identified by the annotation `mia-platform.eu/deploy` on a `Job` resource. When the +`--filtered-job-annotation` flag is provided to the `deploy` command, `mlp` will scan the resources and separate all `Job` resources whose annotation value matches the one provided. These jobs are executed before the remaining resources are applied. If the flag is not provided, any `Job` @@ -16,7 +16,7 @@ applied at all. 
## Annotating a Job -To mark a `Job` as a pre-deploy job, add the `mia-platform.eu/deploy` annotation with the desired value: +To mark a `Job` as a filtered job, add the `mia-platform.eu/deploy` annotation with the desired value: ```yaml apiVersion: batch/v1 @@ -38,12 +38,12 @@ spec: Then pass the matching value to the deploy command: ```sh -mlp deploy --pre-deploy-job-annotation pre-deploy ... +mlp deploy --filtered-job-annotation pre-deploy ... ``` ## Optional Jobs -A pre-deploy job can be marked as optional by adding the annotation `mia-platform.eu/deploy-optional: "true"`. +A filtered job can be marked as optional by adding the annotation `mia-platform.eu/deploy-optional: "true"`. Optional jobs are non-blocking: if they fail, the failure is logged as a warning and the deploy process continues normally. Mandatory jobs (those without the optional annotation) will block and fail the deploy if they cannot complete successfully. @@ -68,15 +68,15 @@ spec: ## Retry and Timeout -Each pre-deploy job is retried automatically on failure. Before each retry the failed job is deleted from +Each filtered job is retried automatically on failure. Before each retry the failed job is deleted from the cluster so that a fresh instance can be created. 
The number of retries and the per-execution timeout can be controlled via dedicated flags: | Flag | Default | Description | |---|---|---| -| `--pre-deploy-job-annotation` | _(empty)_ | Annotation value used to identify pre-deploy jobs | -| `--pre-deploy-job-max-retries` | `3` | Maximum number of retry attempts for a failed job | -| `--pre-deploy-job-timeout` | `30s` | Timeout for a single job execution attempt | +| `--filtered-job-annotation` | _(empty)_ | Annotation value used to identify filtered jobs | +| `--filtered-job-max-retries` | `3` | Maximum number of retry attempts for a failed job | +| `--filtered-job-timeout` | `30s` | Timeout for a single job execution attempt | If a job exceeds the configured timeout it is considered failed and the retry logic applies as normal. diff --git a/pkg/cmd/deploy/deploy.go b/pkg/cmd/deploy/deploy.go index fe5f4c2..3564539 100644 --- a/pkg/cmd/deploy/deploy.go +++ b/pkg/cmd/deploy/deploy.go @@ -82,16 +82,16 @@ const ( waitFlagDefaultValue = true waitFlagUsage = "if true, wait for resources to be current before marking them as successfully applied" - preDeployJobAnnotationFlagName = "pre-deploy-job-annotation" - preDeployJobAnnotationFlagUsage = "the annotation value for mia-platform.eu/deploy to identify pre-deploy jobs" + filteredJobAnnotationFlagName = "filtered-job-annotation" + filteredJobAnnotationFlagUsage = "the annotation value for mia-platform.eu/deploy to identify filtered jobs" - preDeployJobMaxRetriesFlagName = "pre-deploy-job-max-retries" - preDeployJobMaxRetriesDefaultValue = 3 - preDeployJobMaxRetriesFlagUsage = "the maximum number of retries for a failed pre-deploy job" + filteredJobMaxRetriesFlagName = "filtered-job-max-retries" + filteredJobMaxRetriesDefaultValue = 3 + filteredJobMaxRetriesFlagUsage = "the maximum number of retries for a failed filtered job" - preDeployJobTimeoutFlagName = "pre-deploy-job-timeout" - preDeployJobTimeoutDefaultValue = 30 * time.Second - preDeployJobTimeoutFlagUsage = "the 
timeout for a single pre-deploy job execution" + filteredJobTimeoutFlagName = "filtered-job-timeout" + filteredJobTimeoutDefaultValue = 30 * time.Second + filteredJobTimeoutFlagUsage = "the timeout for a single filtered job execution" stdinToken = "-" fieldManager = "mlp" @@ -108,31 +108,31 @@ var ( // Flags contains all the flags for the `deploy` command. They will be converted to Options // that contains all runtime options for the command. type Flags struct { - ConfigFlags *genericclioptions.ConfigFlags - inputPaths []string - deployType string - forceDeploy bool - ensureNamespace bool - timeout time.Duration - dryRun bool - wait bool - preDeployJobAnnotation string - preDeployJobMaxRetries int - preDeployJobTimeout time.Duration + ConfigFlags *genericclioptions.ConfigFlags + inputPaths []string + deployType string + forceDeploy bool + ensureNamespace bool + timeout time.Duration + dryRun bool + wait bool + filteredJobAnnotation string + filteredJobMaxRetries int + filteredJobTimeout time.Duration } // Options have the data required to perform the deploy operation type Options struct { - inputPaths []string - deployType string - forceDeploy bool - ensureNamespace bool - timeout time.Duration - dryRun bool - wait bool - preDeployJobAnnotation string - preDeployJobMaxRetries int - preDeployJobTimeout time.Duration + inputPaths []string + deployType string + forceDeploy bool + ensureNamespace bool + timeout time.Duration + dryRun bool + wait bool + filteredJobAnnotation string + filteredJobMaxRetries int + filteredJobTimeout time.Duration clientFactory util.ClientFactory clock clock.PassiveClock @@ -202,9 +202,9 @@ func (f *Flags) AddFlags(flags *pflag.FlagSet) { flags.DurationVar(&f.timeout, timeoutFlagName, timeoutDefaultValue, timeoutFlagUsage) flags.BoolVar(&f.dryRun, dryRunFlagName, dryRunDefaultValue, dryRunFlagUsage) flags.BoolVar(&f.wait, waitFlagName, waitFlagDefaultValue, waitFlagUsage) - flags.StringVar(&f.preDeployJobAnnotation, 
preDeployJobAnnotationFlagName, "", preDeployJobAnnotationFlagUsage) - flags.IntVar(&f.preDeployJobMaxRetries, preDeployJobMaxRetriesFlagName, preDeployJobMaxRetriesDefaultValue, preDeployJobMaxRetriesFlagUsage) - flags.DurationVar(&f.preDeployJobTimeout, preDeployJobTimeoutFlagName, preDeployJobTimeoutDefaultValue, preDeployJobTimeoutFlagUsage) + flags.StringVar(&f.filteredJobAnnotation, filteredJobAnnotationFlagName, "", filteredJobAnnotationFlagUsage) + flags.IntVar(&f.filteredJobMaxRetries, filteredJobMaxRetriesFlagName, filteredJobMaxRetriesDefaultValue, filteredJobMaxRetriesFlagUsage) + flags.DurationVar(&f.filteredJobTimeout, filteredJobTimeoutFlagName, filteredJobTimeoutDefaultValue, filteredJobTimeoutFlagUsage) } // ToOptions transform the command flags in command runtime arguments @@ -214,15 +214,15 @@ func (f *Flags) ToOptions(reader io.Reader, writer io.Writer) (*Options, error) } return &Options{ - inputPaths: f.inputPaths, - deployType: f.deployType, - forceDeploy: f.forceDeploy, - ensureNamespace: f.ensureNamespace, - timeout: f.timeout, - wait: f.wait, - preDeployJobAnnotation: f.preDeployJobAnnotation, - preDeployJobMaxRetries: f.preDeployJobMaxRetries, - preDeployJobTimeout: f.preDeployJobTimeout, + inputPaths: f.inputPaths, + deployType: f.deployType, + forceDeploy: f.forceDeploy, + ensureNamespace: f.ensureNamespace, + timeout: f.timeout, + wait: f.wait, + filteredJobAnnotation: f.filteredJobAnnotation, + filteredJobMaxRetries: f.filteredJobMaxRetries, + filteredJobTimeout: f.filteredJobTimeout, clientFactory: util.NewFactory(f.ConfigFlags), reader: reader, @@ -271,12 +271,12 @@ func (o *Options) Run(ctx context.Context) error { } var stop bool - resources, stop, err = o.runPreDeployPhase(ctx, namespace, resources) + resources, stop, err = o.runFilteredJobPhase(ctx, namespace, resources) if err != nil { return err } if stop { - logger.V(3).Info("pre-deploy jobs executed, skipping remaining resources apply") + logger.V(3).Info("filtered jobs 
executed, skipping remaining resources apply") return nil } @@ -344,19 +344,19 @@ func formatApplyErrors(errs []error) string { return builder.String() } -// runPreDeployPhase handles pre-deploy job filtering and execution. When the pre-deploy -// annotation flag is set, matching jobs are extracted and executed; if any are found the +// runFilteredJobPhase handles filtered job execution. When the --filtered-job-annotation +// flag is set, matching jobs are extracted and executed; if any are found the // caller should stop the normal apply phase (stop=true). If the flag is set but no jobs // match the filter, nothing is applied (stop=true with empty resources). When the flag is // absent, annotated jobs are stripped from the resource list so they are never applied as // regular resources. -func (o *Options) runPreDeployPhase(ctx context.Context, namespace string, resources []*unstructured.Unstructured) ([]*unstructured.Unstructured, bool, error) { - if o.preDeployJobAnnotation == "" { +func (o *Options) runFilteredJobPhase(ctx context.Context, namespace string, resources []*unstructured.Unstructured) ([]*unstructured.Unstructured, bool, error) { + if o.filteredJobAnnotation == "" { return StripAnnotatedJobs(resources), false, nil } - preDeployJobs, remaining := FilterPreDeployJobs(resources, o.preDeployJobAnnotation) - if len(preDeployJobs) == 0 { + filteredJobs, remaining := FilterAnnotatedJobs(resources, o.filteredJobAnnotation) + if len(filteredJobs) == 0 { return nil, true, nil } @@ -365,8 +365,8 @@ func (o *Options) runPreDeployPhase(ctx context.Context, namespace string, resou return nil, false, err } - runner := NewPreDeployJobRunner(clientSet, namespace, o.preDeployJobMaxRetries, o.preDeployJobTimeout, o.writer, o.dryRun) - if err := runner.Run(ctx, preDeployJobs); err != nil { + runner := NewFilteredJobRunner(clientSet, namespace, o.filteredJobMaxRetries, o.filteredJobTimeout, o.writer, o.dryRun) + if err := runner.Run(ctx, filteredJobs); err != nil { 
return nil, false, err } diff --git a/pkg/cmd/deploy/deploy_test.go b/pkg/cmd/deploy/deploy_test.go index e0fa7e1..b33e17d 100644 --- a/pkg/cmd/deploy/deploy_test.go +++ b/pkg/cmd/deploy/deploy_test.go @@ -95,24 +95,24 @@ func TestOptions(t *testing.T) { configFlags := genericclioptions.NewConfigFlags(false) expectedOpts := &Options{ - inputPaths: []string{"input"}, - deployType: "smart_deploy", - preDeployJobTimeout: 5 * time.Minute, - preDeployJobMaxRetries: 3, - preDeployJobAnnotation: "pre-deploy", - reader: reader, - writer: buffer, - clientFactory: util.NewFactory(configFlags), - clock: clock.RealClock{}, - wait: false, + inputPaths: []string{"input"}, + deployType: "smart_deploy", + filteredJobTimeout: 5 * time.Minute, + filteredJobMaxRetries: 3, + filteredJobAnnotation: "pre-deploy", + reader: reader, + writer: buffer, + clientFactory: util.NewFactory(configFlags), + clock: clock.RealClock{}, + wait: false, } flag := &Flags{ - inputPaths: []string{"input"}, - deployType: "smart_deploy", - preDeployJobTimeout: 5 * time.Minute, - preDeployJobMaxRetries: 3, - preDeployJobAnnotation: "pre-deploy", + inputPaths: []string{"input"}, + deployType: "smart_deploy", + filteredJobTimeout: 5 * time.Minute, + filteredJobMaxRetries: 3, + filteredJobAnnotation: "pre-deploy", } _, err := flag.ToOptions(reader, buffer) assert.ErrorContains(t, err, "config flags are required") diff --git a/pkg/cmd/deploy/predeployjob.go b/pkg/cmd/deploy/filteredjob.go similarity index 65% rename from pkg/cmd/deploy/predeployjob.go rename to pkg/cmd/deploy/filteredjob.go index cbcac9b..5ffa7d6 100644 --- a/pkg/cmd/deploy/predeployjob.go +++ b/pkg/cmd/deploy/filteredjob.go @@ -31,14 +31,14 @@ import ( ) const ( - preDeployAnnotation = "mia-platform.eu/deploy" + filteredJobAnnotationKey = "mia-platform.eu/deploy" deployOptionalAnnotation = "mia-platform.eu/deploy-optional" jobKind = "Job" ) -// PreDeployJobRunner handles the creation, execution and monitoring of pre-deploy jobs +// 
FilteredJobRunner handles the creation, execution and monitoring of filtered jobs // with configurable retry and timeout support. -type PreDeployJobRunner struct { +type FilteredJobRunner struct { clientSet kubernetes.Interface namespace string maxRetries int @@ -48,10 +48,10 @@ type PreDeployJobRunner struct { dryRun bool } -// NewPreDeployJobRunner creates a new PreDeployJobRunner configured with the specified parameters -// for running pre-deploy jobs against the target cluster. -func NewPreDeployJobRunner(clientSet kubernetes.Interface, namespace string, maxRetries int, timeout time.Duration, writer io.Writer, dryRun bool) *PreDeployJobRunner { - return &PreDeployJobRunner{ +// NewFilteredJobRunner creates a new FilteredJobRunner configured with the specified parameters +// for running filtered jobs against the target cluster. +func NewFilteredJobRunner(clientSet kubernetes.Interface, namespace string, maxRetries int, timeout time.Duration, writer io.Writer, dryRun bool) *FilteredJobRunner { + return &FilteredJobRunner{ clientSet: clientSet, namespace: namespace, maxRetries: maxRetries, @@ -62,36 +62,36 @@ func NewPreDeployJobRunner(clientSet kubernetes.Interface, namespace string, max } } -// FilterPreDeployJobs separates pre-deploy jobs from the remaining resources based on the +// FilterAnnotatedJobs separates filtered jobs from the remaining resources based on the // mia-platform.eu/deploy annotation matching the provided annotation value. -// It returns two slices: the matching pre-deploy jobs and the remaining resources. -func FilterPreDeployJobs(resources []*unstructured.Unstructured, annotationValue string) ([]*unstructured.Unstructured, []*unstructured.Unstructured) { - var preDeployJobs []*unstructured.Unstructured +// It returns two slices: the matching filtered jobs and the remaining resources. 
+func FilterAnnotatedJobs(resources []*unstructured.Unstructured, annotationValue string) ([]*unstructured.Unstructured, []*unstructured.Unstructured) { + var filteredJobs []*unstructured.Unstructured var remainingResources []*unstructured.Unstructured for _, res := range resources { if res.GetKind() == jobKind { annotations := res.GetAnnotations() - if val, ok := annotations[preDeployAnnotation]; ok && val == annotationValue { - preDeployJobs = append(preDeployJobs, res) + if val, ok := annotations[filteredJobAnnotationKey]; ok && val == annotationValue { + filteredJobs = append(filteredJobs, res) continue } } remainingResources = append(remainingResources, res) } - return preDeployJobs, remainingResources + return filteredJobs, remainingResources } // StripAnnotatedJobs removes all Job resources that carry the mia-platform.eu/deploy -// annotation, regardless of its value. This is used to exclude pre-deploy jobs from the -// normal apply flow when the pre-deploy-job-annotation flag is not provided. +// annotation, regardless of its value. This is used to exclude filtered jobs from the +// normal apply flow when the --filtered-job-annotation flag is not provided. func StripAnnotatedJobs(resources []*unstructured.Unstructured) []*unstructured.Unstructured { var remaining []*unstructured.Unstructured for _, res := range resources { if res.GetKind() == jobKind { - if _, ok := res.GetAnnotations()[preDeployAnnotation]; ok { + if _, ok := res.GetAnnotations()[filteredJobAnnotationKey]; ok { continue } } @@ -101,65 +101,65 @@ func StripAnnotatedJobs(resources []*unstructured.Unstructured) []*unstructured. return remaining } -// isOptionalPreDeployJob reports whether the job carries the deploy-optional annotation set to "true". -func isOptionalPreDeployJob(job *unstructured.Unstructured) bool { +// isOptionalFilteredJob reports whether the job carries the deploy-optional annotation set to "true". 
+func isOptionalFilteredJob(job *unstructured.Unstructured) bool {
 	return job.GetAnnotations()[deployOptionalAnnotation] == "true"
 }
 
-// Run executes all pre-deploy jobs with retry and timeout support. Each job is retried
+// Run executes all filtered jobs with retry and timeout support. Each job is retried
 // up to maxRetries times upon failure. Jobs annotated with mia-platform.eu/deploy-optional=true
 // are treated as non-blocking: their failure is logged as a warning but never counted as an
 // error. For mandatory jobs an error is returned only if ALL of them fail; if at least one
 // mandatory job succeeds, the deploy process can continue.
-func (r *PreDeployJobRunner) Run(ctx context.Context, jobs []*unstructured.Unstructured) error {
+func (r *FilteredJobRunner) Run(ctx context.Context, jobs []*unstructured.Unstructured) error {
 	logger := logr.FromContextOrDiscard(ctx)
 	if len(jobs) == 0 {
-		logger.V(3).Info("no pre-deploy jobs to run")
+		logger.V(3).Info("no filtered jobs to run")
 		return nil
 	}
 
 	if r.dryRun {
 		for _, job := range jobs {
-			fmt.Fprintf(r.writer, "pre-deploy job %q would be executed (dry-run)\n", job.GetName())
+			fmt.Fprintf(r.writer, "filtered job %q would be executed (dry-run)\n", job.GetName())
 		}
 		return nil
 	}
 
-	logger.V(3).Info("starting pre-deploy jobs", "count", len(jobs))
+	logger.V(3).Info("starting filtered jobs", "count", len(jobs))
 	for _, job := range jobs {
 		jobName := job.GetName()
-		optional := isOptionalPreDeployJob(job)
+		optional := isOptionalFilteredJob(job)
 		err := r.runJobWithRetries(ctx, job)
 		if err != nil {
 			if optional {
-				logger.V(3).Info("optional pre-deploy job failed, continuing", "name", jobName, "error", err)
-				fmt.Fprintf(r.writer, "optional pre-deploy job %q failed, continuing\n", jobName)
+				logger.V(3).Info("optional filtered job failed, continuing", "name", jobName, "error", err)
+				fmt.Fprintf(r.writer, "optional filtered job %q failed, continuing\n", jobName)
 			} else {
-				logger.V(3).Info("pre-deploy job failed", "name", jobName, "error", err)
-				return fmt.Errorf("pre-deploy job %q failed after %d attempts: %w", jobName, r.maxRetries, err)
+				logger.V(3).Info("filtered job failed", "name", jobName, "error", err)
+				return fmt.Errorf("filtered job %q failed after %d attempts: %w", jobName, r.maxRetries, err)
			}
 		} else {
-			fmt.Fprintf(r.writer, "pre-deploy job %q completed successfully\n", jobName)
+			fmt.Fprintf(r.writer, "filtered job %q completed successfully\n", jobName)
 		}
 	}
 
-	logger.V(3).Info("pre-deploy jobs completed")
+	logger.V(3).Info("filtered jobs completed")
 	return nil
 }
 
-// runJobWithRetries attempts to run a single pre-deploy job, retrying up to maxRetries times
+// runJobWithRetries attempts to run a single filtered job, retrying up to maxRetries times
 // on failure. The failed job is deleted before each retry attempt.
-func (r *PreDeployJobRunner) runJobWithRetries(ctx context.Context, jobUnstr *unstructured.Unstructured) error {
+func (r *FilteredJobRunner) runJobWithRetries(ctx context.Context, jobUnstr *unstructured.Unstructured) error {
 	logger := logr.FromContextOrDiscard(ctx)
 
 	var lastErr error
 	for attempt := 0; attempt <= r.maxRetries; attempt++ {
 		if attempt > 0 {
-			logger.V(3).Info("retrying pre-deploy job", "name", jobUnstr.GetName(), "attempt", attempt)
-			fmt.Fprintf(r.writer, "retrying pre-deploy job %q (attempt %d/%d)\n", jobUnstr.GetName(), attempt, r.maxRetries)
+			logger.V(3).Info("retrying filtered job", "name", jobUnstr.GetName(), "attempt", attempt)
+			fmt.Fprintf(r.writer, "retrying filtered job %q (attempt %d/%d)\n", jobUnstr.GetName(), attempt, r.maxRetries)
 		}
 
 		lastErr = r.createAndWaitForJob(ctx, jobUnstr)
@@ -167,7 +167,7 @@ func (r *PreDeployJobRunner) runJobWithRetries(ctx context.Context, jobUnstr *un
 			return nil
 		}
 
-		logger.V(5).Info("pre-deploy job attempt failed", "name", jobUnstr.GetName(), "attempt", attempt, "error", lastErr)
+		logger.V(5).Info("filtered job attempt failed", "name", jobUnstr.GetName(), "attempt", attempt, "error", lastErr)
 
 		// Clean up the failed job before retrying
 		if cleanErr := r.deleteJob(ctx, jobUnstr.GetName()); cleanErr != nil {
@@ -181,7 +181,7 @@ func (r *PreDeployJobRunner) runJobWithRetries(ctx context.Context, jobUnstr *un
 // createAndWaitForJob creates a single job in the cluster and waits for its completion.
 // It converts the unstructured resource to a typed Job, sets the namespace, and monitors
 // the job status until completion, failure, or timeout.
-func (r *PreDeployJobRunner) createAndWaitForJob(ctx context.Context, jobUnstr *unstructured.Unstructured) error {
+func (r *FilteredJobRunner) createAndWaitForJob(ctx context.Context, jobUnstr *unstructured.Unstructured) error {
 	logger := logr.FromContextOrDiscard(ctx)
 
 	var job batchv1.Job
@@ -194,7 +194,7 @@ func (r *PreDeployJobRunner) createAndWaitForJob(ctx context.Context, jobUnstr *
 	job.ResourceVersion = ""
 	job.Status = batchv1.JobStatus{}
 
-	logger.V(5).Info("creating pre-deploy job", "name", job.Name, "namespace", r.namespace)
+	logger.V(5).Info("creating filtered job", "name", job.Name, "namespace", r.namespace)
 
 	if _, err := r.clientSet.BatchV1().Jobs(r.namespace).Create(ctx, &job, metav1.CreateOptions{}); err != nil {
 		return fmt.Errorf("failed to create job %q: %w", job.Name, err)
@@ -205,7 +205,7 @@ func (r *PreDeployJobRunner) createAndWaitForJob(ctx context.Context, jobUnstr *
 // waitForJobCompletion polls the job status at regular intervals until the job completes,
 // fails, or the configured timeout expires.
-func (r *PreDeployJobRunner) waitForJobCompletion(ctx context.Context, name string) error {
+func (r *FilteredJobRunner) waitForJobCompletion(ctx context.Context, name string) error {
 	logger := logr.FromContextOrDiscard(ctx)
 
 	timeoutCtx, cancel := context.WithTimeout(ctx, r.timeout)
@@ -242,7 +242,7 @@ func (r *PreDeployJobRunner) waitForJobCompletion(ctx context.Context, name stri
 }
 
 // deleteJob removes a job and its associated pods from the cluster using background propagation
-func (r *PreDeployJobRunner) deleteJob(ctx context.Context, name string) error {
+func (r *FilteredJobRunner) deleteJob(ctx context.Context, name string) error {
 	propagation := metav1.DeletePropagationBackground
 	return r.clientSet.BatchV1().Jobs(r.namespace).Delete(ctx, name, metav1.DeleteOptions{
 		PropagationPolicy: &propagation,
diff --git a/pkg/cmd/deploy/predeployjob_test.go b/pkg/cmd/deploy/filteredjob_test.go
similarity index 94%
rename from pkg/cmd/deploy/predeployjob_test.go
rename to pkg/cmd/deploy/filteredjob_test.go
index e3e16cc..c8c371c 100644
--- a/pkg/cmd/deploy/predeployjob_test.go
+++ b/pkg/cmd/deploy/filteredjob_test.go
@@ -33,7 +33,7 @@ import (
 )
 
 // newTestJob creates an unstructured Job resource for testing with the given name and optional
-// pre-deploy annotation value. If annotationValue is empty, no annotation is added.
+// filtered job annotation value. If annotationValue is empty, no annotation is added.
 func newTestJob(name, annotationValue string) *unstructured.Unstructured {
 	obj := &unstructured.Unstructured{
 		Object: map[string]interface{}{
@@ -60,14 +60,14 @@ func newTestJob(name, annotationValue string) *unstructured.Unstructured {
 
 	if annotationValue != "" {
 		obj.SetAnnotations(map[string]string{
-			preDeployAnnotation: annotationValue,
+			filteredJobAnnotationKey: annotationValue,
 		})
 	}
 
 	return obj
 }
 
-// newOptionalTestJob creates a pre-deploy Job with the deploy-optional annotation set to "true".
+// newOptionalTestJob creates a filtered Job with the deploy-optional annotation set to "true".
 func newOptionalTestJob(name string) *unstructured.Unstructured {
 	job := newTestJob(name, "pre-deploy")
 	annotations := job.GetAnnotations()
@@ -76,7 +76,7 @@ func newOptionalTestJob(name string) *unstructured.Unstructured {
 	return job
 }
 
-func TestFilterPreDeployJobs(t *testing.T) {
+func TestFilterAnnotatedJobs(t *testing.T) {
 	t.Parallel()
 
 	tests := map[string]struct {
@@ -130,7 +130,7 @@
 				"kind": "ConfigMap",
 				"metadata": map[string]interface{}{
 					"name": "cm-with-annotation",
-					"annotations": map[string]interface{}{preDeployAnnotation: "pre-deploy"},
+					"annotations": map[string]interface{}{filteredJobAnnotationKey: "pre-deploy"},
 				},
 			},
 		},
@@ -172,7 +172,7 @@
 		t.Run(name, func(t *testing.T) {
 			t.Parallel()
 
-			jobs, remaining := FilterPreDeployJobs(test.resources, test.annotationValue)
+			jobs, remaining := FilterAnnotatedJobs(test.resources, test.annotationValue)
 			assert.Len(t, jobs, test.expectedJobs)
 			assert.Len(t, remaining, test.expectedRemaining)
 		})
@@ -226,7 +226,7 @@ func TestStripAnnotatedJobs(t *testing.T) {
 				"kind": "ConfigMap",
 				"metadata": map[string]interface{}{
 					"name": "cm-annotated",
-					"annotations": map[string]interface{}{preDeployAnnotation: "pre-deploy"},
+					"annotations": map[string]interface{}{filteredJobAnnotationKey: "pre-deploy"},
 				},
 			},
 		},
@@ -252,7 +252,7 @@
 	}
 }
 
-func TestPreDeployJobRunnerRun(t *testing.T) {
+func TestFilteredJobRunnerRun(t *testing.T) {
 	t.Parallel()
 
 	namespace := "test-ns"
@@ -320,7 +320,7 @@
 				})
 			},
 			maxRetries:    1,
-			expectedError: "pre-deploy job \"fail-job\" failed after 1 attempts:",
+			expectedError: "filtered job \"fail-job\" failed after 1 attempts:",
 		},
 		"partial success stops at first failure": {
 			jobs: []*unstructured.Unstructured{
@@ -353,7 +353,7 @@ func TestPreDeployJobRunnerRun(t *testing.T) {
 			},
 			maxRetries:    0,
 			expectedOut:   "completed successfully",
-			expectedError: "pre-deploy job \"fail-job\" failed after 0 attempts:",
+			expectedError: "filtered job \"fail-job\" failed after 0 attempts:",
 		},
 		"multiple jobs stops at first failure": {
 			jobs: []*unstructured.Unstructured{
@@ -376,7 +376,7 @@
 				})
 			},
 			maxRetries:    0,
-			expectedError: "pre-deploy job \"fail-job-1\" failed after 0 attempts:",
+			expectedError: "filtered job \"fail-job-1\" failed after 0 attempts:",
 		},
 		"optional job failure does not block deploy": {
 			jobs: []*unstructured.Unstructured{newOptionalTestJob("optional-job")},
@@ -396,7 +396,7 @@
 				})
 			},
 			maxRetries:  0,
-			expectedOut: "optional pre-deploy job",
+			expectedOut: "optional filtered job",
 		},
 		"optional job failure with mandatory success": {
 			jobs: []*unstructured.Unstructured{
@@ -458,7 +458,7 @@
 				})
 			},
 			maxRetries:    0,
-			expectedError: "pre-deploy job \"mandatory-job\" failed after 0 attempts:",
+			expectedError: "filtered job \"mandatory-job\" failed after 0 attempts:",
 		},
 		"all optional jobs fail": {
 			jobs: []*unstructured.Unstructured{
@@ -481,7 +481,7 @@
 				})
 			},
 			maxRetries:  0,
-			expectedOut: "optional pre-deploy job",
+			expectedOut: "optional filtered job",
 		},
 	}
@@ -495,7 +495,7 @@
 			}
 
 			writer := new(strings.Builder)
-			runner := NewPreDeployJobRunner(fakeClientset, namespace, test.maxRetries, 5*time.Second, writer, test.dryRun)
+			runner := NewFilteredJobRunner(fakeClientset, namespace, test.maxRetries, 5*time.Second, writer, test.dryRun)
 			runner.pollInterval = 10 * time.Millisecond
 
 			err := runner.Run(t.Context(), test.jobs)
@@ -594,7 +594,7 @@ func TestRunJobWithRetries(t *testing.T) {
 			test.setupReactor(fakeClientset)
 
 			writer := new(strings.Builder)
-			runner := NewPreDeployJobRunner(fakeClientset, namespace, test.maxRetries, 5*time.Second, writer, false)
+			runner := NewFilteredJobRunner(fakeClientset, namespace, test.maxRetries, 5*time.Second, writer, false)
 			runner.pollInterval = 10 * time.Millisecond
 
 			job := newTestJob("test-job", "pre-deploy")
@@ -701,7 +701,7 @@ func TestWaitForJobCompletion(t *testing.T) {
 			test.setupReactor(fakeClientset)
 
 			writer := new(strings.Builder)
-			runner := NewPreDeployJobRunner(fakeClientset, namespace, 3, test.timeout, writer, false)
+			runner := NewFilteredJobRunner(fakeClientset, namespace, 3, test.timeout, writer, false)
 			runner.pollInterval = 10 * time.Millisecond
 
 			err := runner.waitForJobCompletion(t.Context(), "test-job")
@@ -760,7 +760,7 @@ func TestCreateAndWaitForJob(t *testing.T) {
 			test.setupReactor(fakeClientset)
 
 			writer := new(strings.Builder)
-			runner := NewPreDeployJobRunner(fakeClientset, namespace, 3, 5*time.Second, writer, false)
+			runner := NewFilteredJobRunner(fakeClientset, namespace, 3, 5*time.Second, writer, false)
 			runner.pollInterval = 10 * time.Millisecond
 
 			err := runner.createAndWaitForJob(t.Context(), test.job)
@@ -787,7 +787,7 @@ func TestDeleteJob(t *testing.T) {
 	})
 
 	writer := new(strings.Builder)
-	runner := NewPreDeployJobRunner(fakeClientset, namespace, 3, 30*time.Second, writer, false)
+	runner := NewFilteredJobRunner(fakeClientset, namespace, 3, 30*time.Second, writer, false)
 
 	err := runner.deleteJob(t.Context(), "test-job")
 	require.NoError(t, err)
diff --git a/pkg/cmd/deploy/testdata/pre-deploy/configmap.yaml b/pkg/cmd/deploy/testdata/filtered-job/configmap.yaml
similarity index 100%
rename from pkg/cmd/deploy/testdata/pre-deploy/configmap.yaml
rename to pkg/cmd/deploy/testdata/filtered-job/configmap.yaml
diff --git a/pkg/cmd/deploy/testdata/pre-deploy/deployment.yaml b/pkg/cmd/deploy/testdata/filtered-job/deployment.yaml
similarity index 100%
rename from pkg/cmd/deploy/testdata/pre-deploy/deployment.yaml
rename to pkg/cmd/deploy/testdata/filtered-job/deployment.yaml
diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-failed.yaml b/pkg/cmd/deploy/testdata/filtered-job/filtered-job-failed.yaml
similarity index 92%
rename from pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-failed.yaml
rename to pkg/cmd/deploy/testdata/filtered-job/filtered-job-failed.yaml
index c4e9e08..e685000 100644
--- a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-failed.yaml
+++ b/pkg/cmd/deploy/testdata/filtered-job/filtered-job-failed.yaml
@@ -1,7 +1,7 @@
 apiVersion: batch/v1
 kind: Job
 metadata:
-  name: pre-deploy-migration-fail
+  name: filtered-migration-fail
   annotations:
     mia-platform.eu/deploy: pre-deploy
 spec:
diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml b/pkg/cmd/deploy/testdata/filtered-job/filtered-job-optional-failed.yaml
similarity index 92%
rename from pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml
rename to pkg/cmd/deploy/testdata/filtered-job/filtered-job-optional-failed.yaml
index f5f271d..830dc50 100644
--- a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional-failed.yaml
+++ b/pkg/cmd/deploy/testdata/filtered-job/filtered-job-optional-failed.yaml
@@ -1,7 +1,7 @@
 apiVersion: batch/v1
 kind: Job
 metadata:
-  name: pre-deploy-migration-fail
+  name: filtered-migration-fail
   annotations:
     mia-platform.eu/deploy: pre-deploy
     mia-platform.eu/deploy-optional: 'true'
diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml b/pkg/cmd/deploy/testdata/filtered-job/filtered-job-optional.yaml
similarity index 93%
rename from pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml
rename to pkg/cmd/deploy/testdata/filtered-job/filtered-job-optional.yaml
index 379d965..41d462a 100644
--- a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-optional.yaml
+++ b/pkg/cmd/deploy/testdata/filtered-job/filtered-job-optional.yaml
@@ -1,7 +1,7 @@
 apiVersion: batch/v1
 kind: Job
 metadata:
-  name: pre-deploy-optional
+  name: filtered-optional
   annotations:
     mia-platform.eu/deploy: pre-deploy
     mia-platform.eu/deploy-optional: 'true'
diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-sleep.yaml b/pkg/cmd/deploy/testdata/filtered-job/filtered-job-sleep.yaml
similarity index 93%
rename from pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-sleep.yaml
rename to pkg/cmd/deploy/testdata/filtered-job/filtered-job-sleep.yaml
index d617225..c88f33e 100644
--- a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job-sleep.yaml
+++ b/pkg/cmd/deploy/testdata/filtered-job/filtered-job-sleep.yaml
@@ -1,7 +1,7 @@
 apiVersion: batch/v1
 kind: Job
 metadata:
-  name: pre-deploy-sleep
+  name: filtered-sleep
   annotations:
     mia-platform.eu/deploy: pre-deploy
 spec:
diff --git a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job.yaml b/pkg/cmd/deploy/testdata/filtered-job/filtered-job.yaml
similarity index 91%
rename from pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job.yaml
rename to pkg/cmd/deploy/testdata/filtered-job/filtered-job.yaml
index ad8035f..6a6b008 100644
--- a/pkg/cmd/deploy/testdata/pre-deploy/pre-deploy-job.yaml
+++ b/pkg/cmd/deploy/testdata/filtered-job/filtered-job.yaml
@@ -1,7 +1,7 @@
 apiVersion: batch/v1
 kind: Job
 metadata:
-  name: pre-deploy-migration
+  name: filtered-migration
   annotations:
     mia-platform.eu/deploy: pre-deploy
 spec:
diff --git a/pkg/extensions/testdata/filter/pre-deploy-job.yaml b/pkg/extensions/testdata/filter/filtered-job.yaml
similarity index 91%
rename from pkg/extensions/testdata/filter/pre-deploy-job.yaml
rename to pkg/extensions/testdata/filter/filtered-job.yaml
index f7fdc5f..6b2601f 100644
--- a/pkg/extensions/testdata/filter/pre-deploy-job.yaml
+++ b/pkg/extensions/testdata/filter/filtered-job.yaml
@@ -1,7 +1,7 @@
 apiVersion: batch/v1
 kind: Job
 metadata:
-  name: pre-deploy-migration
+  name: filtered-migration
   annotations:
     mia-platform.eu/deploy: pre-deploy
 spec:
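The `FilterAnnotatedJobs` function changed in this patch splits annotated Jobs out of a manifest set by matching the `mia-platform.eu/deploy` annotation. The following is a minimal, dependency-free sketch of that same filtering logic; the `resource` struct is a hypothetical stand-in for `unstructured.Unstructured` so the example runs without the Kubernetes client libraries:

```go
package main

import "fmt"

// resource is a simplified stand-in for unstructured.Unstructured,
// used only for this sketch.
type resource struct {
	Kind        string
	Name        string
	Annotations map[string]string
}

const filteredJobAnnotationKey = "mia-platform.eu/deploy"

// filterAnnotatedJobs mirrors the logic of FilterAnnotatedJobs: Jobs whose
// mia-platform.eu/deploy annotation matches annotationValue are split out;
// everything else (including unannotated Jobs) stays in the remaining set.
func filterAnnotatedJobs(resources []resource, annotationValue string) (jobs, remaining []resource) {
	for _, res := range resources {
		if res.Kind == "Job" {
			if val, ok := res.Annotations[filteredJobAnnotationKey]; ok && val == annotationValue {
				jobs = append(jobs, res)
				continue
			}
		}
		remaining = append(remaining, res)
	}
	return jobs, remaining
}

func main() {
	resources := []resource{
		{Kind: "Job", Name: "db-migration", Annotations: map[string]string{filteredJobAnnotationKey: "pre-deploy"}},
		{Kind: "Job", Name: "cleanup"}, // no annotation: not filtered out
		{Kind: "Deployment", Name: "api"},
	}
	jobs, remaining := filterAnnotatedJobs(resources, "pre-deploy")
	fmt.Println(len(jobs), len(remaining)) // prints "1 2"
}
```

Note that, as in the patch, a Job without the annotation is treated like any other resource and flows through to the normal apply path.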