feat: add Kubernetes resource limits for activation job containers #1534
amasolov wants to merge 15 commits into ansible:main from
Conversation
Add k8s_pod_service_account_name, k8s_pod_labels, and k8s_pod_annotations on Activation; apply to rulebook activation Job pod templates on Kubernetes. Validate labels and annotations; reserve EDA selector labels app and job-name. Includes migration, API serializers, analytics export, and tests. Made-with: Cursor
- Trim ServiceAccount and label/annotation strings before persist
- Stricter qualified-name validation for K8s keys (prefix/name, lengths)
- Add integration assert for job-name label; expand unit boundary tests
- Add validator docstrings for CodeRabbit docstring coverage

Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
- Extract _activation_k8s_pod_metadata_payload helper to replace five identical pod-metadata blocks in serializers (duplication 9% -> ~2%).
- Add comprehensive tests for validators.py DEPLOYMENT_TYPE gating, k8s_pod_metadata.py qualified-key edge cases, _normalize helper (trim, None coercion, empty-key rejection), and the payload helper.

Made-with: Cursor
Move repeated serializer field declarations for k8s_pod_service_account_name, k8s_pod_labels, and k8s_pod_annotations into _K8sPodMetadataReadFields and _K8sPodMetadataWriteFields mixins. Five serializer classes now inherit them instead of declaring them inline, targeting SonarCloud duplication < 3%. Made-with: Cursor
- _normalize_activation_k8s_pod_fields: extract _trim_string_dict helper to reduce cognitive complexity from 22 to ~8 (S3776, threshold 15).
- _validate_qualified_metadata_key: extract _validate_prefixed_key and _is_valid_dns_subdomain helpers to reduce complexity from 18 to ~6.
- Replace _DNS_SUBDOMAIN regex with per-label matching via _DNS_LABEL to eliminate backtracking risk flagged as security hotspot (S5852).

Made-with: Cursor
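The per-label matching described above can be sketched roughly as follows; `_DNS_LABEL` echoes the regex name in the commit, while `is_valid_dns_subdomain` is an illustrative stand-in for the repository's `_is_valid_dns_subdomain` helper, not its actual code:

```python
import re

# Illustrative sketch only: one anchored regex per DNS-1123 label avoids the
# catastrophic-backtracking risk of a single regex over the whole subdomain.
_DNS_LABEL = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")


def is_valid_dns_subdomain(value: str) -> bool:
    """Check a DNS subdomain label by label: each label <= 63 chars,
    the whole value <= 253 chars."""
    if not value or len(value) > 253:
        return False
    return all(
        len(label) <= 63 and _DNS_LABEL.match(label)
        for label in value.split(".")
    )
```

Validating each dot-separated label independently also keeps every regex match linear in the label length, which is what resolves a hotspot like S5852.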
Made-with: Cursor
- New `k8s_pod_node_selector` JSONField on Activation model (migration 0071) allows users to schedule activation job pods onto specific nodes via Kubernetes nodeSelector. Addresses AAPRFE-2840.
- New `ALLOWED_SERVICE_ACCOUNTS` setting (env: EDA_ALLOWED_SERVICE_ACCOUNTS) restricts which ServiceAccount names can be specified on activations, preventing privilege escalation via arbitrary SA names.
- Kubernetes engine applies nodeSelector to the pod template spec.
- Podman deployments: all k8s_pod_* fields are silently ignored (validators skip when DEPLOYMENT_TYPE != "k8s", podman engine never reads them).
- Serializers, analytics export, and tests updated for both features.

Made-with: Cursor
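The allowlist behaviour described above might look roughly like this sketch; the function name and signature are hypothetical, with `allowed` standing in for the `ALLOWED_SERVICE_ACCOUNTS` setting:

```python
def check_service_account_allowed(name: str, allowed: list[str]) -> None:
    """Reject ServiceAccount names outside the configured allowlist.

    Sketch only: an empty allowlist means no restriction, matching the
    documented default of the ALLOWED_SERVICE_ACCOUNTS setting.
    """
    if allowed and name not in allowed:
        raise ValueError(
            f"ServiceAccount '{name}' is not permitted; "
            f"allowed: {sorted(allowed)}"
        )
```

This is also why the MagicMock fix in the next commit matters: a mocked settings object makes `allowed` truthy, so every name would be rejected unless tests pin the setting to `[]`.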
Tests that patch aap_eda.core.validators.settings use a MagicMock, so ALLOWED_SERVICE_ACCOUNTS resolves to a truthy MagicMock object instead of the default empty list. Explicitly set it to [] in tests that are not exercising the allowlist logic. Signed-off-by: Artem Masolov <amasolov@redhat.com> Made-with: Cursor
📝 Walkthrough

This PR adds Kubernetes pod metadata support to activation job containers by introducing four new optional fields (service account name, labels, annotations, node selector) to the Activation model, creating validators for their structure and content, updating serializers to accept and return them, modifying the Kubernetes engine to apply them to Job pod templates, and introducing new settings for CPU/memory resource limits and allowed service accounts.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    actor User
    participant API
    participant Serializer
    participant Validator
    participant ActivationModel
    participant KubeEngine
    participant KubeCluster
    User->>API: POST/PUT activation with pod metadata
    API->>Serializer: Deserialize request
    Serializer->>Validator: validate_k8s_pod_labels/annotations/sa/node_selector
    Validator->>Validator: normalize (trim, convert None to {})
    Validator-->>Serializer: Validation result
    Serializer->>ActivationModel: Create/update activation with pod metadata fields
    ActivationModel-->>Serializer: Activation instance
    Serializer-->>API: Serialized response with metadata fields
    Note over User,API: Later, when activation starts...
    API->>KubeEngine: start(activation)
    KubeEngine->>KubeEngine: Build ContainerRequest (extract pod metadata)
    KubeEngine->>KubeEngine: Construct V1Container with resource limits
    KubeEngine->>KubeEngine: Merge pod labels, set annotations, node_selector
    KubeEngine->>KubeEngine: Set service_account_name on pod spec
    KubeEngine->>KubeCluster: create_namespaced_job(job_manifest)
    KubeCluster-->>KubeEngine: Job created with configured metadata
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ❌ 2 failed (2 warnings) | ✅ 3 passed
Actionable comments posted: 4
🧹 Nitpick comments (2)
tests/integration/services/activation/engine/test_kubernetes.py (1)
329-366: Cover memory-only and CPU-only limit configurations.

The requirement says limits apply when either setting is configured, but this test only covers the case where both are set. Add parametrized cases for `("200Mi", None)` and `(None, "500m")` so an accidental `and` condition in the engine cannot slip through.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `tests/integration/services/activation/engine/test_kubernetes.py` around lines 329-366: the test test_engine_start_applies_k8s_resource_limits only verifies the case where both k8s_mem_limit and k8s_cpu_limit are set; add parametrized cases to also test memory-only and CPU-only configurations to catch accidental use of an AND condition. Parametrize inputs like (k8s_mem_limit="200Mi", k8s_cpu_limit=None) and (k8s_mem_limit=None, k8s_cpu_limit="500m"), call engine.start(request, log_handler) for each case using the same capture_job mechanism, then assert on created_body.spec.template.spec.containers[0].resources.limits that the present limit exists and the absent limit is not set; keep the existing capture_job, kubernetes_engine fixture, and assertion structure.

src/aap_eda/api/serializers/activation.py (1)
154-219: Consider allow_null=True for pod metadata fields to enable explicit null handling.

The pod metadata fields (`k8s_pod_labels`, `k8s_pod_annotations`, `k8s_pod_node_selector`) currently reject explicit `null` values because `allow_null` defaults to `False`. While the PATCH update path is safe due to `refill_needed_data()` restoring omitted fields before validation, allowing explicit `null` would improve API clarity. Currently, clients cannot distinguish "omit this field" from "clear this field" in requests.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `src/aap_eda/api/serializers/activation.py` around lines 154-219: update the three JSONField definitions (k8s_pod_labels, k8s_pod_annotations, k8s_pod_node_selector) to allow explicit nulls by adding allow_null=True, and modify _normalize_activation_k8s_pod_fields so it preserves an explicit None value sent by the client instead of coercing it to {}: use presence checks (e.g., if field in data) to distinguish "field omitted" from "field explicitly null", only set data[field] = {} when the field is not present, and continue trimming when the value is a dict via _trim_string_dict; keep handling of k8s_pod_service_account_name unchanged.
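The presence-check normalization this comment proposes could be sketched like this (field names match the PR; the helper itself is hypothetical and not the real serializer code):

```python
POD_METADATA_FIELDS = (
    "k8s_pod_labels",
    "k8s_pod_annotations",
    "k8s_pod_node_selector",
)


def normalize_pod_metadata(data: dict) -> dict:
    """Distinguish an omitted field from an explicit null (sketch only)."""
    for field in POD_METADATA_FIELDS:
        if field not in data:
            data[field] = {}  # omitted -> default empty mapping
        elif isinstance(data[field], dict):
            # Trim whitespace on keys and values, as _trim_string_dict does.
            data[field] = {
                k.strip(): v.strip() for k, v in data[field].items()
            }
        # An explicit None is left untouched for allow_null handling.
    return data
```

With this shape, `{"k8s_pod_labels": null}` can mean "clear the labels" while a request that omits the field leaves the stored value alone.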
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/aap_eda/analytics/analytics_collectors.py`:
- Around line 217-220: Remove the raw Kubernetes pod metadata keys from the
analytics export by deleting the entries "k8s_pod_service_account_name",
"k8s_pod_labels", "k8s_pod_annotations", and "k8s_pod_node_selector" from the
list/collection in src/aap_eda/analytics/analytics_collectors.py where they are
defined (search for those exact string literals to find the variable holding
exported fields). Ensure any downstream consumers (e.g., functions or tests
referencing those keys) are updated to not expect them and add a comment near
the list explaining we intentionally omit operator-supplied pod metadata for
privacy/telemetry policy reasons.
In `@src/aap_eda/core/utils/k8s_pod_metadata.py`:
- Line 34: The code currently uses _ANNOTATION_VALUE_MAX_LEN (256 * 1024) to
validate each annotation value independently; change this to validate the
combined UTF-8 byte length of all annotation keys and values instead: rename or
add a constant like _ANNOTATION_TOTAL_MAX_BYTES = 256 * 1024, compute total =
sum(len(k.encode('utf-8')) + len(v.encode('utf-8')) for k,v in
annotations.items()) in the function that validates/sets annotations (look for
places referencing _ANNOTATION_VALUE_MAX_LEN), and raise the same
error/ValueError if total > _ANNOTATION_TOTAL_MAX_BYTES while updating the error
message to mention total byte size rather than per-value size.
- Around line 72-90: Remove the blanket 253-character cap in
_validate_qualified_metadata_key: do not reject key based on total length before
checking for a "/"—instead, if key contains "/", split and call
_validate_prefixed_key(key, prefix, name, field_label=field_label) which should
enforce prefix <= 253 and name <= 63; if there is no "/", continue to enforce
the 63-char limit and _NAME_SEGMENT match for the single name. In short: delete
the initial len(key) > 253 check and rely on _validate_prefixed_key for prefixed
keys while preserving the existing non-prefixed (<=63 and _NAME_SEGMENT) checks.
In `@src/aap_eda/core/validators.py`:
- Around line 371-386: The check_if_k8s_pod_node_selector_valid function
currently allows any strings; update it to import and call the qualified-name
and label-value validator functions from k8s_pod_metadata.py for each
nodeSelector key and value (while keeping the early returns for None/{} and
settings.DEPLOYMENT_TYPE != "k8s"), raising serializers.ValidationError on any
validation failure; ensure you do not apply the reserved pod-label key
restriction from k8s_pod_metadata.py (i.e., only reuse the qualified-name
validator for keys and the label-value validator for values) so invalid
selectors are rejected early during activation validation.
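Taken together, the two key-validation prompts above amount to something like the following sketch; the regex names echo the PR's `_NAME_SEGMENT` and `_DNS_LABEL`, but the function itself is illustrative, not the repository code:

```python
import re

# Name segment: alphanumerics plus [._-] inside, per Kubernetes qualified names.
_NAME_SEGMENT = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$")
# One DNS-1123 label of a prefix subdomain.
_DNS_LABEL = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")


def validate_qualified_key(key: str) -> None:
    """Validate a label/annotation key with per-part limits: prefix <= 253,
    name <= 63, and no blanket cap on the combined prefixed key."""
    if "/" in key:
        prefix, _, name = key.partition("/")
        if len(prefix) > 253 or not all(
            label and _DNS_LABEL.match(label) for label in prefix.split(".")
        ):
            raise ValueError(f"invalid key prefix: {prefix!r}")
    else:
        name = key
    if not name or len(name) > 63 or not _NAME_SEGMENT.match(name):
        raise ValueError(f"invalid key name: {name!r}")
```

Note that a fully prefixed key can legitimately reach 253 + 1 + 63 = 317 characters, which is exactly why the review asks to drop the initial 253-character check.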
---
Nitpick comments:
In `@src/aap_eda/api/serializers/activation.py`:
- Around line 154-219: Update the three JSONField definitions (k8s_pod_labels,
k8s_pod_annotations, k8s_pod_node_selector) to allow explicit nulls by adding
allow_null=True, and modify _normalize_activation_k8s_pod_fields so it preserves
an explicit None value sent by the client instead of coercing it to {}: use
presence-checks (e.g., if field in data) to distinguish "field omitted" from
"field explicitly null", only set data[field] = {} when the field is not
present, and continue trimming when the value is a dict via _trim_string_dict;
keep handling of k8s_pod_service_account_name unchanged.
In `@tests/integration/services/activation/engine/test_kubernetes.py`:
- Around line 329-366: The test test_engine_start_applies_k8s_resource_limits
only verifies when both k8s_mem_limit and k8s_cpu_limit are set; add
parametrized cases to also test memory-only and CPU-only configurations to catch
accidental use of an AND condition. Modify the test
(test_engine_start_applies_k8s_resource_limits) to parametrize inputs like
(k8s_mem_limit="200Mi", k8s_cpu_limit=None) and (k8s_mem_limit=None,
k8s_cpu_limit="500m"), call engine.start(request, log_handler) for each case
using the same capture_job mechanism, then assert on
created_body.spec.template.spec.containers[0].resources.limits that the present
limit exists and the absent limit is not set; keep the existing capture_job,
kubernetes_engine fixture and assertions structure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: f82180c1-f285-49c9-80cf-87118f90c9be
📒 Files selected for processing (12)
- src/aap_eda/analytics/analytics_collectors.py
- src/aap_eda/api/serializers/activation.py
- src/aap_eda/core/migrations/0071_activation_k8s_pod_metadata.py
- src/aap_eda/core/models/activation.py
- src/aap_eda/core/utils/k8s_pod_metadata.py
- src/aap_eda/core/validators.py
- src/aap_eda/services/activation/engine/common.py
- src/aap_eda/services/activation/engine/kubernetes.py
- src/aap_eda/settings/defaults.py
- tests/integration/analytics/test_analytics_collectors.py
- tests/integration/services/activation/engine/test_kubernetes.py
- tests/unit/test_k8s_pod_metadata.py
```python
    "k8s_pod_service_account_name",
    "k8s_pod_labels",
    "k8s_pod_annotations",
    "k8s_pod_node_selector",
```
Remove raw Kubernetes pod metadata from analytics export.
These fields (ServiceAccount name, labels, annotations, node selector) are operator-supplied configuration that can expose RBAC setup, cluster topology, and arbitrary annotation strings. Without explicit telemetry policy approval, they should be omitted.
Minimal safe default:

```diff
     "eda_system_vault_credential_id",
     "k8s_service_name",
-    "k8s_pod_service_account_name",
-    "k8s_pod_labels",
-    "k8s_pod_annotations",
-    "k8s_pod_node_selector",
     "source_mappings",
     "skip_audit_events",
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/aap_eda/analytics/analytics_collectors.py` around lines 217 - 220, Remove
the raw Kubernetes pod metadata keys from the analytics export by deleting the
entries "k8s_pod_service_account_name", "k8s_pod_labels", "k8s_pod_annotations",
and "k8s_pod_node_selector" from the list/collection in
src/aap_eda/analytics/analytics_collectors.py where they are defined (search for
those exact string literals to find the variable holding exported fields).
Ensure any downstream consumers (e.g., functions or tests referencing those
keys) are updated to not expect them and add a comment near the list explaining
we intentionally omit operator-supplied pod metadata for privacy/telemetry
policy reasons.
Codecov Report

❌ Patch coverage is

```
@@ Coverage Diff @@
##             main   #1534     +/-  ##
==========================================
+ Coverage   91.92%  92.05%  +0.12%
==========================================
  Files         239     241      +2
  Lines       10796   10982    +186
==========================================
+ Hits         9924   10109    +185
- Misses        872     873      +1
```
Flags with carried forward coverage won't be shown.

... and 1 file with indirect coverage changes
- Annotation validation: enforce total size (256 KiB) across all keys and values combined instead of a per-value limit, matching Kubernetes apimachinery behaviour.
- Remove blanket 253-char cap on qualified metadata keys. Prefixed keys (prefix/name) can be up to 253+1+63=317 chars; validation now relies on prefix (<=253) and name (<=63) checks individually.
- NodeSelector validation: validate keys as Kubernetes qualified names and values as label values (<=63 chars, name segment regex), not just string type checks.
- Update tests accordingly.

Signed-off-by: Artem Masolov <amasolov@redhat.com>
Made-with: Cursor
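The total-size rule in this commit can be illustrated with a small sketch; the constant name follows the review suggestion, and the function itself is hypothetical:

```python
_ANNOTATION_TOTAL_MAX_BYTES = 256 * 1024  # Kubernetes apimachinery limit


def validate_annotation_total_size(annotations: dict) -> None:
    """Enforce a combined UTF-8 byte budget across all annotation keys and
    values, rather than a per-value cap (sketch of the described behaviour)."""
    total = sum(
        len(k.encode("utf-8")) + len(v.encode("utf-8"))
        for k, v in annotations.items()
    )
    if total > _ANNOTATION_TOTAL_MAX_BYTES:
        raise ValueError(
            f"annotations total {total} bytes, exceeding the 256 KiB limit"
        )
```

Counting encoded bytes rather than characters matters here: a 64 KiB string of 4-byte UTF-8 characters already consumes the full budget.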
a76a4c8 to ed1814e
A 253-char prefix must be a valid DNS subdomain (labels <=63 chars separated by dots). The previous test used 253 'a' chars without dots, which fails the per-label length check. Signed-off-by: Artem Masolov <amasolov@redhat.com> Made-with: Cursor
New settings K8S_MEM_LIMIT and K8S_CPU_LIMIT (env: EDA_K8S_MEM_LIMIT, EDA_K8S_CPU_LIMIT) mirror the existing PODMAN_MEM_LIMIT pattern for Kubernetes deployments. When set, the Kubernetes engine applies resources.limits on the activation job container spec. Defaults to None (no limits); namespace LimitRange/ResourceQuota defaults apply as before. Signed-off-by: Artem Masolov <amasolov@redhat.com> Made-with: Cursor
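The either-or semantics this commit describes reduce to a small helper like the sketch below; the function is illustrative only, while the real engine builds a `V1ResourceRequirements` from the kubernetes client:

```python
def build_resource_limits(mem_limit, cpu_limit):
    """Return a resources.limits mapping when either limit is configured
    (an OR, not an AND), else None so namespace defaults apply.

    Sketch only: mem_limit/cpu_limit stand in for the K8S_MEM_LIMIT and
    K8S_CPU_LIMIT settings.
    """
    limits = {}
    if mem_limit:
        limits["memory"] = mem_limit
    if cpu_limit:
        limits["cpu"] = cpu_limit
    return limits or None
```

Returning `None` when neither setting is present preserves the pre-existing behaviour, where namespace LimitRange/ResourceQuota defaults apply to the job container.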
Cover all three configurations (both, memory-only, CPU-only) to ensure an accidental AND condition in the engine cannot slip through. Signed-off-by: Artem Masolov <amasolov@redhat.com> Made-with: Cursor
ed1814e to b658079
The /tmp env var was pre-existing; refactoring to a dict made SonarCloud see the line as new code. Revert to inline constructor and pass resources directly to avoid the hotspot. Signed-off-by: Artem Masolov <amasolov@redhat.com> Made-with: Cursor



Summary

- New `K8S_MEM_LIMIT` and `K8S_CPU_LIMIT` settings (env: `EDA_K8S_MEM_LIMIT`, `EDA_K8S_CPU_LIMIT`) mirror `PODMAN_MEM_LIMIT` for Kubernetes deployments.
- The Kubernetes engine applies `V1ResourceRequirements(limits=...)` to the activation job container spec when set.
- Both default to `None`; namespace LimitRange/ResourceQuota defaults apply as before.

Test plan

- `test_engine_start_applies_k8s_resource_limits`: asserts memory and CPU limits are set on the container.
- `test_engine_start_no_resource_limits_by_default`: asserts no limits when settings are `None`.

Fixes #1533
Made with Cursor
Summary by CodeRabbit
New Features
Tests