net, stability, sriov: Split setup test_connectivity_of_hot_plugged_sriov_interface #4660

Open
azhivovk wants to merge 1 commit into RedHatQE:main from azhivovk:fix_sriov_fail

Conversation

@azhivovk
Contributor

@azhivovk azhivovk commented Apr 29, 2026

test_connectivity_of_hot_plugged_sriov_interface is failing with:
VMInterfaceStatusNotFoundError: Network interface named sriov-hot-plug-test-network was not found in VM sriov-hot-plug-vm1
The must-gather logs for VM1 are missing because the VM is torn down early, which makes it difficult to debug the failure and find its root cause.

VM1 is torn down by the __exit__ of the Resource class (which VirtualMachineForTests inherits from).
Context managers unwind as the exception propagates up the stack, before any outer exception handler (such as pytest's hook) can see it.
Currently, the same fixture creates the VM (via with VirtualMachineForTests(...)) and performs the hot-plug. When the hot-plug fails, exception propagation triggers VirtualMachineForTests.__exit__ (inherited from Resource), which deletes the VM before the exception reaches pytest's pytest_exception_interact hook. As a result, must-gather runs after the VM is already deleted and cannot capture its state.

The fix is to use separate fixtures for VM creation and SR-IOV interface hot-plug, so that must-gather runs right after the setup error, while the VM still exists.
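
A minimal sketch of the ordering described above, using plain Python generators in place of pytest fixtures. FakeVM, hot_plug, and the "hook sees error" log entry are hypothetical stand-ins (not the repository's classes or pytest's real hook), and run_split only approximates how a runner would keep an already set-up fixture alive while a later fixture fails.

```python
class FakeVM:
    """Hypothetical stand-in for a Resource-style context manager."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __enter__(self):
        self.log.append(f"create {self.name}")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append(f"delete {self.name}")
        return False  # never swallow the exception


def hot_plug(vm):
    # Simulated hot-plug failure.
    raise RuntimeError(f"interface not found in {vm.name}")


def combined_fixture(log):
    # Old shape: VM creation and hot-plug share one `with` block.
    with FakeVM("vm1", log) as vm:
        hot_plug(vm)
        yield vm


def vm_fixture(log):
    # New shape: this fixture only creates the VM.
    with FakeVM("vm1", log) as vm:
        yield vm


def run_combined():
    log = []
    try:
        next(combined_fixture(log))
    except RuntimeError:
        log.append("hook sees error")  # stands in for pytest_exception_interact
    return log


def run_split():
    log = []
    vm_gen = vm_fixture(log)
    vm = next(vm_gen)  # VM fixture setup is done; the VM stays alive
    try:
        hot_plug(vm)  # the separate hot-plug fixture fails during setup
    except RuntimeError:
        log.append("hook sees error")
    finally:
        vm_gen.close()  # teardown of the VM fixture happens afterwards
    return log


print(run_combined())
print(run_split())
```

In the combined shape the "delete vm1" entry is logged before the handler runs; in the split shape it is logged after, which is the window in which must-gather could still inspect the VM.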

Summary by CodeRabbit

  • Tests
    • Refactored SR-IOV hot-plug tests to remove a shared helper and split responsibilities into separate per-VM setup and hot-plug fixtures.
    • Updated test setup so VMs are created and SR-IOV interfaces are attached via distinct steps, improving clarity and fixture dependency flow.
    • Normalized network reference formatting used during hot-plugging.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

Removed a shared context-manager helper that created a VM and hot-plugged an SR-IOV interface; tests were refactored to create the VM and perform SR-IOV hot-plugging via separate pytest fixtures and updated network-attachment formatting.

Changes

SR-IOV hot-plug test refactor

| Layer | File(s) | Summary |
|---|---|---|
| Helper removal | tests/network/l2_bridge/libl2bridge.py | Deleted create_vm_with_hot_plugged_sriov_interface(namespace_name, vm_name, sriov_network_for_hot_plug, ipv4_address, client). |
| VM creation fixture | tests/network/l2_bridge/test_bridge_nic_hot_plug.py | Added sriov_hot_plug_vm1(namespace, unprivileged_client) and sriov_hot_plug_vm2(namespace, unprivileged_client) that call create_vm_for_hot_plug(...) to produce VMs for hot-plugging. |
| Hot-plug wiring / address assignment | tests/network/l2_bridge/test_bridge_nic_hot_plug.py | Reworked vm1_with_hot_plugged_sriov_interface(...) and vm2_with_hot_plugged_sriov_interface(...) to depend on sriov_hot_plug_vm1/sriov_hot_plug_vm2 and to call hot_plug_interface_and_set_address(..., sriov=True) with net_attach_def_name formatted as "{namespace.name}/{sriov_network_for_hot_plug.name}". |
| Imports / module wiring | tests/network/l2_bridge/test_bridge_nic_hot_plug.py | Removed import of create_vm_with_hot_plugged_sriov_interface from tests.network.l2_bridge.libl2bridge and adjusted fixture signatures accordingly. |
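
For illustration, the "{namespace.name}/{sriov_network_for_hot_plug.name}" format mentioned above could be produced as in the sketch below. The helper name nad_reference and the namespace name "test-ns" are assumptions made for this example; only the network name is taken from the error message in the PR description.

```python
from types import SimpleNamespace


def nad_reference(namespace, nad):
    """Build a namespaced '<namespace>/<name>' reference (hypothetical helper)."""
    return f"{namespace.name}/{nad.name}"


# Stand-in objects exposing only a .name attribute, as the format string requires.
namespace = SimpleNamespace(name="test-ns")
sriov_network_for_hot_plug = SimpleNamespace(name="sriov-hot-plug-test-network")

print(nad_reference(namespace, sriov_network_for_hot_plug))
```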

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title directly reflects the main change: splitting setup for the test_connectivity_of_hot_plugged_sriov_interface test by separating VM creation and SR-IOV hot-plug fixtures. |
| Description check | ✅ Passed | The description clearly explains the problem (VM teardown before must-gather), root cause, and the fix (separate fixtures), but lacks some template sections like jira-ticket and special reviewer notes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

@openshift-virtualization-qe-bot-3
Contributor

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: A tracking issue is created for this PR and will be closed when the PR is merged or closed
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: Enabled categories: branch, can-be-merged, cherry-pick, has-conflicts, hold, needs-rebase, size, verified, wip

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest verify-bugs-are-open - verify-bugs-are-open
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 2 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
  5. Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

  • EdDev

Reviewers:

  • Anatw
  • EdDev
  • azhivovk
  • frenzyfriday
  • nirdothan
  • orelmisan
  • servolkov
  • yossisegev

Available Labels

  • hold
  • verified
  • wip
  • lgtm
  • approve

AI Features

  • Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6[1m])

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

coderabbitai Bot previously approved these changes Apr 29, 2026
@azhivovk
Contributor Author

/verified
uv run pytest --verbose -rs -s -o log_cli=true --leftovers-collector --tb=native --skip-deprecated-api-test --cluster-sanity-skip-storage-check tests/network/l2_bridge/test_bridge_nic_hot_plug.py::TestHotPlugInterfaceToVmWithOnlyPrimaryInterface::test_connectivity_of_hot_plugged_sriov_interface
Result: 1 passed, 1 deselected, 2 warnings in 632.32s (0:10:32)

@openshift-virtualization-qe-bot-3
Contributor

@coderabbitai

Test execution plan request details

CRITICAL: You MUST post an inline review comment on the first changed line of the first file.
The inline comment should contain the Test Execution Plan with the smoke tests recommendation.
Do NOT submit a formal review - just post the inline comment directly.

As an expert software testing engineer, analyze all modified files in this PR and create a targeted test execution plan.
You will post an inline review comment with the test execution plan on the first changed file.
If you fail to run or post a comment, retry.

Analysis Requirements:

  1. Examine code changes in each modified file
  2. Identify affected code paths, functions, and classes
  3. Analyze pytest-specific elements: fixtures (scope, dependencies), parametrization, markers, conftest changes
  4. Trace test dependencies through imports, shared utilities, and fixture inheritance
  5. Detect new tests introduced in the PR
  6. Smoke Test Impact Analysis: Determine if any changes could affect smoke tests by checking:
    • Changes to files/functions used by tests marked with @pytest.mark.smoke
    • Changes to fixtures or utilities imported by smoke tests
    • Changes to conftest.py files that may affect smoke test execution
    • Changes to core infrastructure code (utilities/, libs/) that smoke tests depend on

Your deliverable:
Your change request comment will be based on the following requirements:

Test Execution Plan

  • Run smoke tests: True / False (Based on your smoke test impact analysis - True if any changes may affect smoke tests)
  • path/to/test_file.py - When the entire test file needs verification
  • path/to/test_file.py::TestClass::test_method - When specific test(s) needed
  • path/to/test_file.py::test_function - When specific test(s) needed
  • -m marker - When specific marker(s) can be used to cover multiple cases.

Guidelines:

  • Include only tests directly affected by the changes
  • Use a full file path only if ALL tests in that file require verification
  • Use file path + test name if only specific tests are needed
  • If a test marker can cover multiple files/tests, provide the marker
  • Balance coverage vs over-testing - Keep descriptions minimal

Hardware-Related Checks (SR-IOV, GPU, DPDK):

When PR modifies fixtures for hardware-specific resources:

  • Collection Safety: Fixtures MUST have existence checks (return None when hardware unavailable)
  • Test Plan: MUST verify both WITH and WITHOUT hardware:
    • Run affected tests on cluster WITH hardware
    • Verify collection succeeds on cluster WITHOUT hardware

CRITICAL WORKFLOW COMPLETION RULES:

When responding to this test execution plan request, you MUST follow these rules EXACTLY:

  1. YOUR ONLY DELIVERABLE: Post an inline review comment containing the test execution plan on the first changed line
  2. THEN STOP IMMEDIATELY - Do NOT generate any additional response
  3. FALLBACK ONLY: If submitting the review fails after retrying, post as a regular PR comment
  4. SILENCE = SUCCESS: After successfully submitting the review, your task is complete. No confirmation needed.

ABSOLUTE PROHIBITIONS (violating these creates empty/meaningless reviews):

  • ❌ Do NOT post acknowledgment messages like "Test execution plan posted", "Review posted successfully", "I've successfully posted"
  • ❌ Do NOT mention review IDs, URLs, or confirmation of posting in the PR thread
  • ❌ Do NOT add any follow-up comments after submitting the review
  • ❌ Do NOT reply to confirm task completion
  • ❌ Do NOT explain what you did - just do it and stop

Remember: The pull request review is visible to users. Additional comments are redundant noise.

CRITICAL: Before setting "Run smoke tests: True", you MUST use shell scripts to:
- Trace the actual fixture dependency chain from smoke tests to changed fixtures
- Verify that smoke tests actually import/use changed utilities or functions
- Confirm the dependency path exists; do NOT assume based on scope or semantics
- Be conservative: Session-scoped fixtures or infrastructure-sounding names do NOT
automatically mean smoke test impact. Only flag smoke test impact when you can
demonstrate a concrete dependency path.
WRONG: "This session-scoped storage fixture might affect smoke tests"
RIGHT: "Smoke test X uses fixture Y, which depends on the changed fixture Z"
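
The first step of the dependency check described above could be approximated as in the following sketch. It is only a rough, assumption-laden illustration: the function name smoke_files_referencing is hypothetical, and it finds only direct textual references in files that contain the smoke marker. It does not follow transitive chains through conftest.py or intermediate fixtures, which would still need to be traced by hand.

```python
import re
from pathlib import Path

SMOKE_MARK = "pytest.mark.smoke"


def smoke_files_referencing(root, changed_names):
    """Map each smoke-marked file under root to the changed names it mentions."""
    hits = {}
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if SMOKE_MARK not in text:
            continue  # not a smoke test file
        # Whole-word match, so a name does not match a longer identifier.
        found = sorted(
            name
            for name in changed_names
            if re.search(rf"\b{re.escape(name)}\b", text)
        )
        if found:
            hits[path.relative_to(root).as_posix()] = found
    return hits
```

An empty result supports "Run smoke tests: False" only for direct references; a non-empty result names a concrete file-to-symbol path that can be quoted in the plan.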

Comment thread tests/network/l2_bridge/libl2bridge.py
@coderabbitai
Contributor

coderabbitai Bot commented Apr 29, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get PR diff to find changed files and line numbers
gh pr diff 4660 --repo RedHatQE/openshift-virtualization-tests

echo "---SMOKE TESTS---"
# Check if any smoke tests import or use libl2bridge
rg "pytest.mark.smoke" --type=py -l

echo "---L2BRIDGE IMPORTS---"
# Check if libl2bridge is imported by smoke tests
rg "libl2bridge|hot_plug_interface|wait_for_vm_interfaces" --type=py -l

echo "---SMOKE TEST FILES---"
rg "pytest.mark.smoke" --type=py -B5 | grep "def test_" | head -30

Length of output: 2078


🏁 Script executed:

#!/bin/bash
# Get the latest commit SHA for the PR
COMMIT_SHA=$(gh pr view 4660 --repo RedHatQE/openshift-virtualization-tests --json headRefOid -q '.headRefOid')
echo "Commit SHA: $COMMIT_SHA"

# Post inline review comment on line 29 of libl2bridge.py (first changed line)
gh api \
  --method POST \
  -H "Accept: application/vnd.github+json" \
  /repos/RedHatQE/openshift-virtualization-tests/pulls/4660/reviews \
  -f "commit_id=$COMMIT_SHA" \
  -f 'event=COMMENT' \
  -f 'comments[0][path]=tests/network/l2_bridge/libl2bridge.py' \
  -F 'comments[0][line]=29' \
  -f 'comments[0][side]=RIGHT' \
  -f 'comments[0][body]=**Test Execution Plan**

- **Run smoke tests: False** _(No dependency path found from smoke tests to `libl2bridge.py` or `wait_for_vm_interfaces` in the hot-plug context)_

- `tests/network/l2_bridge/test_bridge_nic_hot_plug.py::TestHotPlugInterfaceToVmWithOnlyPrimaryInterface::test_connectivity_of_hot_plugged_sriov_interface` — directly exercises the modified `hot_plug_interface` flow (SR-IOV cluster required)'

Length of output: 1396


🏁 Script executed:

#!/bin/bash
COMMIT_SHA="bc4cb703f4fb86a063b36ecb6a834c0df471ccdd"

BODY="**Test Execution Plan**

- **Run smoke tests: False** _(No dependency path from smoke tests to \`libl2bridge.py\` or \`wait_for_vm_interfaces\` verified)_

- \`tests/network/l2_bridge/test_bridge_nic_hot_plug.py::TestHotPlugInterfaceToVmWithOnlyPrimaryInterface::test_connectivity_of_hot_plugged_sriov_interface\` — directly exercises the modified \`hot_plug_interface\` flow (SR-IOV cluster required)"

python3 - <<'EOF'
import subprocess, json

commit_sha = "bc4cb703f4fb86a063b36ecb6a834c0df471ccdd"
body = """**Test Execution Plan**

- **Run smoke tests: False** _(No dependency path from smoke tests to `libl2bridge.py` or `wait_for_vm_interfaces` verified)_

- `tests/network/l2_bridge/test_bridge_nic_hot_plug.py::TestHotPlugInterfaceToVmWithOnlyPrimaryInterface::test_connectivity_of_hot_plugged_sriov_interface` — directly exercises the modified `hot_plug_interface` flow (SR-IOV cluster required)"""

payload = {
    "commit_id": commit_sha,
    "event": "COMMENT",
    "comments": [
        {
            "path": "tests/network/l2_bridge/libl2bridge.py",
            "line": 29,
            "side": "RIGHT",
            "body": body
        }
    ]
}

result = subprocess.run(
    ["gh", "api", "--method", "POST",
     "-H", "Accept: application/vnd.github+json",
     "/repos/RedHatQE/openshift-virtualization-tests/pulls/4660/reviews",
     "--input", "-"],
    input=json.dumps(payload).encode(),
    capture_output=True
)
print(result.stdout.decode())
print(result.stderr.decode())
EOF

Length of output: 1737



🧠 Learnings used
Learnt from: rnetser
Repo: RedHatQE/openshift-virtualization-tests PR: 0
File: :0-0
Timestamp: 2025-12-16T10:28:54.212Z
Learning: In the openshift-virtualization-tests repository, when determining smoke test impact for test execution plans, only set "Run smoke tests: True" if there is a verified, traceable dependency path from smoke tests to the changed code. Session-scoped fixtures or infrastructure-sounding changes do NOT automatically imply smoke test impact - the dependency chain must be explicitly verified using shell scripts before recommending smoke test execution.

Learnt from: azhivovk
Repo: RedHatQE/openshift-virtualization-tests PR: 4578
File: tests/network/l2_bridge/nad_ref_change/test_nad_ref_change.py:64-79
Timestamp: 2026-04-26T11:48:11.694Z
Learning: In `tests/network/l2_bridge/nad_ref_change/test_nad_ref_change.py`, the negative test `test_non_migratable_vm_nad_change_not_applied` uses connectivity-only Expected assertions (not a rejection signal) because the NAD live-update feature does NOT expose a rejection/error signal for non-migratable VMs. The only verifiable outcome is that the VM remains connected to the original network with no connectivity to the target VLAN. Do not flag the absence of a rejection-signal assertion in this test as a defect.

Learnt from: rnetser
Repo: RedHatQE/openshift-virtualization-tests PR: 3062
File: conftest.py:333-333
Timestamp: 2025-12-16T20:11:03.645Z
Learning: In the openshift-virtualization-tests repository, when determining smoke test impact for changes affecting py_config["os_login_param"], follow this verification methodology: (1) Find all smoke tests: `rg "pytest.mark.smoke" --type=py -B2 | grep "def test_"`, (2) For each smoke test file, search for VM creation patterns: `rg "VirtualMachineForTests|running_vm|VirtualMachineForTestsFromTemplate|wait_for_ssh|check_ssh_connectivity"`, (3) Trace the dependency chain: smoke test → VirtualMachineForTests/running_vm() → wait_for_ssh_connectivity() (default enabled) → vm.login_params property → py_config["os_login_param"][vm.os_flavor], (4) Check utilities/virt.py for login_params usage: `rg "os_login_param|login_params" utilities/virt.py -C3`. Any smoke test creating VMs with default SSH connectivity checks (running_vm with check_ssh_connectivity=True) depends on os_login_param, even if the test doesn't directly reference it.

Learnt from: jpeimer
Repo: RedHatQE/openshift-virtualization-tests PR: 0
File: :0-0
Timestamp: 2025-12-22T15:56:00.157Z
Learning: In the openshift-virtualization-tests repository, when responding to test execution plan requests from openshift-virtualization-qe-bot-3, do NOT use "REQUEST_CHANGES" review type if the PR author has already marked the PR as verified (e.g., with `/verified` command). Test execution plans are informational guides, not blocking requirements. Use COMMENT event for informational test plans, or only REQUEST_CHANGES if there are actual code issues that need to be addressed before merging.

Learnt from: azhivovk
Repo: RedHatQE/openshift-virtualization-tests PR: 4147
File: tests/network/upgrade/test_upgrade_network.py:166-177
Timestamp: 2026-03-19T10:37:02.008Z
Learning: In the openshift-virtualization-tests repository, unused fixture parameters in test methods (e.g., `bridge_on_one_node` in `tests/network/upgrade/test_upgrade_network.py`) that exist purely for pytest dependency ordering should not be flagged for removal, and no follow-up issues need to be opened for them. This is an intentional pattern, consistent with the analogous convention for fixture definitions in this codebase.

Learnt from: rnetser
Repo: RedHatQE/openshift-virtualization-tests PR: 0
File: :0-0
Timestamp: 2025-09-29T19:05:24.987Z
Learning: For PR `#1904` test execution, the critical validation point is test_connectivity_over_migration_between_localnet_vms which should fail gracefully on cloud clusters but pass on bare-metal/PSI clusters, representing the core nmstate conditional logic functionality.

Learnt from: rnetser
Repo: RedHatQE/openshift-virtualization-tests PR: 0
File: :0-0
Timestamp: 2025-12-16T14:00:59.076Z
Learning: In the openshift-virtualization-tests repository, when responding to test execution plan requests from openshift-virtualization-qe-bot-3, CodeRabbit must post ONLY an inline review comment on the Files Changed tab and then stop immediately without generating any follow-up comments in the PR discussion thread. No acknowledgment messages, no confirmation of posting, no explanation - silence after posting the inline review equals success. Additional comments create empty/meaningless reviews that clutter the PR.

Learnt from: rnetser
Repo: RedHatQE/openshift-virtualization-tests PR: 3062
File: conftest.py:333-333
Timestamp: 2025-12-16T15:09:49.597Z
Learning: In the openshift-virtualization-tests repository, when conftest.py or utilities/bitwarden.py changes affect py_config["os_login_param"], smoke test impact must be determined by: (1) finding all smoke tests using `rg "pytest.mark.smoke"`, (2) checking each for VM creation patterns (VirtualMachineForTests, running_vm, VirtualMachineForTestsFromTemplate), (3) tracing whether running_vm is called with default check_ssh_connectivity=True, which accesses vm.login_params property that reads py_config["os_login_param"][vm.os_flavor]. The dependency chain is: smoke test → VM creation → running_vm → wait_for_ssh_connectivity → vm.login_params → os_login_param. Any smoke test creating VMs with SSH connectivity (the default) depends on os_login_param.

Learnt from: azhivovk
Repo: RedHatQE/openshift-virtualization-tests PR: 3598
File: tests/network/sriov/conftest.py:62-71
Timestamp: 2026-01-28T17:36:14.188Z
Learning: In the RedHatQE/openshift-virtualization-tests repository, sanity checks validate that the cluster supports at least one IP family (IPv4 or IPv6) before SR-IOV tests run. Therefore, defensive guards checking for empty IP addresses in SR-IOV VM fixtures (like `sriov_vm` in tests/network/sriov/conftest.py) are not necessary—if neither `ipv4_supported_cluster` nor `ipv6_supported_cluster` is True, the cluster is considered broken and will fail sanity validation.

Learnt from: yossisegev
Repo: RedHatQE/openshift-virtualization-tests PR: 3873
File: tests/network/localnet/test_non_udn_localnet.py:7-9
Timestamp: 2026-02-25T10:52:23.378Z
Learning: In the RedHatQE/openshift-virtualization-tests repository, networking infrastructure requirements such as nmstate, localnet bridge mappings, and NIC availability are NOT considered special/platform-specific infrastructure that would warrant a tier3 marker. These are standard test environment capabilities in this repository's test environment, and tests requiring them do not automatically qualify as tier3 unless they involve other truly platform-specific/complex/time-consuming characteristics beyond standard network test infrastructure.

Learnt from: rnetser
Repo: RedHatQE/openshift-virtualization-tests PR: 0
File: :0-0
Timestamp: 2025-12-16T14:06:22.391Z
Learning: In the openshift-virtualization-tests repository, when posting test execution plan inline review comments using GitHub API, the full test execution plan content must go in the `comments[].body` field (which appears on Files Changed tab), NOT in the top-level `body` field (which appears in PR discussion thread). The top-level `body` field should be omitted or left empty to avoid posting redundant comments in the PR discussion thread.

Learnt from: rnetser
Repo: RedHatQE/openshift-virtualization-tests PR: 0
File: :0-0
Timestamp: 2025-09-29T19:05:24.987Z
Learning: The test execution plan for PR `#1904` focuses on cluster-type conditional logic where nmstate functionality is bypassed on cloud clusters (Azure/AWS) but fully functional on bare-metal/PSI clusters, requiring different test strategies for each environment type.

Learnt from: CR
Repo: RedHatQE/openshift-virtualization-tests PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-27T15:43:46.511Z
Learning: Applies to tests/**/*.py : New feature tests MUST follow STD-first workflow: STP review, then STD review, then implementation. NEVER submit test implementation without prior STD review

Learnt from: yossisegev
Repo: RedHatQE/openshift-virtualization-tests PR: 3358
File: tests/network/sriov/test_sriov.py:21-21
Timestamp: 2026-01-07T09:52:22.771Z
Learning: For PRs that only remove or modify pytest markers (like removing pytest.mark.post_upgrade) without changing test logic in openshift-virtualization-tests, collection verification using `pytest --collect-only` is sufficient. Full test execution is not required to verify marker-only changes.

Learnt from: EdDev
Repo: RedHatQE/openshift-virtualization-tests PR: 3612
File: tests/network/user_defined_network/ip_specification/test_ip_specification.py:10-10
Timestamp: 2026-01-26T20:29:54.623Z
Learning: In the RedHatQE/openshift-virtualization-tests repository, when introducing STD (Standard Test Definition) PRs that define test case structure without implementation, it's acceptable and intentional to use `__test__ = False` to block pytest collection until the tests are fully implemented in a follow-up PR. The STD workflow involves: (1) defining test structure, markers, and documentation first, (2) implementing the actual test logic and removing the collection blocker in a subsequent PR.

Learnt from: dshchedr
Repo: RedHatQE/openshift-virtualization-tests PR: 4118
File: utilities/database.py:0-0
Timestamp: 2026-03-17T01:32:13.674Z
Learning: In RedHatQE/openshift-virtualization-tests, maintainer dshchedr prefers CodeRabbit to post targeted inline comments on the Files changed tab for each applicable location rather than aggregating multiple issues into a single discussion thread.

Learnt from: dshchedr
Repo: RedHatQE/openshift-virtualization-tests PR: 4118
File: utilities/database.py:0-0
Timestamp: 2026-03-17T01:32:13.674Z
Learning: In RedHatQE/openshift-virtualization-tests, maintainer dshchedr prefers CodeRabbit to post targeted inline comments on the Files Changed tab at each applicable location rather than aggregating multiple issues into a single PR discussion thread reply.

Learnt from: servolkov
Repo: RedHatQE/openshift-virtualization-tests PR: 1776
File: libs/net/node_network.py:25-31
Timestamp: 2025-08-20T23:43:28.117Z
Learning: In the RedHatQE/openshift-virtualization-tests project, servolkov's team always uses bare metal (BM) clusters with IPv4 setup in their testing environment, making defensive checks for IPv4 data presence potentially redundant in their networking code.

Learnt from: rnetser
Repo: RedHatQE/openshift-virtualization-tests PR: 3371
File: scripts/tests_analyzer/compare_coderabbit_decisions.py:199-289
Timestamp: 2026-01-13T10:06:14.822Z
Learning: In the openshift-virtualization-tests repository, user rnetser prefers to keep pagination loops inline rather than extracting them into generic helper functions when the loops have different URL patterns and unique post-processing logic, as the inline approach improves readability and makes each endpoint's behavior more explicit.

Learnt from: rnetser
Repo: RedHatQE/openshift-virtualization-tests PR: 3228
File: .coderabbit.yaml:30-41
Timestamp: 2026-01-05T10:33:55.037Z
Learning: In the openshift-virtualization-tests repository, user rnetser prefers minimal pre-merge checks in CodeRabbit configuration: only docstrings enforcement (80% threshold) is needed, not title or description checks.

Learnt from: rnetser
Repo: RedHatQE/openshift-virtualization-tests PR: 2469
File: utilities/sanity.py:139-142
Timestamp: 2025-11-08T07:36:57.616Z
Learning: In the openshift-virtualization-tests repository, user rnetser prefers to keep refactoring PRs (like PR `#2469`) strictly focused on moving/organizing code into more granular modules without adding new functionality, error handling, or behavioral changes. Such improvements should be handled in separate PRs.

Learnt from: jpeimer
Repo: RedHatQE/openshift-virtualization-tests PR: 1160
File: tests/storage/storage_migration/test_mtc_storage_class_migration.py:165-176
Timestamp: 2025-06-17T07:45:37.776Z
Learning: In the openshift-virtualization-tests repository, user jpeimer prefers explicit fixture parameters over composite fixtures in test methods, even when there are many parameters, as they find this approach more readable and maintainable for understanding test dependencies.

Learnt from: jpeimer
Repo: RedHatQE/openshift-virtualization-tests PR: 3571
File: tests/storage/storage_migration/utils.py:158-167
Timestamp: 2026-01-25T13:18:26.819Z
Learning: In the openshift-virtualization-tests repository, user jpeimer prefers to avoid nitpicky style changes (e.g., removing `.keys()` from dict membership checks) because verifying every change is expensive. Be cautious about suggesting low-impact stylistic improvements that require verification overhead.

Learnt from: akri3i
Repo: RedHatQE/openshift-virtualization-tests PR: 1210
File: tests/virt/cluster/general/mass_machine_type_transition_tests/conftest.py:24-64
Timestamp: 2025-06-23T19:28:20.281Z
Learning: In OpenShift Virtualization mass machine type transition tests, the machine type glob pattern "pc-q35-rhel8.*.*" is intentionally hard-coded in the kubevirt_api_lifecycle_automation_job as it's used only once for this specific test case, with plans to update it in the future if the job needs to support other machine types.

Learnt from: RoniKishner
Repo: RedHatQE/openshift-virtualization-tests PR: 1411
File: utilities/os_utils.py:246-279
Timestamp: 2025-07-22T17:13:59.166Z
Learning: In the RedHatQE/openshift-virtualization-tests repository, CentOS preferences follow the format "centos-stream<version>" (e.g., "centos-stream9", "centos-stream10"). The generate_instance_type_centos_os_matrix function correctly uses regex to extract numeric versions and constructs the latest version string in the same format as the input preferences for proper comparison.

Learnt from: EdDev
Repo: RedHatQE/openshift-virtualization-tests PR: 3649
File: tests/network/user_defined_network/ip_specification/conftest.py:32-48
Timestamp: 2026-01-29T15:01:54.833Z
Learning: In tests/network/user_defined_network/ip_specification/conftest.py and similar VM fixtures, calling `vm.wait_for_agent_connected()` after `vm.start()` is the correct and sufficient pattern. There is no need to add `wait_for_ready_status(status=True)` before it, because wait_for_agent_connected() already verifies the guest OS is running (not just powered on), which is a stronger validation than ready status.

Learnt from: EdDev
Repo: RedHatQE/openshift-virtualization-tests PR: 3930
File: tests/network/vmi/libippersistence.py:79-88
Timestamp: 2026-02-23T11:39:07.449Z
Learning: In tests/network/vmi/libippersistence.py, the monitor_vmi_interfaces helper function should accept an optional expected_interface_count parameter (default=2) to allow tests with different NIC configurations to reuse the monitoring logic without hardcoding the interface count.

Learnt from: vamsikrishna-siddu
Repo: RedHatQE/openshift-virtualization-tests PR: 2199
File: tests/storage/test_online_resize.py:108-113
Timestamp: 2025-09-28T14:43:07.181Z
Learning: In the openshift-virtualization-tests repo, PR #2199 depends on PR #2139 which adds architecture-specific OS_FLAVOR attributes to the Images.Cirros class (OS_FLAVOR_CIRROS for x86_64/ARM64, OS_FLAVOR_FEDORA for s390x), enabling conditional logic based on the underlying OS flavor in tests.

Learnt from: geetikakay
Repo: RedHatQE/openshift-virtualization-tests PR: 4113
File: utilities/virt.py:150-150
Timestamp: 2026-03-23T14:45:02.812Z
Learning: In utilities/virt.py, the condition `reported_names == expected_names` in `wait_for_vm_interfaces` uses intentional exact-set matching (not subset `<=`). This is by design because `status.interfaces` are derived from the VM spec via the KubeVirt guest agent, so extra interfaces are not expected in practice. Using subset matching would mask unexpected interfaces that should be investigated. The strict equality check is the correct invariant for this function.
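
A minimal sketch of the distinction in the learning above, using plain Python sets. The function names are hypothetical and are not the repository's helpers, and the direction of the subset check (expected names contained in reported names) is an assumption.

```python
def interfaces_match_exact(reported_names, expected_names):
    """Strict check: the reported interfaces are exactly the expected ones."""
    return set(reported_names) == set(expected_names)


def interfaces_match_subset(reported_names, expected_names):
    """Looser check (assumed direction): every expected interface is reported."""
    return set(expected_names) <= set(reported_names)


expected = {"default", "sriov-net"}
reported_with_extra = {"default", "sriov-net", "unexpected-net"}

# The subset check accepts the unexpected interface; the strict check flags it.
print(interfaces_match_subset(reported_with_extra, expected))  # True
print(interfaces_match_exact(reported_with_extra, expected))   # False
```

With the strict form, an extra interface surfaces as a failed wait instead of passing silently.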

Contributor

@yossisegev yossisegev left a comment

Thank you Asia, this should significantly contribute to our stability.

Comment thread tests/network/l2_bridge/libl2bridge.py Outdated

update_hot_plug_config_in_vm(vm=vm, interfaces=interfaces, networks=networks)

wait_for_vm_interfaces(vmi=vm.vmi)
Contributor

Even though we need to account for the time of the "shadow" migration in this wait, 12 minutes (wait_for_vm_interfaces's default timeout value) seems like a lot; if something went wrong, we'd better fail earlier.
@servolkov @Anatw @EdDev WDYT? Maybe I am not being realistic here?

Contributor Author

We could lower the timeout - according to the logs, it took 30-60 seconds for the interface to appear.

@azhivovk azhivovk changed the title from "net, stability, sriov: Wait for interface post hotplug" to "net, stability, sriov: Fix test_connectivity_of_hot_plugged_sriov_interface" Apr 30, 2026
@openshift-virtualization-qe-bot-5

New container for quay.io/openshift-cnv/openshift-virtualization-tests:pr-4660 published

@rnetser
Collaborator

rnetser commented May 3, 2026

During this timeout we expect the seamless migration post hot-plug to complete, the guest agent to connect, and the VMI status to be updated by the virt-handler. This is exactly the reason why you should have checks on the migration. You just increase the timeout without really knowing what is going on, and you have no clear visibility into potential failures during the migration that does not require digging into must-gather logs. You increase the timeout to 4 (!) minutes, while the migration could be stuck or failed and you would be waiting for nothing.
@EdDev @servolkov

@azhivovk @rnetser's comment here is valid. I am going over the 4.21 docs about interface hot-plug, and I see that we now explicitly declare that the hot-plug action is followed and completed by a migration. This means that users are aware of this background migration, so we can and should also consider it and wait for it to complete (AFAIR this wasn't always the case, but I am glad that we can stop ignoring this migration and rely on it). @EdDev @orelmisan Please keep me honest here. Thank you

@yossisegev In KubeVirt upstream tests we ignore the migration and just wait for the interfaces to appear under VMI.Status.Interfaces [1]. The migration is an implementation detail and intermediate - I think the test should verify the outcome (interface is available), not the mechanism used to achieve it.
[1] kubevirt/kubevirt#15209

Thanks for addressing it and responding @orelmisan

The migration is an implementation detail and intermediate - I think the test should verify the outcome (interface is available), not the mechanism used to achieve it.

Originally, that was my approach as well, but I think that the fact that we do refer to this migration in the user docs (see step 3 in the Procedure section, and the following Verification) implies that the migration is visible to the user, and therefore can and should also be considered in the tests.

@yossisegev I see your point about the docs mentioning the migration, but I think there's an important distinction here. In v4.20+, the migration is fully automatic - users don't initiate it, don't need permission to create VirtualMachineInstanceMigration objects, and don't need to be aware of it for the hot-plug to succeed. The 4.20 release notes explicitly call this out: "You can now hot plug and hot unplug a secondary network interface to a VM without manually triggering live migration."

If we're testing from the user's perspective, the user's action is "hot-plug an interface" and the expected outcome is "the interface is available." The migration is an internal mechanism the system uses to achieve that outcome. The docs may mention it for transparency, but the user doesn't need to interact with it or verify it separately.

Adding migration checks to this test would couple it to an implementation detail that could change and would make the test verify how the system works rather than what it delivers to the user.

My point here is that while we test from the user's perspective, we do need to keep in mind that when something fails we need to have the proper data to review and analyze the failure.
You are not testing migration; you are adding a checkpoint to save time.
I am raising this because my comment was added when the timeout was raised from 2 to 4 minutes for the sake of (IIUC) "let's give it more time, hopefully it will be OK".
Now I see it is down to 1 minute; what happens in slower environments? Will it work?
It is up to the team to decide how they want to handle this.
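
For illustration only, a minimal Python sketch of the two positions in this thread: wait for the outcome (the interface is reported), with an optional checkpoint that fails early when the migration is known to have failed. Everything here is hypothetical: `wait_for_interface`, `get_interface_names` and `get_migration_phase` are stand-in names, not helpers from this repository, and the "Failed" phase string is an assumption about what the checkpoint would compare against.

```python
import time


def wait_for_interface(get_interface_names, name, timeout=60.0, interval=1.0, get_migration_phase=None):
    """Poll until `name` is reported; fail early if the optional checkpoint trips.

    All callables are hypothetical stand-ins supplied by the caller.
    """
    deadline = time.monotonic() + timeout
    while True:
        if name in get_interface_names():
            return  # outcome reached: the interface is reported
        # Optional checkpoint: stop waiting once the migration is known to have failed.
        if get_migration_phase is not None and get_migration_phase() == "Failed":
            raise RuntimeError(f"migration failed while waiting for interface {name!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"interface {name!r} not reported within {timeout}s")
        time.sleep(interval)


# Usage with fake data: the interface shows up on the third poll.
reports = iter([["default"], ["default"], ["default", "sriov-net"]])
wait_for_interface(lambda: next(reports), "sriov-net", timeout=5, interval=0)
```

Without `get_migration_phase`, the helper verifies only the outcome; with it, a failed migration ends the wait immediately instead of consuming the whole timeout.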

@azhivovk
Contributor Author

azhivovk commented May 3, 2026

@rnetser Regarding the 1 minute - I put this PR on WIP so please ignore this timeout change.

I wanted to run these changes in the pytest runner to see whether the fixture change helps must-gather collect logs at the correct point, since in the must-gather logs from the CI failure, VM1 (where the hot-plug failed) was torn down before must-gather took action and we couldn't analyze the logs properly.

We agreed with @EdDev to merge the fixtures change if it really helps with the logs and then we could get logs regarding the failure of the sriov hotplug in the next run in CI.
I'll update the PR accordingly.

@azhivovk azhivovk changed the title from "WIP: net, stability, sriov: Fix test_connectivity_of_hot_plugged_sriov_interface" to "WIP: net, stability, sriov: Split setup test_connectivity_of_hot_plugged_sriov_interface" May 3, 2026
@azhivovk
Contributor Author

azhivovk commented May 3, 2026

Must-gather logs on the failed test in CI - VM1 is torn down early and logs are missing:

Console:

23:03:56  2026-04-29T20:03:55.067113+00:00 ocp_resources VirtualMachineInstance INFO Wait until VirtualMachineInstance sriov-hot-plug-vm1-1777493018-2466836 is created
......
23:06:29  2026-04-29T20:06:29.549189+00:00 ocp_resources VirtualMachine INFO Wait until VirtualMachine sriov-hot-plug-vm1-1777493018-2466836 is deleted
......
23:06:30  TEST: TestHotPlugInterfaceToVmWithOnlyPrimaryInterface.test_connectivity_of_hot_plugged_sriov_interface [setup] STATUS: ERROR
......
23:06:30  2026-04-29T20:06:30.581785 conftest INFO [DATA_COLLECTOR] Must-gather collection is enabled for /openshift-virtualization-tests/tests/network/l2_bridge/test_bridge_nic_hot_plug.py::test_connectivity_of_hot_plugged_sriov_interface.


Must-gather:

$ omc get vmi -A
NAMESPACE                            NAME                                     AGE   PHASE     IP            NODENAME                                  READY   LIVE-MIGRATABLE   PAUSED
l2-bridge-test-bridge-nic-hot-plug   utility-hot-plug-vm-1777492661-4777932   3d    Running   10.129.0.44   cnv-qe-14.cnvqe.lab.eng.rdu2.redhat.com   True    True    

With the current fix, we do collect the failed VM1 logs right after the error (with an intentional failure in lookup_iface_status), and VM1 is not deleted before must-gather runs:
Console:

17:29:57  2026-05-03T14:29:57.147168+00:00 ocp_resources VirtualMachineInstance INFO Wait until VirtualMachineInstance sriov-hot-plug-vm1-1777818564-2637753 is created
.....
17:30:53  TEST: TestHotPlugInterfaceToVmWithOnlyPrimaryInterface.test_connectivity_of_hot_plugged_sriov_interface [setup] STATUS: ERROR
......
17:30:53  2026-05-03T14:30:50.334805 conftest INFO [DATA_COLLECTOR] Must-gather collection is enabled for /openshift-virtualization-tests/tests/network/l2_bridge/test_bridge_nic_hot_plug.py::test_connectivity_of_hot_plugged_sriov_interface.

Must-gather:

$ omc get vmi -A
NAMESPACE                            NAME                                    AGE   PHASE     IP             NODENAME                           READY   LIVE-MIGRATABLE   PAUSED
l2-bridge-test-bridge-nic-hot-plug   sriov-hot-plug-vm1-1777818564-2637753   1h    Running   10.129.0.186   cnvqe-12.lab.eng.tlv2.redhat.com   True    True       

@azhivovk
Contributor Author

azhivovk commented May 3, 2026

Change: Drop all changes and keep fixtures change only

@azhivovk
Contributor Author

azhivovk commented May 3, 2026

/wip cancel

@openshift-virtualization-qe-bot openshift-virtualization-qe-bot changed the title from "WIP: net, stability, sriov: Split setup test_connectivity_of_hot_plugged_sriov_interface" to "net, stability, sriov: Split setup test_connectivity_of_hot_plugged_sriov_interface" May 3, 2026
test_connectivity_of_hot_plugged_sriov_interface is failing:
VMInterfaceStatusNotFoundError: Network interface named
sriov-hot-plug-test-network was not found in VM
sriov-hot-plug-vm1
And VM1 must-gather logs are missing due to early teardown,
which makes it difficult to debug and understand the root
cause of the failure.

VM1 is torn down by the __exit__ of the Resource class
(which VirtualMachineForTests inherits from).
Context managers unwind as the exception propagates through the
stack, before any outer exception handler (pytest's hook)
can see it.
Currently the same fixture creates the VM
(via with VirtualMachineForTests(...)) and performs the hot-plug.
When hot-plug fails, Python's exception propagation triggers
VirtualMachineForTests.__exit__ (inherited from Resource),
which deletes the VM before the exception reaches pytest's
pytest_exception_interact hook. As a result, must-gather
runs after the VM is already deleted and cannot capture its state.

The fix is to use separate fixtures for VM creation and
SR-IOV interface hotplug - then must-gather runs right after
setup error.

Signed-off-by: Asia Khromov <azhivovk@redhat.com>
Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com>
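
The ordering described in the commit message can be reproduced with a small, self-contained Python sketch. It is an illustration under assumptions, not the repository's code: `fake_vm` stands in for `VirtualMachineForTests`, the "collect logs" step stands in for must-gather collection on failure, and the `ExitStack` stands in for the test runner tearing down an already-yielded fixture later.

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def fake_vm(name, events):
    """Stand-in for a VM resource: created on enter, deleted on exit."""
    events.append(f"create {name}")
    try:
        yield name
    finally:
        events.append(f"delete {name}")


def failing_hot_plug(vm):
    raise RuntimeError(f"hot-plug failed on {vm}")


def run_combined():
    """One step creates the VM and hot-plugs inside the same with-block."""
    events = []
    try:
        with fake_vm("vm1", events) as vm:
            failing_hot_plug(vm)
    except RuntimeError:
        # Stand-in for the failure hook: __exit__ has already deleted the VM.
        events.append("collect logs")
    return events


def run_split():
    """VM creation and hot-plug are separate steps, as with separate fixtures."""
    events = []
    with ExitStack() as teardown:  # stand-in for the runner's deferred teardown
        vm = teardown.enter_context(fake_vm("vm1", events))  # first step has yielded
        try:
            failing_hot_plug(vm)  # second step fails
        except RuntimeError:
            events.append("collect logs")  # the VM still exists at this point
    return events


print(run_combined())  # ['create vm1', 'delete vm1', 'collect logs']
print(run_split())     # ['create vm1', 'collect logs', 'delete vm1']
```

In the combined variant the exception has to leave the `with` block before anything outside can observe it, so the delete always comes first; in the split variant the failure happens outside the block that owns the VM.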
@azhivovk
Contributor Author

azhivovk commented May 3, 2026

Change: edit commit message

@openshift-virtualization-qe-bot-2
Contributor

Clean rebase detected — no code changes compared to previous head (2b3033d).

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/network/l2_bridge/test_bridge_nic_hot_plug.py`:
- Around line 419-437: sriov_hot_plug_vm2 and
vm2_with_hot_plugged_sriov_interface duplicate the vm1 pair; refactor by
extracting a reusable factory/helper that creates the hot-plug VM fixture and a
corresponding helper to attach the SR-IOV interface so both vm1 and vm2 fixtures
call the same functions with different parameters (VM name suffix, consumed
network fixture and index_number seed); update the fixtures sriov_hot_plug_vm2
and vm2_with_hot_plugged_sriov_interface to delegate to that helper (referencing
sriov_hot_plug_vm2, vm1 equivalents, hot_plug_interface_and_set_address) so
duplication is removed while preserving behavior and fixture signatures.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4916816e-bf70-44ef-8eaa-e8eb6c16ee3e

📥 Commits

Reviewing files that changed from the base of the PR and between bc4cb70 and 5173855.

📒 Files selected for processing (2)
  • tests/network/l2_bridge/libl2bridge.py
  • tests/network/l2_bridge/test_bridge_nic_hot_plug.py
💤 Files with no reviewable changes (1)
  • tests/network/l2_bridge/libl2bridge.py

Comment on lines +419 to +437
def sriov_hot_plug_vm2(namespace, unprivileged_client):
    with create_vm_for_hot_plug(
        namespace_name=namespace.name,
        vm_name=f"{SRIOV}-{HOT_PLUG_STR}-vm2",
        client=unprivileged_client,
    ) as vm:
        yield vm


@pytest.fixture()
def vm2_with_hot_plugged_sriov_interface(sriov_hot_plug_vm2, sriov_network_for_hot_plug, namespace, index_number):
    hot_plug_interface_and_set_address(
        vm=sriov_hot_plug_vm2,
        hot_plugged_interface_name=sriov_network_for_hot_plug.name,
        net_attach_def_name=f"{namespace.name}/{sriov_network_for_hot_plug.name}",
        ipv4_address=random_ipv4_address(net_seed=0, host_address=next(index_number)),
        sriov=True,
    )
    return sriov_hot_plug_vm2
Contributor

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Optional: sriov_hot_plug_vm2 / vm2_with_hot_plugged_sriov_interface are near-identical to the vm1 pair.

The only differences between the vm1 and vm2 pairs are the VM-name suffix and which fixture they consume. A shared factory helper could reduce duplication, e.g.:

💡 Optional DRY refactor
+def _make_sriov_hot_plug_vm(namespace, unprivileged_client, suffix):
+    with create_vm_for_hot_plug(
+        namespace_name=namespace.name,
+        vm_name=f"{SRIOV}-{HOT_PLUG_STR}-{suffix}",
+        client=unprivileged_client,
+    ) as vm:
+        yield vm
+
+
 @pytest.fixture()
-def sriov_hot_plug_vm1(namespace, unprivileged_client):
-    with create_vm_for_hot_plug(
-        namespace_name=namespace.name,
-        vm_name=f"{SRIOV}-{HOT_PLUG_STR}-vm1",
-        client=unprivileged_client,
-    ) as vm:
-        yield vm
+def sriov_hot_plug_vm1(namespace, unprivileged_client):
+    yield from _make_sriov_hot_plug_vm(namespace, unprivileged_client, suffix="vm1")
 
 
 @pytest.fixture()
-def sriov_hot_plug_vm2(namespace, unprivileged_client):
-    with create_vm_for_hot_plug(
-        namespace_name=namespace.name,
-        vm_name=f"{SRIOV}-{HOT_PLUG_STR}-vm2",
-        client=unprivileged_client,
-    ) as vm:
-        yield vm
+def sriov_hot_plug_vm2(namespace, unprivileged_client):
+    yield from _make_sriov_hot_plug_vm(namespace, unprivileged_client, suffix="vm2")

The same pattern applies to vm1_with_hot_plugged_sriov_interface / vm2_with_hot_plugged_sriov_interface.

Given the PR's primary goal is explicit fixture separation for diagnostics, keeping them separate is also a defensible choice. Treat this as purely optional.

🧰 Tools
🪛 Ruff (0.15.12)

[warning] 428-428: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)
