
fix: bypass Sedna network timeouts during local offline benchmarking #552

Open: karyxx wants to merge 2 commits into kubeedge:main from karyxx:feature/bypass-network-timeouts
@karyxx karyxx commented Jun 15, 2026


What type of PR is this?
/kind bug
/kind feature

What this PR does / why we need it:
This PR introduces a global network bypass patch for the Sedna client inside benchmarking.py and core/cmd/benchmarking.py.

Why is it needed?

When running Ianvs benchmarks locally, there is no active KubeEdge Cloud controller or Knowledge Base server. Despite this, the Sedna backend client attempts REST calls to remote/null endpoints. Because it is decorated with tenacity retry logic, it retries failed attempts up to 5 times with 3-second waits (15 seconds total per call).

Across multiple incremental seen/unseen training and evaluation rounds, these network timeouts block execution and add up to 15–20 minutes of idle time per benchmarking job.
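
As a rough sanity check of these figures, the sketch below takes the 15 seconds per failed call quoted above at face value and works out how many failing calls would account for 15 to 20 minutes of idle time. The names are illustrative only, and the real per-call cost may differ depending on how the retry policy counts its waits.

```python
# Figure quoted in the PR description: worst case per failed call
# (5 retries x 3-second waits, as stated there).
SECONDS_PER_FAILED_CALL = 15


def failed_calls_for(idle_minutes: float) -> float:
    """Number of failing calls that would add up to the given idle time."""
    return idle_minutes * 60 / SECONDS_PER_FAILED_CALL


low = failed_calls_for(15)
high = failed_calls_for(20)
print(f"{low:.0f} to {high:.0f} failed calls per job")
```

So the reported delay is consistent with a few dozen failing REST calls spread across the training and evaluation rounds.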

What does this patch do?

  1. Intercepts the module-level sedna.service.client.http_request function at the Ianvs entry points.
  2. Checks whether the destination endpoint is null or local (None, 127.0.0.1, localhost).
  3. If the endpoint is local or null and the toggle is enabled, it immediately raises ConnectionError("Connection refused."), skipping the 15-second retry loop. Sedna's internal try/except blocks then run and fall back to local offline mechanisms without any delay.
  4. Provides a toggle BYPASS_TIMEOUTS (default: "1" / enabled). If set to "0", the bypass is disabled.
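
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `fake_client` stands in for `sedna.service.client` so the example runs without Sedna installed, the helper names `is_local_or_null` and `make_bypass` are hypothetical, and the assumption that `http_request` takes the URL as its first argument may not match the real signature.

```python
import os
from types import SimpleNamespace
from urllib.parse import urlparse

LOCAL_HOSTS = {"127.0.0.1", "localhost"}


def is_local_or_null(url):
    """True when the endpoint is missing or points at the local machine."""
    if not url or str(url).strip().lower() in {"none", "null"}:
        return True
    text = str(url)
    # urlparse only fills in hostname when a "//" separator is present.
    parsed = urlparse(text if "//" in text else "//" + text)
    return parsed.hostname in LOCAL_HOSTS


def make_bypass(original, enabled):
    """Wrap `original` so local/null endpoints fail fast when enabled."""
    def patched(url=None, *args, **kwargs):
        if enabled and is_local_or_null(url):
            # Fail immediately instead of waiting for the retry loop.
            raise ConnectionError("Connection refused.")
        return original(url, *args, **kwargs)
    return patched


# Stand-in for sedna.service.client (hypothetical; real signature may differ).
fake_client = SimpleNamespace(
    http_request=lambda url=None, **kwargs: f"sent to {url}"
)

# Step 4: the toggle defaults to enabled; "0" disables the bypass.
enabled = os.environ.get("BYPASS_TIMEOUTS", "1") != "0"
fake_client.http_request = make_bypass(fake_client.http_request, enabled)
```

With the bypass enabled, `fake_client.http_request("http://localhost:9020/x")` raises `ConnectionError` at once, while a request to a remote host is passed through to the original function unchanged.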

How to test it?

To run benchmarks normally with the bypass enabled:

ianvs -f examples/robot/lifelong_learning_bench/semantic-segmentation/benchmarkingjob-simple.yaml

To run benchmarks with the bypass disabled (original timeout behavior):

BYPASS_TIMEOUTS=0 ianvs -f examples/robot/lifelong_learning_bench/semantic-segmentation/benchmarkingjob-simple.yaml

Fixes #494

Signed-off-by: karyxx <pulkit.kr1924@gmail.com>
@kubeedge-bot kubeedge-bot added kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature. labels Jun 15, 2026
@kubeedge-bot

Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: karyxx
To complete the pull request process, please assign moorezheng after the PR has been reviewed.
You can assign the PR to them by writing /assign @moorezheng in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubeedge-bot

Collaborator

Welcome @karyxx! It looks like this is your first PR to kubeedge/ianvs 🎉

@kubeedge-bot kubeedge-bot requested review from Poorunga and hsj576 June 15, 2026 17:39
@kubeedge-bot kubeedge-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jun 15, 2026

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a mechanism to bypass local or invalid HTTP endpoints to speed up offline runs by patching Sedna's HTTP request client. However, this patching logic is duplicated across benchmarking.py and core/cmd/benchmarking.py. Feedback recommends extracting this logic into a shared helper function in core/common/utils.py to adhere to DRY principles. Additionally, the current implementation logs that the bypass is active even when it is disabled, and it incurs unnecessary performance overhead by querying the environment variable on every single HTTP request.
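
One way the review feedback could be addressed is sketched below: a single shared helper that reads the toggle once at install time, logs only when the bypass is actually active, and is safe to call from both entry points. The helper name `install_bypass`, the logger name, and the `_bypass_installed` marker are assumptions made for illustration; the actual helper added to core/common/utils.py is not shown in this conversation and may look different.

```python
import logging
import os
from types import SimpleNamespace

LOGGER = logging.getLogger("bypass_sketch")
# Crude substring match, kept short for illustration only.
LOCAL_MARKERS = ("127.0.0.1", "localhost")


def install_bypass(client, environ=os.environ):
    """Patch client.http_request once; return True if the bypass is active."""
    # Read the toggle a single time here, not on every HTTP request.
    if environ.get("BYPASS_TIMEOUTS", "1") == "0":
        return False  # disabled: no patch and no "active" log line
    if getattr(client.http_request, "_bypass_installed", False):
        return True  # already patched by the other entry point
    original = client.http_request

    def patched(url=None, *args, **kwargs):
        if not url or any(marker in str(url) for marker in LOCAL_MARKERS):
            raise ConnectionError("Connection refused.")
        return original(url, *args, **kwargs)

    patched._bypass_installed = True
    client.http_request = patched
    LOGGER.info("Sedna network bypass active for local/null endpoints")
    return True


# Usage with a stand-in client object (hypothetical, not the real Sedna module).
demo_client = SimpleNamespace(http_request=lambda url=None, **kwargs: "ok")
install_bypass(demo_client, {"BYPASS_TIMEOUTS": "1"})
```

Both benchmarking.py and core/cmd/benchmarking.py could then call the helper once at startup, which keeps the patching logic in one place.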


Comment thread benchmarking.py Outdated
Comment thread core/cmd/benchmarking.py Outdated
Signed-off-by: karyxx <pulkit.kr1924@gmail.com>
@kubeedge-bot kubeedge-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 15, 2026
@karyxx

karyxx commented Jun 15, 2026

Author

I have moved the logic into core/common/utils.py and resolved both the performance overhead and the misleading log message.



Development

Successfully merging this pull request may close these issues.

[Bug] Local benchmarks experience massive idle delays due to unmocked Sedna HTTP retries
