test: upgrade integration harnesses for dual-server execution#706

Open
anubhav756 wants to merge 1 commit into mcp-v202606 from anubhav-test-infra


@anubhav756 anubhav756 commented Jul 1, 2026

Contributor

Description

This PR upgrades the integration test harnesses across the SDK (core, adk, langchain, and llamaindex) to support running integration tests against multiple Toolbox server configurations simultaneously.

With the toolbox_server_url fixture parameterized and the Cloud Build configurations updated, CI now spins up two server instances and runs the integration tests against both:

  1. Stable Server (Port 5000): Runs the stable server configuration.
  2. Draft-Enabled Server (Port 5001): Runs with the draft specs enabled.
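
For reviewers, a minimal sketch of what a parameterized fixture along these lines could look like. The ports are the ones listed above; the localhost host, the http scheme, the ids, and the sample test are assumptions for illustration and are not copied from the diff.

```python
import pytest

# Ports come from the list above; host and scheme are assumptions.
STABLE_URL = "http://localhost:5000"
DRAFT_URL = "http://localhost:5001"
SERVER_URLS = [STABLE_URL, DRAFT_URL]


@pytest.fixture(params=SERVER_URLS, ids=["stable", "draft"])
def toolbox_server_url(request):
    """Return one server URL per parametrized run."""
    return request.param


def test_server_url_is_local(toolbox_server_url):
    # Hypothetical test: pytest collects it twice, once per server URL.
    assert toolbox_server_url.startswith("http://localhost:")
```

Any test that requests toolbox_server_url, directly or through another fixture, is collected once per entry in SERVER_URLS, which is where the doubled test matrix comes from.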

Motivation

This is a pure testing infrastructure change. It lays the groundwork for the upcoming DRAFT-2026-v1 protocol support, ensuring we can test backward compatibility (graceful protocol downgrading) against older servers without breaking any existing test cases.

Note

The presubmits fail in this PR because the test suite now runs against both the stable (5000) and draft (5001) servers. The failures occur once the draft spec is successfully negotiated. This is fixed in the #703 branch (which is stacked on top of this PR).

Note

This PR focuses only on parameterizing the tests. The fallback logic is still broken at this stage and caused crashes against the stable server, so we temporarily added the if "DRAFT" not in v filter there. The upcoming PR #703 removes that workaround.
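
As a rough illustration of the temporary filter mentioned above, here is a minimal sketch. Only the "DRAFT" substring check and the DRAFT-2026-v1 name come from this PR; the function name, the list name, and the other version strings are placeholders, not the SDK's actual values.

```python
# Placeholder version list; only "DRAFT-2026-v1" is named in this PR.
ALL_VERSIONS = ["DRAFT-2026-v1", "2025-06-18", "2025-03-26"]


def stable_versions(versions):
    """Drop draft protocol versions, mirroring the temporary filter."""
    return [v for v in versions if "DRAFT" not in v]


print(stable_versions(ALL_VERSIONS))
```

The filter keeps the relative order of the remaining versions, so whatever preference order the original list encodes is preserved.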

@anubhav756 anubhav756 requested a review from a team as a code owner July 1, 2026 14:59
Base automatically changed from anubhav-sep-2243 to mcp-v202606 July 2, 2026 05:07
try:
    print("Opening toolbox server process...")
    # Make toolbox executable
    os.chmod("toolbox", 0o700)

Contributor

Does this work on Windows? Since we're adding Windows support to the get_toolbox_binary_url method, should these commands be fixed everywhere?

For now I am also okay with removing the Windows support, since our CI runs on Linux.
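
If Windows support stays, one possible shape for the chmod step is sketched below. As far as the Python docs describe it, os.chmod on Windows only changes the read-only flag and ignores the other mode bits, so the call would not mark the binary executable there. The helper name make_executable is hypothetical and not part of this PR.

```python
import os
import stat
import sys


def make_executable(path):
    """Set owner rwx on POSIX; do nothing on Windows.

    Returns True when the mode was changed, False when skipped.
    """
    if sys.platform == "win32":
        # os.chmod on Windows only toggles the read-only flag, so there
        # is no executable bit to set here.
        return False
    os.chmod(path, stat.S_IRWXU)  # equivalent to 0o700
    return True
```

A call such as make_executable("toolbox") would then replace the bare os.chmod call in each harness.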

Contributor

Would that still count as just a testing infra change? Should it be moved to another PR?

return request.param


@pytest.fixture(autouse=True)

@twishabansal twishabansal Jul 2, 2026

Contributor

nit: autouse pulls in the parametrized toolbox_server_url, so the ×2 matrix hits every test under the root tests/conftest.py, including mocked unit tests. adk avoids this by scoping the fixture to tests/integration/conftest.py. Can we limit the matrix to the e2e tests (non-autouse, or a marker) instead of session-wide autouse?

This applies to core, langchain, and llamaindex.
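
One way this could look, as a sketch only: a pytest_generate_tests hook in conftest.py that parametrizes just the tests requesting toolbox_server_url. It assumes no autouse fixture depends on toolbox_server_url any more, since otherwise the name would appear in every test's fixturenames. The URLs and ids are illustrative.

```python
# Intended for a conftest.py; host and scheme are assumptions.
SERVER_URLS = ["http://localhost:5000", "http://localhost:5001"]


def pytest_generate_tests(metafunc):
    """Parametrize only tests that request toolbox_server_url."""
    if "toolbox_server_url" in metafunc.fixturenames:
        metafunc.parametrize(
            "toolbox_server_url", SERVER_URLS, ids=["stable", "draft"]
        )
```

Mocked unit tests that never request the fixture are then collected once, while e2e tests still run against both servers.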

"pytest-aioresponses==0.3.0",
"pytest-asyncio==1.4.0",
"pytest-cov==7.1.0",
"numpy<2.2.0",

Contributor

Could you add a comment explaining why numpy<2.2.0 is needed? Unexplained upper bounds tend to rot: nobody knows when it's safe to lift them. And since caps propagate, the moment another test dep requires numpy>=2.2 the resolver can't satisfy both and silently backtracks to old versions. A one-line note would keep this from becoming a mystery pin later.

@anubhav756 anubhav756 force-pushed the anubhav-test-infra branch from 994fbdf to f81589a Compare July 2, 2026 06:15


3 participants