fix(tests): repair default test failures by smithfabian · Pull Request #942 · jundot/omlx

smithfabian · 2026-04-24T11:14:36Z

Fixes a few test failures found in a fresh fork + clone setup. The fixes make the current default test suite behave consistently for new contributors.

The failing tests were:

tests/test_accuracy_benchmark.py::TestAccuracyBenchmarkRequest::test_all_valid_benchmarks
tests/test_admin_api_key.py::TestListModelsSettings::test_list_models_includes_all_model_settings_fields
tests/test_admin_profiles_api.py::test_all_model_settings_fields_classified

Plus two additional failures found while validating the test suite:

tests/integration/test_e2e_streaming.py
tests/test_boundary_snapshot_store.py::TestBoundarySnapshotSSDStore::test_cleanup_all_drains_queue

This also fixes:

A fresh dev install dependency failure where tests using FastAPI with form/file handling can fail during app import.

Changes

Derive the accuracy benchmark expected count from len(VALID_BENCHMARKS) instead of a stale hard-coded integer. Fixes failing test 1.
Include preserve_thinking and turboquant_skip_last in the admin model settings response. Fixes failing test 2.
Classify preserve_thinking as a universal profile field in UNIVERSAL_PROFILE_FIELDS. Fixes failing test 3.
Add MockEnginePool.get_entry() for mocked streaming tests. Fixes validation gap 4.
Fix a BoundarySnapshotSSDStore.cleanup_all() race where cleanup could miss an in-flight background write. Fixes validation gap 5.
Add the missing python-multipart dev dependency needed by FastAPI form/file routes in the test suite. Fixes failure 6, which appears after a fresh pip install -e ".[dev]".
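
For reviewers unfamiliar with the cleanup race, here is a minimal standalone sketch of the general pattern, not the actual BoundarySnapshotSSDStore code. The class name SnapshotWriter, its methods, and the simulated write delay are all hypothetical. It assumes a single background thread consuming a queue.Queue: checking only whether the queue is empty can miss an item the worker has already dequeued but not finished writing, whereas calling task_done() in a finally block and waiting with join() covers in-flight work too.

```python
import queue
import threading
import time


class SnapshotWriter:
    """Toy stand-in for a store with a background write thread (hypothetical)."""

    def __init__(self):
        self._queue = queue.Queue()
        self.written = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            item = self._queue.get()
            try:
                if item is None:  # sentinel: stop the worker
                    return
                time.sleep(0.01)  # simulate a slow SSD write
                self.written.append(item)
            finally:
                # Mark every fetched item done, even on error or on the
                # sentinel, so that join() can never hang.
                self._queue.task_done()

    def submit(self, item):
        self._queue.put(item)

    def cleanup_all(self):
        # join() returns only after every queued item has been fetched AND
        # marked done, so a write that is already in flight is waited for.
        self._queue.join()
        self._queue.put(None)
        self._worker.join()


writer = SnapshotWriter()
for i in range(5):
    writer.submit(i)
writer.cleanup_all()
print(writer.written)
```

With a single worker the items are written in submission order, so all five writes are visible once cleanup_all() returns.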

Test validation

pytest tests/test_accuracy_benchmark.py::TestAccuracyBenchmarkRequest::test_all_valid_benchmarks \
  tests/test_admin_api_key.py::TestListModelsSettings::test_list_models_includes_all_model_settings_fields \
  tests/test_admin_profiles_api.py::test_all_model_settings_fields_classified
# result: 3 passed

pytest tests/integration/test_e2e_streaming.py --override-ini=addopts=
# result: 37 passed

pytest tests/test_boundary_snapshot_store.py::TestBoundarySnapshotSSDStore::test_cleanup_all_drains_queue
# result: 1 passed

pytest
# result: 3837 passed, 18 skipped, 54 deselected

Copilot

Pull request overview

This PR fixes several “fresh clone” test failures and makes the default test suite behave consistently for new contributors. It aligns tests with current constants, completes the admin model-settings response, strengthens the mocked streaming infrastructure, addresses a boundary snapshot cleanup race, and ensures the multipart support that FastAPI requires is present in dev installs.

Changes:

Make accuracy benchmark test expectations derive from VALID_BENCHMARKS instead of a hard-coded count.
Expose missing model settings fields in the admin “list models” response and classify preserve_thinking as a universal profile field.
Fix streaming test mocking gaps, harden BoundarySnapshotSSDStore.cleanup_all() against in-flight writes, and add python-multipart to dev dependencies.
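
The first change above can be illustrated with a small sketch of a test that derives its expectation from the constant rather than from a literal. The benchmark names and the build_request helper below are placeholders, not the project's real values or API; only the name VALID_BENCHMARKS comes from the PR.

```python
# Placeholder contents; the real VALID_BENCHMARKS lives in the project.
VALID_BENCHMARKS = ("alpha", "beta", "gamma")


def build_request(benchmarks):
    """Hypothetical validator: reject names not listed in VALID_BENCHMARKS."""
    unknown = [name for name in benchmarks if name not in VALID_BENCHMARKS]
    if unknown:
        raise ValueError(f"unknown benchmarks: {unknown}")
    return {"benchmarks": list(benchmarks)}


def test_all_valid_benchmarks():
    request = build_request(VALID_BENCHMARKS)
    # Derived expectation: stays correct when a benchmark is added or removed.
    assert len(request["benchmarks"]) == len(VALID_BENCHMARKS)


test_all_valid_benchmarks()
```

A hard-coded count would need a manual update every time the constant changes, which is presumably how the original assertion went stale.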

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
tests/test_accuracy_benchmark.py	Removes stale hard-coded benchmark count by asserting against `len(VALID_BENCHMARKS)`.
tests/integration/test_e2e_streaming.py	Extends the mock engine pool with `get_entry()` to satisfy server code paths during streaming tests.
pyproject.toml	Adds `python-multipart` to dev dependency sets to prevent FastAPI import-time failures for form/file routes in tests.
omlx/model_profiles.py	Adds `preserve_thinking` to `UNIVERSAL_PROFILE_FIELDS` for correct profile/template field classification.
omlx/cache/boundary_snapshot_store.py	Ensures queue tasks are marked done and uses `join()` to avoid cleanup races with in-flight background writes.
omlx/admin/routes.py	Includes `preserve_thinking` and `turboquant_skip_last` in the admin model settings response payload.
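
The classification fix in omlx/model_profiles.py guards against a settings field being added without being assigned to a profile category. A rough sketch of that kind of check follows. The ModelSettings dataclass, its temperature field, and the MODEL_SPECIFIC_FIELDS set are assumptions made for illustration; only UNIVERSAL_PROFILE_FIELDS, preserve_thinking, and turboquant_skip_last are names taken from the PR, and which category turboquant_skip_last really belongs to is not stated there.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelSettings:
    """Hypothetical settings object; the real one has more fields."""
    temperature: float = 1.0
    preserve_thinking: bool = False
    turboquant_skip_last: bool = False


UNIVERSAL_PROFILE_FIELDS = {"temperature", "preserve_thinking"}
MODEL_SPECIFIC_FIELDS = {"turboquant_skip_last"}  # assumed second category


def unclassified_fields():
    """Return settings fields that belong to no category, sorted by name."""
    classified = UNIVERSAL_PROFILE_FIELDS | MODEL_SPECIFIC_FIELDS
    return sorted(f.name for f in fields(ModelSettings) if f.name not in classified)


def test_all_model_settings_fields_classified():
    assert unclassified_fields() == []


test_all_model_settings_fields_classified()
```

Removing preserve_thinking from UNIVERSAL_PROFILE_FIELDS in this sketch makes the test fail, which mirrors the failure this PR repairs.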


smithfabian added 2 commits April 24, 2026 11:26

fix(tests): repair default test failures

b08f5ed

test(streaming): use explicit engine entry mock

15c6925

Copilot AI review requested due to automatic review settings April 24, 2026 11:14

Copilot started reviewing on behalf of smithfabian April 24, 2026 11:15

smithfabian changed the title ~~Fix/default test failures~~ fix(tests): repair default test failures Apr 24, 2026

Copilot AI reviewed Apr 24, 2026


Keenni mentioned this pull request Apr 24, 2026

[Bug] KeyError in scheduler._schedule_waiting — all models hang with tool calling (v0.3.7/v0.3.8dev1) #944

Open

jundot force-pushed the main branch from 7844f15 to b078330 April 28, 2026 02:11

smithfabian wants to merge 2 commits into jundot:main from smithfabian:fix/default-test-failures
