Refs #406: Reject control chars in public search queries #572

Open
xingxi0614-cpu wants to merge 3 commits into ramimbo:main from xingxi0614-cpu:codex/b406-bounty-search-control-query

Conversation

@xingxi0614-cpu

@xingxi0614-cpu xingxi0614-cpu commented May 28, 2026

Summary

  • reject control characters in public q search parameters before they are reflected or used for filtering
  • apply the validation to bounty list/summary search, accepted-work activity search, and MCP list_bounties
  • add regression coverage for %00 so control-character searches fail closed with bounded 400 responses on API/HTML routes and invalid-argument JSON-RPC errors on MCP

Evidence

Live public preflight before this fix showed control-character searches were accepted on public search endpoints:

  • GET https://api.mrwk.ltclab.site/api/v1/bounties?q=%00 -> HTTP 200 and returned current bounty rows
  • GET https://api.mrwk.ltclab.site/api/v1/bounties/summary?q=%00 -> HTTP 200 with bounties_shown=77
  • GET https://api.mrwk.ltclab.site/api/v1/activity?q=%00 -> HTTP 200 with query:\u0000

That is misleading because non-empty search parameters should not silently widen to all bounty rows or reflect control characters in public JSON. The fix now rejects C0/DEL control characters with bounded 400 responses.
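
As a rough illustration of the behavior described above (not code from this PR), the sketch below shows how a percent-encoded q value decodes to a raw control character and how it would then be echoed back in a JSON body. The names has_control_chars and reflect_query are hypothetical; the rule applied (code points below 32, or 127) is the C0/DEL rule stated for the fix.

```python
import json
from urllib.parse import unquote


def has_control_chars(text: str) -> bool:
    """Hypothetical helper: True if text contains a C0 control character or DEL."""
    return any(ord(ch) < 32 or ord(ch) == 127 for ch in text)


def reflect_query(encoded_q: str) -> str:
    """Decode a percent-encoded q value and echo it back in a JSON body.

    This mimics the pre-fix reflection only; it is not the project's handler.
    """
    return json.dumps({"query": unquote(encoded_q)})


if __name__ == "__main__":
    for encoded in ("%00", "%09", "%7F", "docs"):
        decoded = unquote(encoded)
        print(encoded, has_control_chars(decoded), reflect_query(encoded))
```

For %00 the reflected body contains the escaped form \u0000, which matches the activity response shown in the evidence list.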

Validation

  • uv run --extra dev python -m pytest tests/test_activity.py tests/test_bounty_api_routes.py tests/test_bounty_api.py tests/test_bounty_pages.py tests/test_api_mcp.py::test_mcp_list_bounties_filters_status_query_and_limit tests/test_api_mcp.py::test_mcp_list_bounties_rejects_invalid_filters -q -> 40 passed
  • uv run --extra dev python -m pytest -q -> 429 passed
  • uv run --extra dev ruff check app/activity.py app/bounty_api.py app/mcp_tools.py tests/test_activity.py tests/test_api_mcp.py tests/test_bounty_api_routes.py tests/test_bounty_pages.py -> passed
  • uv run --extra dev ruff format --check app/activity.py app/bounty_api.py app/mcp_tools.py tests/test_activity.py tests/test_api_mcp.py tests/test_bounty_api_routes.py tests/test_bounty_pages.py -> 7 files already formatted
  • uv run --extra dev python -m mypy app -> success
  • uv run --extra dev python scripts/docs_smoke.py -> docs smoke ok
  • git diff --check HEAD~1..HEAD -> clean

Safety

No private data, cookies, tokens, wallet material, signatures, production mutation, price claims, liquidity claims, exchange claims, bridge promises, off-ramp promises, or private security details are included.

Summary by CodeRabbit

  • Bug Fixes
    • Search inputs are now consistently validated and normalized across activity, bounty, and tool-backed listing flows: control characters are rejected with HTTP 400, queries are trimmed, and empty queries are treated as no filter.
  • Tests
    • Added tests verifying rejection of control-character search queries for activity endpoints, bounty APIs/pages, and the MCP list-bounties tool.

Review Change Stack

@coderabbitai

coderabbitai Bot commented May 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: fd9bc8ac-e927-48d8-a643-1ff1c926d22b

📥 Commits

Reviewing files that changed from the base of the PR and between 50d4daa and 1165d5f.

📒 Files selected for processing (1)
  • tests/test_bounty_api_routes.py

📝 Walkthrough

Three search endpoints now validate and normalize the q query parameter by stripping whitespace and rejecting control characters (code points below 32, or 127). The activity, bounty API, and MCP tool routes each define a validation helper and return HTTP 400 or a tool error when control characters are detected.

Changes

Control Character Validation Across Search Endpoints

| Layer / File(s) | Summary |
| --- | --- |
| Activity endpoint query validation: app/activity.py, tests/test_activity.py | New _normalized_activity_search_query helper validates the q parameter and rejects control characters by raising HTTPException(400). FastAPI imports expanded. Test verifies both /api/v1/activity and /activity page return 400 with error detail "q must not contain control characters". |
| Bounty API query validation: app/bounty_api.py, tests/test_bounty_api_routes.py, tests/test_bounty_pages.py | New _normalized_bounty_search_query helper normalizes whitespace and rejects control characters for both list and summary endpoints. Refactored search logic uses normalized query before building escaped LIKE patterns and optional issue_number predicate. Tests verify API and page endpoints return 400 for control-character q. |
| MCP tool query validation: app/mcp_tools.py, tests/test_api_mcp.py | New reject_control_chars helper in call_mcp_tool validates string arguments. list_bounties tool applies validation to optional q parameter before escaping and filtering. Parametrized test cases added for control-character inputs in invalid-filters scenarios. |
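
To make the MCP part of the summary concrete, here is a minimal sketch of how a tool call could turn a control-character q into a JSON-RPC error. It is not the code in app/mcp_tools.py: the helper body, the list_bounties_call dispatcher, the response shape, and the use of error code -32602 (the "Invalid params" code in JSON-RPC 2.0) are all assumptions for illustration.

```python
from typing import Any, Dict

# -32602 is "Invalid params" in JSON-RPC 2.0; whether the project uses it is an assumption.
INVALID_PARAMS = -32602


def reject_control_chars(name: str, value: str) -> None:
    """Raise ValueError if value contains a C0 control character or DEL (sketch only)."""
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in value):
        raise ValueError(f"{name} must not contain control characters")


def list_bounties_call(request_id: Any, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical dispatcher for a list_bounties tool call."""
    raw_q = arguments.get("q")
    try:
        if raw_q is not None:
            if not isinstance(raw_q, str):
                raise ValueError("q must be a string")
            # Check the raw value, before any trimming.
            reject_control_chars("q", raw_q)
    except ValueError as exc:
        return {
            "jsonrpc": "2.0",
            "id": request_id,
            "error": {"code": INVALID_PARAMS, "message": str(exc)},
        }
    query = raw_q.strip() if raw_q else None
    # A real tool would query the database here; this sketch only echoes the filter.
    return {"jsonrpc": "2.0", "id": request_id, "result": {"q": query or None}}
```

Calling list_bounties_call(1, {"q": "\tDocs"}) returns an error object, while a clean value such as " Docs " is trimmed and passed through.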

Possibly related PRs

  • ramimbo/mergework#286: Overlaps at the list_bounties MCP tool's q argument processing; prior PR added q filtering while this PR adds control-character validation to the same parameter.
  • ramimbo/mergework#486: Related changes to /activity search handling; this PR adds validation/normalization while the other adjusts downstream activity matching for #<number> issue references.
🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Refs #406: Reject control chars in public search queries' is concrete, names the changed surface (control character rejection in search), and directly describes the main change across multiple files. |
| Description check | ✅ Passed | The description covers all key template sections: summary of changes, evidence of the issue with examples, validation results from multiple test suites and linting tools, and safety confirmation. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Mergework Public Artifact Hygiene | ✅ Passed | PR contains no investment claims, price claims, cash-out claims, fabricated payout claims, or private security details. Changes focus solely on control-character input validation for search queries. |
| Bounty Pr Focus | ✅ Passed | All stated changes verified in activity.py, bounty_api.py, and mcp_tools.py. Validation added to three search endpoints with comprehensive test coverage for control character rejection. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

@eliasx45 eliasx45 left a comment

Reviewed current head 82b6fabe330944c01ab99419cb3b48395b82c4bb as a non-author.

No blocker found. The patch is scoped to rejecting C0/DEL control characters in public search inputs before they are reflected or used for filtering, across bounty list/summary, activity API/page, and MCP list_bounties. The existing empty/whitespace search behavior is preserved, while embedded control characters now fail closed with HTTP 400 or the MCP invalid-argument path.

Validation performed:

  • git fetch origin main; PR diff confirmed focused to 7 files
  • git diff --check origin/main...HEAD clean
  • Focused regression set: 40 passed
  • python -m ruff check app/activity.py app/bounty_api.py app/mcp_tools.py tests/test_activity.py tests/test_api_mcp.py tests/test_bounty_api_routes.py tests/test_bounty_pages.py
  • python -m ruff format --check app/activity.py app/bounty_api.py app/mcp_tools.py tests/test_activity.py tests/test_api_mcp.py tests/test_bounty_api_routes.py tests/test_bounty_pages.py
  • python -m mypy app -> success
  • python scripts/docs_smoke.py -> docs smoke ok
  • Full suite: python -m pytest -q -> 429 passed
  • Hosted Quality/readiness/docs/image checks are passing

No private data, credentials, wallet material, signatures, production mutation, MRWK price/off-ramp, exchange/liquidity claims, bridge claims, or fabricated payout claims were used.

@eliasx45 eliasx45 left a comment

I found one blocker on current head 82b6fabe330944c01ab99419cb3b48395b82c4bb.

The new control-character validation runs after strip() on each search path, so leading/trailing whitespace control characters are silently removed instead of rejected. That means q=%09 (tab) is accepted as an empty search on the public/API/MCP paths even though the new contract says q must not contain control characters.

Affected paths I reproduced locally:

  • GET /api/v1/bounties?q=%09 -> 200 with the normal bounty list
  • GET /api/v1/bounties/summary?q=%09 -> 200 with summary data
  • GET /bounties?q=%09 -> 200 page render
  • GET /api/v1/activity?q=%09 -> 200 with query: ""
  • GET /activity?q=%09 -> 200 page render
  • MCP list_bounties with { "q": "\t" } -> returns normal bounty JSON

Why it happens:

  • app/bounty_api.py::_normalized_bounty_search_query() calls query_text.strip() before checking for control characters.
  • app/activity.py::_normalized_activity_search_query() does the same.
  • app/mcp_tools.py gets q through optional_clean_str_arg(), which strips before reject_control_chars("q", query_text) is called.

Expected fix: validate the raw query string for control characters before trimming, then apply the existing whitespace normalization. Please add regression coverage for at least a leading/trailing whitespace control character such as q=%09 in REST/page/MCP paths, not only embedded/NUL input.
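
The ordering problem can be sketched as follows. These two functions are illustrative stand-ins, not the helpers in app/bounty_api.py or app/activity.py; their names, the ValueError (the real code is described as raising an HTTP 400), and the "empty means no filter" return value of None are assumptions.

```python
from typing import Optional


def _has_control_chars(text: str) -> bool:
    """True if text contains a C0 control character or DEL."""
    return any(ord(ch) < 32 or ord(ch) == 127 for ch in text)


def normalize_strip_first(raw: Optional[str]) -> Optional[str]:
    """Order described in this review: strip() runs before the check.

    str.strip() removes leading/trailing tab and newline, so they never
    reach the control-character check.
    """
    if raw is None:
        return None
    cleaned = raw.strip()
    if _has_control_chars(cleaned):
        raise ValueError("q must not contain control characters")
    return cleaned or None


def normalize_validate_first(raw: Optional[str]) -> Optional[str]:
    """Requested order: check the raw value first, then trim."""
    if raw is None:
        return None
    if _has_control_chars(raw):
        raise ValueError("q must not contain control characters")
    return raw.strip() or None
```

With these definitions, normalize_strip_first("\t") quietly returns None (treated as no filter), whereas normalize_validate_first("\t") raises. Both versions still reject an embedded NUL such as "a\x00b", which is why the original %00 regression test did not catch the gap.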

Validation I ran:

  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 .\.venv\Scripts\python.exe -m pytest tests\test_activity.py::test_activity_rejects_control_character_search_query tests\test_bounty_api_routes.py::test_bounty_api_rejects_control_character_search_queries tests\test_bounty_pages.py::test_bounties_page_and_api_search_by_text_and_issue_number tests\test_api_mcp.py::test_mcp_list_bounties_rejects_invalid_filters -q -> 10 passed
  • ad hoc TestClient/MCP reproduction above -> tab-only queries accepted on all five HTTP paths plus MCP
  • .\.venv\Scripts\python.exe -m ruff check app\activity.py app\bounty_api.py app\mcp_tools.py tests\test_activity.py tests\test_api_mcp.py tests\test_bounty_api_routes.py tests\test_bounty_pages.py -> passed
  • .\.venv\Scripts\python.exe -m ruff format --check app\activity.py app\bounty_api.py app\mcp_tools.py tests\test_activity.py tests\test_api_mcp.py tests\test_bounty_api_routes.py tests\test_bounty_pages.py -> 7 files already formatted
  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 .\.venv\Scripts\python.exe -m mypy app\activity.py app\bounty_api.py app\mcp_tools.py -> success
  • git diff --check origin/main...HEAD -> clean

@ramimbo
Owner

ramimbo commented May 28, 2026

Holding with mrwk:needs-info. The accepted review found tab/control-character queries are accepted after strip() instead of being rejected. Please validate the raw q value before trimming and add regression coverage for leading/trailing control characters.

@barnacleagent-svg barnacleagent-svg left a comment

Verdict: APPROVED

Scope: Adds control character validation to activity search query. Uses _normalized_activity_search_query helper to strip/validate before passing to activity_context.

Checklist:

  • Diff: focused validation, consistent with existing CONTROL_CHAR patterns
  • Tests pass
  • Follows same pattern as prior PRs (#609, #608, etc.)

Conclusion: Clean input validation. Ready to merge.

@xingxi0614-cpu
Author

Thanks, good catch. I pushed a follow-up that validates the raw q value before trimming on the public search paths, so leading/trailing control characters such as tab/newline are rejected instead of being stripped away.

Follow-up commit: 7893c23

Updated coverage:

  • activity API/page rejects raw q values like \talice and alice\n
  • bounty API/summary/page rejects raw q values like \tControl, Control\n, and \tDocs
  • MCP list_bounties rejects raw q values like \tDocs and Docs\n
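
For reference, the raw values in this list correspond to the following on-the-wire query strings. This is a small standard-library sketch, not part of the test suite; encode_q and decode_q are hypothetical names, and the point is only that the decoded value reaching the validator still carries the leading or trailing control character.

```python
from urllib.parse import parse_qs, urlencode

# Raw q values taken from the coverage list above.
RAW_CASES = ["\talice", "alice\n", "\tControl", "Control\n", "\tDocs", "Docs\n"]


def encode_q(raw: str) -> str:
    """Build the query string a test client would send for a raw q value."""
    return urlencode({"q": raw})


def decode_q(query_string: str) -> str:
    """Recover the raw q value that a server-side validator would receive."""
    return parse_qs(query_string, keep_blank_values=True)["q"][0]


if __name__ == "__main__":
    for raw in RAW_CASES:
        wire = encode_q(raw)
        # Round trip: the control character survives encoding and decoding.
        assert decode_q(wire) == raw
        print(repr(raw), "->", wire)
```

A leading tab is sent as %09 and a trailing newline as %0A, so a request such as q=%09alice exercises the raw-value check.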

Validation:

  • focused raw-control regressions: 12 passed
  • activity + bounty API/page + MCP list-bounties slice: 42 passed
  • full pytest: 431 passed
  • ruff check/format-check: passed
  • mypy on touched app modules: success
  • docs smoke: ok
  • git diff --check: clean

No private data, secrets, wallet material, production mutation, price/off-ramp, liquidity, exchange, bridge-promise, or private security details are included.

@eliasx45 eliasx45 left a comment

Re-reviewed follow-up head 7893c231f08093e460f2ef5402cd8e03fb5d555e.

Verdict: approve implementation, with one merge-readiness caveat.

The raw-control-character blocker from my previous review is addressed. The follow-up now validates raw q before trimming on the HTTP search paths and validates raw MCP string input before accepting list_bounties filters. The added/updated tests cover leading/trailing control characters such as tab/newline as requested.

Validation:

  • git diff --check origin/main...HEAD -> clean.
  • Focused regression slice: PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ..\mergework\.venv\Scripts\python.exe -m pytest tests\test_activity.py::test_activity_rejects_control_character_search_query tests\test_bounty_api_routes.py::test_bounty_api_rejects_control_character_search_queries tests\test_bounty_pages.py::test_bounties_page_and_api_search_by_text_and_issue_number tests\test_api_mcp.py::test_mcp_list_bounties_rejects_invalid_filters -q --tb=short -> 12 passed.
  • ..\mergework\.venv\Scripts\python.exe -m ruff check app\activity.py app\bounty_api.py app\mcp_tools.py tests\test_activity.py tests\test_api_mcp.py tests\test_bounty_api_routes.py tests\test_bounty_pages.py -> passed.
  • ..\mergework\.venv\Scripts\python.exe -m ruff format --check ... -> 7 files already formatted.
  • ..\mergework\.venv\Scripts\python.exe -m mypy app\activity.py app\bounty_api.py app\mcp_tools.py -> success.
  • GitHub checks currently shown green/skipped as non-blocking.

Merge-readiness caveat: this branch is currently reported as CONFLICTING, and local git merge-tree --write-tree origin/main HEAD reports a content conflict in app/bounty_api.py. Please resolve/rebase before maintainer merge. The code issue I previously raised is fixed.

@xingxi0614-cpu xingxi0614-cpu force-pushed the codex/b406-bounty-search-control-query branch from 7893c23 to 50d4daa Compare May 29, 2026 07:26
@xingxi0614-cpu
Author

Rebased this branch onto the latest main and resolved the app/bounty_api.py conflict.

Current head: 50d4daa

What I kept during the conflict resolution:

  • retained the latest upstream bounty/proposal error handling;
  • kept raw q validation before trimming, so leading/trailing control characters are still rejected;
  • preserved the existing normalized search behavior for clean queries.

Validation after rebase:

  • focused raw-control regression slice: 12 passed
  • ruff check on touched app/test files: passed
  • ruff format-check on touched app/test files: passed
  • mypy app/activity.py app/bounty_api.py app/mcp_tools.py: success
  • git diff --check origin/main...HEAD: clean
  • git merge-tree --write-tree origin/main HEAD: clean

GitHub checks have restarted on the rebased head. No private data, secrets, wallet material, production mutation, price/off-ramp, liquidity, exchange, bridge-promise, or private security details are included.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: ce877525-9535-47a7-9376-713805290094

📥 Commits

Reviewing files that changed from the base of the PR and between 7893c23 and 50d4daa.

📒 Files selected for processing (7)
  • app/activity.py
  • app/bounty_api.py
  • app/mcp_tools.py
  • tests/test_activity.py
  • tests/test_api_mcp.py
  • tests/test_bounty_api_routes.py
  • tests/test_bounty_pages.py

Comment thread tests/test_bounty_api_routes.py

@eliasx45 eliasx45 left a comment

Re-reviewed rebased head 50d4daab3b365083bafe8316c26fb4d59e504d31.

Verdict: approve.

The previous merge-conflict caveat is cleared after rebasing onto current main. I verified the raw-control-character fix still validates before trimming across the HTTP search paths and MCP list_bounties, and the focused regressions still cover leading/trailing control characters.

Validation on this checkout:

git fetch origin main
git merge-tree --write-tree origin/main refs/remotes/pr/572
# 8594b49eb2531c6a489b301734e51e24fcdde4c5

git diff --check origin/main...refs/remotes/pr/572
# clean

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ..\mergework\.venv\Scripts\python.exe -m pytest tests\test_activity.py::test_activity_rejects_control_character_search_query tests\test_bounty_api_routes.py::test_bounty_api_rejects_control_character_search_queries tests\test_bounty_pages.py::test_bounties_page_and_api_search_by_text_and_issue_number tests\test_api_mcp.py::test_mcp_list_bounties_rejects_invalid_filters -q --tb=short
# 12 passed

..\mergework\.venv\Scripts\python.exe -m ruff check app\activity.py app\bounty_api.py app\mcp_tools.py tests\test_activity.py tests\test_api_mcp.py tests\test_bounty_api_routes.py tests\test_bounty_pages.py
# passed

..\mergework\.venv\Scripts\python.exe -m ruff format --check app\activity.py app\bounty_api.py app\mcp_tools.py tests\test_activity.py tests\test_api_mcp.py tests\test_bounty_api_routes.py tests\test_bounty_pages.py
# 7 files already formatted

..\mergework\.venv\Scripts\python.exe -m mypy app\activity.py app\bounty_api.py app\mcp_tools.py
# success

GitHub checks are green on the rebased head. I do not see a remaining blocker.

@xingxi0614-cpu
Author

Addressed the CodeRabbit DEL-boundary test suggestion.

Follow-up commit: 1165d5f

What changed:

  • added \x7f/DEL regression coverage for both /api/v1/bounties?q=... and /api/v1/bounties/summary?q=...;
  • kept the change test-only, with no behavior or production-path changes.
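
A boundary check of this kind can be sketched in a few lines. This is not the test added in the commit; is_rejected and the expectation table are hypothetical and simply encode the C0/DEL rule as stated earlier in this PR (code points below 32, or 127). Under that rule U+0080 and the other C1 controls are accepted; whether the project intends that is not stated here.

```python
def is_rejected(ch: str) -> bool:
    """Apply the stated rule: reject C0 controls (below 0x20) and DEL (0x7F)."""
    code = ord(ch)
    return code < 0x20 or code == 0x7F


# Code points on either side of each boundary, mapped to the expected result.
BOUNDARY_EXPECTATIONS = {
    0x00: True,   # NUL, the original %00 case
    0x1F: True,   # last C0 control
    0x20: False,  # space, first printable ASCII
    0x7E: False,  # tilde, last printable ASCII
    0x7F: True,   # DEL
    0x80: False,  # first C1 control, outside the stated rule
}


if __name__ == "__main__":
    for code, expected in BOUNDARY_EXPECTATIONS.items():
        assert is_rejected(chr(code)) == expected
        print(hex(code), "rejected" if expected else "accepted")
```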

Validation:

  • focused bounty DEL/control-char regression: 1 passed
  • ruff check tests/test_bounty_api_routes.py: passed
  • ruff format --check tests/test_bounty_api_routes.py: passed
  • git diff --check: clean

No private data, secrets, wallet material, production mutation, price/off-ramp, liquidity, exchange, bridge-promise, or private security details are included.

@eliasx45 eliasx45 left a comment

Re-reviewed latest head 1165d5fc624e0711761ad00a7733f3308264e26c after the DEL-boundary follow-up.

Verdict: approve.

The new commit is test-only and adds \x7f coverage for both /api/v1/bounties?q=... and /api/v1/bounties/summary?q=..., which addresses CodeRabbit's remaining boundary-test suggestion without changing production code.

Validation on this checkout:

git diff --check origin/main...HEAD
# clean

git merge-tree --write-tree origin/main HEAD
# 49cb272575087106434379a0c848918abc8cea57

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ..\mergework\.venv\Scripts\python.exe -m pytest tests\test_bounty_api_routes.py::test_bounty_api_rejects_control_character_search_queries -q --tb=short
# 1 passed

..\mergework\.venv\Scripts\python.exe -m ruff check tests\test_bounty_api_routes.py
# All checks passed

..\mergework\.venv\Scripts\python.exe -m ruff format --check tests\test_bounty_api_routes.py
# 1 file already formatted

No blocker found.

@xingxi0614-cpu
Author

Thanks again for the review and maintainer cleanup note. I believe the earlier mrwk:needs-info blocker has been addressed now.

Current state:

  • raw q values are validated before trimming on the HTTP search paths and MCP list_bounties;
  • leading/trailing control characters such as tab/newline are covered by regression tests;
  • the DEL-boundary test suggestion was added in follow-up commit 1165d5f;
  • external re-reviews approved the rebased/latest head and reported no remaining blocker;
  • GitHub checks are green and merge state is clean.

Could you please re-check and clear mrwk:needs-info if this now satisfies the requested change?

@xingxi0614-cpu
Author

Quick follow-up: this PR still has the mrwk:needs-info label, but the raw-query validation blocker appears to be addressed at current head 1165d5f.

Current state:

  • raw q values are validated before trimming on the HTTP search paths and MCP list_bounties;
  • leading/trailing control characters and DEL are covered by regression tests;
  • the rebased/latest head has external approvals with no remaining blocker reported;
  • GitHub checks are green and merge state is clean.

Could you please re-check and clear mrwk:needs-info if this now satisfies the requested change?

@ramimbo ramimbo removed the mrwk:needs-info (More information needed) label May 29, 2026
@ramimbo
Owner

ramimbo commented May 29, 2026

Re-checked current head 1165d5f. The raw-control-character search blocker is cleared: raw q is validated before trimming across the HTTP search paths and MCP list_bounties, with tab/newline/DEL coverage. Waiting for a second useful current-head review before merge.


4 participants