The orchestration layer is heading in the right direction, but the current source classification is still pretty narrow and URL-heuristic heavy. That works for the examples we already have, but it’s going to look flimsy once search gets broader.
I’d like the worker/orchestrator to get smarter about what counts as strong evidence, what is basically low-value filler, and when two URLs are really saying the same thing. Official docs and API pages should still win, but we probably need better domain handling, duplicate collapsing, and less brittle classification than a handful of hardcoded URL checks.
I'm not trying to build a giant scoring engine here, just enough improvement that web_explore stops depending on lucky result ordering.
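To make the ask a bit more concrete, here is a rough sketch of the kind of thing I have in mind, not a proposed implementation. It canonicalizes URLs so trivially different links collapse into one, classifies by registered domain rather than by substring checks, and sorts by tier so official sources come first regardless of search order. Everything named here (`canonicalize`, `classify`, `collapse_and_rank`, the tier labels, and the domain and tracking-parameter lists) is hypothetical and not taken from the current code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical lists, for illustration only; the real ones would live in config.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref", "fbclid", "gclid"}
OFFICIAL_DOMAINS = {"docs.python.org", "developer.mozilla.org"}
LOW_VALUE_DOMAINS = {"pinterest.com"}

# Lower rank sorts first.
TIER_RANK = {"official": 0, "other": 1, "low_value": 2}


def canonicalize(url: str) -> str:
    """Reduce a URL to a comparable form.

    Assumptions in this sketch: http and https are treated as equivalent,
    a leading "www." is ignored, ports and fragments are dropped, and
    tracking parameters are removed.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    )
    return urlunsplit(("https", host, path, urlencode(kept), ""))


def matches_domain(host: str, domains: set[str]) -> bool:
    """True if host is one of the domains or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def classify(url: str) -> str:
    """Assign a coarse evidence tier based on the host, not URL substrings."""
    host = urlsplit(canonicalize(url)).hostname or ""
    if matches_domain(host, OFFICIAL_DOMAINS):
        return "official"
    if matches_domain(host, LOW_VALUE_DOMAINS):
        return "low_value"
    return "other"


def collapse_and_rank(urls: list[str]) -> list[str]:
    """Collapse duplicates, then order by tier, keeping search order within a tier."""
    first_seen: dict[str, int] = {}
    for position, url in enumerate(urls):
        first_seen.setdefault(canonicalize(url), position)
    return sorted(first_seen, key=lambda u: (TIER_RANK[classify(u)], first_seen[u]))


if __name__ == "__main__":
    results = [
        "http://www.pinterest.com/pin/123/?utm_source=x",
        "https://example.com/blog/json-tips",
        "https://docs.python.org/3/library/json.html?utm_source=newsletter",
        "http://docs.python.org/3/library/json.html#module-json",
    ]
    for url in collapse_and_rank(results):
        print(classify(url), url)
```

In this example the two docs.python.org links collapse into one and move to the top even though they were returned last, which is the ordering independence I'm after. Tiers stay coarse on purpose: three buckets and a stable sort, no numeric scoring.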