Background
citation_verification.verify_citations() is currently binary: a citation in the LLM's report either matches a URL in the source registry (verified) or is silently stripped from the report (url_not_in_registry). Two failure modes follow from that:
Silent drops on edge cases. When the registry's URL extraction or the LLM's URL formatting differs in any small way (embedded commas in paths, tracking params, fragment differences, casing), the citation disappears with no signal to the user. The recent comma-truncation regression (see "fix: keep commas inside URL paths in citation source extractor", #209) is one symptom; URL normalization differences are another.
No signal to the UI. The frontend has no way to render uncertain citations with a confidence badge or a "weakly verified" indicator. The user sees either a clean numbered citation or no citation at all, with the supporting source still available in the data sources panel.
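To make the first failure mode concrete, here is a minimal sketch of how a cited URL can miss an exact registry lookup while still pointing at the same page. The helper `normalize_for_match`, the `TRACKING_PARAMS` set, and the example URLs are hypothetical and chosen for illustration; this is not the behavior of the existing `_normalize_url()`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of tracking parameters; the real list is an assumption.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid"}


def normalize_for_match(url: str) -> str:
    """Illustrative normalization: lowercase scheme and host, drop the
    fragment, drop tracking parameters, and strip a trailing slash."""
    parts = urlsplit(url.strip())
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    path = parts.path.rstrip("/") or "/"
    # The empty last element discards the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), "")
    )


# Made-up URLs: same page, but the cited form differs in host casing,
# trailing slash, tracking parameter, and fragment.
registry_url = "https://example.com/reports/2024,q3/summary"
cited_url = "https://Example.com/reports/2024,q3/summary/?utm_source=chat#findings"

exact_match = registry_url == cited_url
normalized_match = normalize_for_match(registry_url) == normalize_for_match(cited_url)
```

With a purely exact comparison the citation above would be stripped, even though the comma-containing path is identical and only cosmetic parts of the URL differ.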
Proposal
Move the verifier from binary to scored:
Each citation gets a verification record: {verified: bool, confidence: float, match_kind: 'exact' | 'normalized' | 'host-only' | 'unmatched', diagnostics: ...}.
Citations below a threshold are still passed through to the response with their score, not removed. The UI decides how to render based on confidence.
The verifier emits structured telemetry (count and reason for each removal/downgrade), so we can monitor false-positive removals over time. (Today there's no metric for this; the only signal is user reports.)
Add a confidence badge in the UI's citation card (frontends/ui/src/features/layout/components/CitationCard.tsx) that surfaces the score.
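One possible shape for the scored verifier is sketched below. The record fields follow the proposal; everything else is an assumption made for illustration: the names `score_citation` and `verify_scored`, the per-kind confidence values, the 0.8 threshold, and the loose comparison key. It is not the existing `verify_citations()` implementation.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Literal
from urllib.parse import urlsplit

MatchKind = Literal["exact", "normalized", "host-only", "unmatched"]

# Hypothetical confidence per match kind and threshold; values are illustrative.
CONFIDENCE = {"exact": 1.0, "normalized": 0.9, "host-only": 0.4, "unmatched": 0.0}
VERIFIED_THRESHOLD = 0.8


@dataclass
class VerificationRecord:
    url: str
    verified: bool
    confidence: float
    match_kind: MatchKind
    diagnostics: dict = field(default_factory=dict)


def _loose_key(url: str) -> tuple:
    """Comparison key that ignores host casing, trailing slash, and fragment."""
    parts = urlsplit(url.strip())
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/"), parts.query)


def _host(url: str) -> str:
    return urlsplit(url.strip()).netloc.lower()


def score_citation(url: str, registry_urls: set) -> VerificationRecord:
    """Classify one cited URL against the registry, strongest match first."""
    if url in registry_urls:
        kind = "exact"
    elif any(_loose_key(url) == _loose_key(known) for known in registry_urls):
        kind = "normalized"
    elif any(_host(url) == _host(known) for known in registry_urls):
        kind = "host-only"
    else:
        kind = "unmatched"
    confidence = CONFIDENCE[kind]
    return VerificationRecord(
        url=url,
        verified=confidence >= VERIFIED_THRESHOLD,
        confidence=confidence,
        match_kind=kind,
        diagnostics={"registry_size": len(registry_urls)},
    )


def verify_scored(cited_urls: list, registry_urls: set):
    """Score every citation; nothing is dropped, and outcomes are counted."""
    telemetry = Counter()
    records = []
    for url in cited_urls:
        record = score_citation(url, registry_urls)
        telemetry[record.match_kind] += 1
        records.append(record)  # low-confidence citations pass through with their score
    return records, telemetry


registry = {"https://example.com/a,b/page"}
cited = [
    "https://example.com/a,b/page",
    "https://EXAMPLE.com/a,b/page/#top",
    "https://example.com/other",
    "https://elsewhere.test/x",
]
records, telemetry = verify_scored(cited, registry)
```

In this sketch the caller receives one record per citation plus a counter keyed by match kind, which could feed whatever metrics backend the project uses. The UI would then choose how to render each citation from `confidence` and `match_kind` rather than the verifier deciding by removal.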
Out of scope for this issue
References
src/aiq_agent/common/citation_verification.py: verify_citations(), _normalize_url(), SourceRegistry.resolve_url().
frontends/ui/src/features/layout/components/CitationCard.tsx.