[daily-ai-review] Suspicious zero-relevance classification with high-signal articles #151

@github-actions

This run classified 70 unseen articles but found 0 relevant items, despite several articles with highly relevant titles and content.

Evidence of likely false negatives:

  • "שדדו 35 אלף שקלים מקורבן זנות בת"א" (35,000 shekels robbed from a prostitution victim in Tel Aviv)
  • "בתור אישה מקבלים יותר": גברים במעגל הזנות" ("As a woman you get more": men in the cycle of prostitution)
  • "עלייה במאות אחוזים באכיפת החוק לאיסור הזנות בישראל" (Increase of hundreds of percent in enforcement of the law prohibiting prostitution in Israel)
  • "יבוצעו השלמות חקירה בפרשת "משחקי חברה" שבה אייל גולן נחשד בעבירות מין" (Further investigation to be carried out in the "Social Games" case, in which Eyal Golan is suspected of sex offenses)

Pattern:
All 70 articles were classified as "not_relevant" with "high" confidence, including articles that appear directly relevant to prostitution, trafficking, and enforcement.
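
The pattern above (every item receiving the same label at the same confidence) could be detected automatically before a run is accepted. The following is a minimal sketch, not the project's actual check: the field names `label` and `confidence` and the `min_items` threshold are assumptions, since the run snapshot schema is not shown in this issue.

```python
from collections import Counter


def is_degenerate_run(items, min_items=20):
    """Return True when a sufficiently large run has a single verdict.

    `items` is a list of dicts with hypothetical keys "label" and
    "confidence"; the real snapshot format may differ.
    """
    if len(items) < min_items:
        # Small runs can legitimately be uniform, so do not flag them.
        return False
    verdicts = Counter((item["label"], item["confidence"]) for item in items)
    return len(verdicts) == 1


# Illustrative data mirroring the reported run: 70 identical verdicts.
reported_run = [{"label": "not_relevant", "confidence": "high"} for _ in range(70)]
print(is_degenerate_run(reported_run))
```

A run flagged this way is not necessarily wrong, but it is unusual enough to justify the manual review requested here.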

Next steps:

  1. Review classifier prompt/rules for overly restrictive criteria
  2. Manually verify classification of the articles listed above
  3. Check if classifier is incorrectly filtering out Hebrew content or certain article types
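
To support steps 2 and 3, one option is a simple keyword screen that shortlists "not_relevant" items whose titles contain high-signal Hebrew terms. This is only a sketch under stated assumptions: the term list, the `title` and `label` field names, and the third sample title are hypothetical, and substring matching is a crude stand-in for proper Hebrew tokenization.

```python
# Hypothetical high-signal terms: prostitution, human trafficking, sex offenses.
HIGH_SIGNAL_TERMS = ("זנות", "סחר בבני אדם", "עבירות מין")


def shortlist_for_review(items, terms=HIGH_SIGNAL_TERMS):
    """Return (title, matched_terms) for not_relevant items that hit a term."""
    shortlisted = []
    for item in items:
        if item["label"] != "not_relevant":
            continue
        # Substring matching also catches prefixed forms such as "הזנות".
        hits = [term for term in terms if term in item["title"]]
        if hits:
            shortlisted.append((item["title"], hits))
    return shortlisted


sample_items = [
    {"title": 'שדדו 35 אלף שקלים מקורבן זנות בת"א', "label": "not_relevant"},
    {"title": "עלייה במאות אחוזים באכיפת החוק לאיסור הזנות בישראל", "label": "not_relevant"},
    # Hypothetical unrelated title ("Weather: rain in the north").
    {"title": "מזג האוויר: גשם בצפון", "label": "not_relevant"},
]

for title, hits in shortlist_for_review(sample_items):
    print(title, hits)
```

If the shortlist is non-empty while the classifier reports zero relevant items, that points toward the prompt or rules rather than the input feed.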

Review context

  • Run timestamp: 2026-05-17T06:47:46.645956+00:00

  • Run snapshot: state_repo/news_items/ingest/runs/2026-05-17T06-47-46-645956Z.json

  • Debug summary: state_repo/news_items/ingest/logs/2026-05-17T06-47-46-645956Z.summary.json

  • Debug log: state_repo/news_items/ingest/logs/2026-05-17T06-47-46-645956Z.json

  • Workflow run: https://github.com/DataHackIL/tfht_enforce_idx/actions/runs/25983853246
