Feature/make article retrieval great again #151

Merged
bernomone merged 2 commits into main from feature/make-article-retrieval-great-again on Jan 24, 2026

Conversation

@bernomone
Collaborator

  • Add argparse to evals so that we can select only some models/evals to run
  • Make article.py an agent again
  • Fix a test that used a non-existent article and make it "fuzzy" (it is enough for the returned title to contain some keywords)
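
As a rough illustration of the first bullet, selecting models and evals from the command line with argparse might look like the sketch below. The flag names `--models` and `--evals`, the `parse_args` helper, and the two registry lists are assumptions made for this example; only the names `claude-aws-bedrock` and `article` come from the eval output in this PR, and the actual code may be organized differently.

```python
import argparse

# Hypothetical registries: the real lists of models and evals live in the repository.
AVAILABLE_MODELS = ["claude-aws-bedrock"]
AVAILABLE_EVALS = ["article"]


def parse_args(argv=None):
    """Parse the (assumed) --models and --evals flags; default to running everything."""
    parser = argparse.ArgumentParser(description="Run evals for selected models.")
    parser.add_argument(
        "--models",
        nargs="+",
        choices=AVAILABLE_MODELS,
        default=AVAILABLE_MODELS,
        help="Models to evaluate (default: all).",
    )
    parser.add_argument(
        "--evals",
        nargs="+",
        choices=AVAILABLE_EVALS,
        default=AVAILABLE_EVALS,
        help="Evals to run (default: all).",
    )
    return parser.parse_args(argv)


# Passing an explicit list stands in for command-line arguments here.
args = parse_args(["--models", "claude-aws-bedrock", "--evals", "article"])
print("Models:", ", ".join(args.models))
print("Evals:", ", ".join(args.evals))
```

Using `choices` means a misspelled model or eval name is rejected by argparse before any eval starts running.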

Article evals passed:

Models: claude-aws-bedrock
Evals: article

===== Model: claude-aws-bedrock =====

Running article evals...
Evaluating case: Tell me about paper 'Entity Embeddings of Categorical Variables'
Evaluating case: What is paper 'The deterministic Kermack-McKendrick model bounds the general stochastic epidemic' about?
Evaluating case: Tell me about paper 'https://arxiv.org/pdf/1604.06737'
Evaluating case: What is paper https://arxiv.org/pdf/1602.01730 about?
Evaluating case: Find this paper 'Quark Gluon plasma and AI'
Total cases: 5
✅ Passed: 5
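
The "fuzzy" check mentioned in the description (a title passes if it contains some keywords) could be sketched roughly as follows. The function name `title_matches_keywords`, the `min_hits` parameter, and the sample titles are hypothetical and are not taken from the repository's tests.

```python
import re


def title_matches_keywords(title, keywords, min_hits=1):
    """Return True if at least `min_hits` keywords appear as whole words in the title."""
    # Lowercase and split on non-alphanumerics so "Quark-Gluon" yields "quark" and "gluon".
    words = set(re.findall(r"[a-z0-9]+", title.lower()))
    hits = sum(1 for keyword in keywords if keyword.lower() in words)
    return hits >= min_hits


# Made-up title for illustration only; not a real retrieval result.
found = title_matches_keywords(
    "Quark-Gluon Plasma meets AI",
    ["quark", "gluon", "plasma"],
    min_hits=2,
)
print(found)
```

Matching on whole words rather than substrings avoids accidental hits (for example, "ai" inside "explain"), at the cost of missing plural or inflected forms.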

Owner

@martinapugliese left a comment

Great work, Berna

@bernomone merged commit fd6dd3e into main on Jan 24, 2026
3 checks passed

2 participants