Add AI research tool evaluation harness #158

Open
zhengjynicolas wants to merge 1 commit into SCIBASE-AI:main from zhengjynicolas:codex/research-tool-eval-harness-13

@zhengjynicolas

Summary

  • add a dependency-free evaluation harness for AI-assisted research tool outputs
  • validate summary mode contracts, evidence grounding, peer-review diagnostics, compliance checks, citation coverage, confidence, DOI metadata, and APA/Nature formatting
  • include requirement mapping, local tests, CLI demo, and a short H.264 terminal walkthrough video
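
To make a few of the checks listed above concrete, here is a minimal, dependency-free sketch of citation coverage, DOI format, and confidence validation. It is not taken from the PR's code: the function names (`citationCoverage`, `evaluateOutput`), the field names (`claims`, `sources`, `evidence`, `doi`, `confidence`), the 0.8 coverage threshold, and the DOI pattern are all assumptions made for illustration.

```javascript
// Hypothetical sketch: every name, field, and threshold here is an assumption,
// not the harness's actual API.

// Loose DOI shape check: "10.", a 4-9 digit registrant code, "/", then a suffix.
const DOI_PATTERN = /^10\.\d{4,9}\/\S+$/;

// Fraction of claims whose evidence list is non-empty and only cites known sources.
function citationCoverage(claims, sources) {
  if (claims.length === 0) return 0;
  const knownIds = new Set(sources.map((s) => s.id));
  const grounded = claims.filter(
    (c) =>
      Array.isArray(c.evidence) &&
      c.evidence.length > 0 &&
      c.evidence.every((id) => knownIds.has(id))
  );
  return grounded.length / claims.length;
}

// Collects human-readable issues instead of throwing, so one run reports everything.
function evaluateOutput(output, minCoverage = 0.8) {
  const issues = [];

  const coverage = citationCoverage(output.claims, output.sources);
  if (coverage < minCoverage) {
    issues.push(`citation coverage ${coverage.toFixed(2)} is below ${minCoverage}`);
  }

  for (const source of output.sources) {
    if (!DOI_PATTERN.test(source.doi ?? "")) {
      issues.push(`source ${source.id} has a malformed DOI`);
    }
  }

  const c = output.confidence;
  if (!Number.isFinite(c) || c < 0 || c > 1) {
    issues.push("confidence must be a finite number in [0, 1]");
  }

  return { coverage, issues, passed: issues.length === 0 };
}

// Usage: one grounded claim out of two, and one malformed DOI.
const sample = {
  confidence: 0.7,
  sources: [
    { id: "s1", doi: "10.1000/xyz123" },
    { id: "s2", doi: "not-a-doi" },
  ],
  claims: [
    { text: "Claim A", evidence: ["s1"] },
    { text: "Claim B", evidence: [] },
  ],
};

const report = evaluateOutput(sample);
console.log(report.passed, report.issues);
```

Returning a list of issues rather than failing on the first one keeps a single evaluation run useful as a full diagnostic, which suits a CLI demo or test suite of the kind described here.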

Demo video

https://github.com/zhengjynicolas/SCIBASE.AI/blob/codex/research-tool-eval-harness-13/ai-research-tool-evaluation-harness/docs/research-tool-evaluation-demo.mp4

Validation

  • npm run check
  • npm test
  • npm run demo
  • git diff --check
  • ffprobe verified demo video: h264, 1280x720, 20.37s

/claim #13

AI-assisted with OpenAI Codex; I reviewed and locally verified the diff before submission.
