Build your own benchmark feature integration #174

Open
anushaknvidia wants to merge 7 commits into NVIDIA-NeMo:agentic from anushaknvidia:byob

Conversation

@anushaknvidia
No description provided.

@anushaknvidia marked this pull request as ready for review April 29, 2026 08:51
@anushaknvidia changed the title from "Bring your own benchmark feature integration" to "Build your own benchmark feature integration" Apr 29, 2026
|---------|-------------|
| [RAG Agent with Nemotron RAG Models](RAG Agent with Nemotron RAG Models/README.md) | End-to-end example of a Retrieval-Augmented Generation (RAG) agent workflow using Nemotron RAG models through Hugging Face and Nemotron 9B hosted through build.nvidia.com models |
| [Data Science ML Agent](Data Science ML Agent/README.md) | End-to-end example of a natural language-driven data science and machine learning agent powered by NVIDIA GPUs. The agent allows users to perform data exploration, model training, and hyperparameter optimization interactively using RAPIDS cuDF and cuML for GPU acceleration.|
| [Build Your Own Benchmark (BYOB)](build-your-own-benchmark/build_mcq_benchmark.ipynb) | End-to-end **Build Your Own Benchmark** flow for MMLU-style multiple-choice: prepare seed data, run a staged Data Designer pipeline (draft, judge, dedupe, distractor checks), and export a `benchmark.parquet`. See `build-your-own-benchmark/download_wikipedia_data.ipynb` to build source text from Wikipedia. |

Compilation fails because of invalid nested f-strings.
Please add test cases for the BYOB pipeline.
There is no README for the review.
Dependencies are not declared in pyproject.toml or locked in uv.lock.
Please run Nemotron's pre-commit hooks.
What is the assets folder mentioned in the use-case README?
Please take a look at the translation branch for a checklist of agentic skills.
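
For the f-string point above, here is a minimal sketch of one common cause, assuming the failure comes from reusing the outer quote character inside a replacement field. That form is only accepted on Python 3.12 and later (PEP 701). The `row` record and the `render_prompt` and `compiles` helpers are hypothetical names for illustration, not code from this PR.

```python
import sys

# Hypothetical record; field names are illustrative, not taken from the PR.
row = {"question": "What is 2 + 2?", "choices": ["3", "4", "5", "22"]}

# Source text of an f-string that reuses double quotes inside the braces.
# It is kept as a plain string so this file still parses on older interpreters.
nested_source = 'f"{row["question"]} ({", ".join(row["choices"])})"'


def compiles(source: str) -> bool:
    """Return True if the current interpreter can compile the expression."""
    try:
        compile(source, "<snippet>", "eval")
        return True
    except SyntaxError:
        return False


def render_prompt(row: dict) -> str:
    """Portable rewrite: evaluate the pieces first, then interpolate names."""
    question = row["question"]
    choices = ", ".join(row["choices"])
    return f"{question} ({choices})"


if __name__ == "__main__":
    print("nested form compiles here:", compiles(nested_source))
    print(render_prompt(row))
```

Hoisting the subexpressions into local variables keeps the formatted string valid on every supported Python version, so the notebook does not depend on the interpreter the reviewer happens to run.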

3 participants