Add optimizer backend selection to embed finetune recipe#211

Draft
oliverholworthy wants to merge 3 commits into main from oholworthy/embed-optimizer-backends

@oliverholworthy
Contributor

Adds optimizer backend selection to the embed stage2 finetune recipe.

  • Switches embed finetune Docker/Slurm defaults to nvcr.io/nvidia/nemo-automodel:26.04
  • Updates stage2 finetune to nemo-automodel==0.4.0
  • Defaults the Automodel base config to Transformer Engine FusedAdam
  • Adds optimizer_backend: auto | fused_adam | flash_adamw
  • Adds FlashOptim FlashAdamW fallback support for local environments without Transformer Engine
  • Aligns the bi-encoder base config with the latest Automodel retrieval example
  • Adds trust_remote_code support for the default nvidia/llama-nemotron-embed-1b-v2 model
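
For reviewers, the intended `optimizer_backend` behavior can be pictured roughly as below. This is only a sketch under assumptions, not the code in `train.py`: the function names, the `te_available` parameter, and the error handling are hypothetical, and it assumes `auto` prefers Transformer Engine FusedAdam when the `transformer_engine` module is importable and falls back to FlashAdamW otherwise.

```python
from importlib.util import find_spec
from typing import Optional

# The three values listed for `optimizer_backend` in this PR.
VALID_BACKENDS = ("auto", "fused_adam", "flash_adamw")


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    try:
        return find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def resolve_optimizer_backend(requested: str, te_available: Optional[bool] = None) -> str:
    """Hypothetical helper: map a requested backend to a concrete one.

    `te_available` can be passed explicitly (handy in tests); when omitted,
    availability of Transformer Engine is probed via its module name.
    """
    if requested not in VALID_BACKENDS:
        raise ValueError(
            f"optimizer_backend must be one of {VALID_BACKENDS}, got {requested!r}"
        )
    if te_available is None:
        te_available = module_available("transformer_engine")
    if requested == "auto":
        # Assumption: prefer FusedAdam, fall back to FlashAdamW without TE.
        return "fused_adam" if te_available else "flash_adamw"
    if requested == "fused_adam" and not te_available:
        # Assumption: an explicit request that cannot be met fails loudly.
        raise RuntimeError("fused_adam requested but Transformer Engine is not installed")
    return requested
```

Passing `te_available` explicitly keeps the selection logic testable on machines without Transformer Engine, which is the local-environment case the FlashAdamW fallback targets.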

Testing

  • uv lock --check --project src/nemotron/recipes/embed/stage2_finetune
  • uv run ruff check src/nemotron/recipes/embed/stage2_finetune/train.py tests/recipes/embed/test_finetune_optimizer_backends.py
  • uv run pytest tests/recipes/embed/test_finetune_optimizer_backends.py tests/recipes/embed/test_config_models.py::TestFinetuneConfigValidation tests/recipes/embed/test_yaml_config_compat.py -q
  • uv run --extra data-sdg nemotron embed finetune --help
  • uv run --extra data-sdg nemotron embed finetune -c default --dry-run

Signed-off-by: Oliver Holworthy <1216955+oliverholworthy@users.noreply.github.com>
@oliverholworthy oliverholworthy self-assigned this May 20, 2026