Skip to content

docs(inference): clarify local inference routing#190

Merged
miyoungc merged 4 commits intomainfrom
pmlocek/inference-docs-pr
Mar 10, 2026
Merged

docs(inference): clarify local inference routing#190
miyoungc merged 4 commits intomainfrom
pmlocek/inference-docs-pr

Conversation

@pimlock
Copy link
Collaborator

@pimlock pimlock commented Mar 10, 2026

Summary

  • update the inference docs to distinguish external inference governed by network policies from the special inference.local endpoint
  • document inference.local as the local-privacy path that routes to the gateway-configured model and rename the configure page to configure
  • refresh the CLI, policy schema, and OpenCode tutorial to match the current inference configuration flow

Verification

  • uv run sphinx-build -b html docs _build/docs
  • mise run pre-commit (repo-wide checks ran; existing unrelated clippy warnings remain)

$ nemoclaw provider create --name nvidia-prod --type nvidia --from-existing
```

You can also use `openai` or `anthropic` providers.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: suggest writing "provider types."

Or switch providers without repeating the current model manually:

```console
$ nemoclaw inference update --provider openai-prod
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to update the provider without updating the model?

client = OpenAI(base_url="https://inference.local/v1", api_key="dummy")

response = client.chat.completions.create(
model="anything",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do we recommend users pass here? Can model be omitted?

@miyoungc miyoungc added the area:docs Documentation and examples label Mar 10, 2026
@miyoungc miyoungc added this to the gtc milestone Mar 10, 2026
@miyoungc miyoungc merged commit 544fc41 into main Mar 10, 2026
16 checks passed
@miyoungc miyoungc deleted the pmlocek/inference-docs-pr branch March 10, 2026 04:34
drew pushed a commit that referenced this pull request Mar 16, 2026
* docs(inference): clarify local inference routing

* docs(inference): update provider and model examples

* fix doc build

---------

Co-authored-by: Miyoung Choi <miyoungc@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:docs Documentation and examples

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants