feat: Implement run_moderation for all safety providers with NotImplementedError #4662
What does this PR do?
Closes #4605 by implementing a reusable mixin that provides `run_moderation` functionality for the safety providers that lack `run_moderation`, delegating to their existing `run_shield` implementations. This enables OpenAI-compatible moderation API support across the NVIDIA, Bedrock, SambaNova, and PromptGuard providers without duplicating code.
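For reference, a minimal sketch of the delegation idea described above (the class and type names here are simplified stand-ins, not the actual llama-stack interfaces added by this PR):

```python
# Simplified sketch of shield-to-moderation delegation. ShieldResult and
# ModerationResult are stand-in types, not the real llama-stack datatypes.
import uuid
from dataclasses import dataclass, field


@dataclass
class ShieldResult:
    # Stand-in for a run_shield response; violation_message is None when the
    # input was not blocked.
    violation_message: str | None = None
    metadata: dict = field(default_factory=dict)


@dataclass
class ModerationResult:
    flagged: bool
    categories: dict
    category_scores: dict
    user_message: str | None
    metadata: dict


class ShieldToModerationMixin:
    """Gives providers that only implement run_shield() a run_moderation()."""

    async def run_shield(self, shield_id: str, messages: list[dict]) -> ShieldResult:
        raise NotImplementedError  # supplied by the concrete safety provider

    async def run_moderation(self, input: str | list[str], model: str) -> dict:
        inputs = [input] if isinstance(input, str) else input
        results = []
        for text in inputs:
            shield_result = await self.run_shield(
                shield_id=model, messages=[{"role": "user", "content": text}]
            )
            blocked = shield_result.violation_message is not None
            results.append(
                ModerationResult(
                    flagged=blocked,
                    categories={"unsafe": True} if blocked else {},
                    category_scores={"unsafe": 1.0} if blocked else {},
                    user_message=shield_result.violation_message,
                    metadata=shield_result.metadata,
                )
            )
        return {"id": f"modr-{uuid.uuid4()}", "model": model, "results": results}
```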
Test Plan

Added unit tests in `tests/unit/providers/utils/test_safety_mixin.py`.
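A hypothetical test along these lines (illustrative only; it builds on the mixin sketch above, and the module path and `FakeProvider` are made-up names, not the contents of the actual test file):

```python
# Requires the pytest-asyncio plugin for the asyncio marker.
import pytest

from safety_mixin_sketch import ShieldResult, ShieldToModerationMixin  # hypothetical module


class FakeProvider(ShieldToModerationMixin):
    """Fakes run_shield() so the delegation logic can be tested in isolation."""

    def __init__(self, violation_message=None):
        self._violation_message = violation_message

    async def run_shield(self, shield_id, messages):
        return ShieldResult(violation_message=self._violation_message)


@pytest.mark.asyncio
async def test_unblocked_input_is_not_flagged():
    response = await FakeProvider().run_moderation("hello", model="nemo-guardrail")
    assert response["results"][0].flagged is False


@pytest.mark.asyncio
async def test_blocked_input_is_flagged():
    provider = FakeProvider(violation_message="Sorry I cannot do this.")
    response = await provider.run_moderation("bad words", model="nemo-guardrail")
    result = response["results"][0]
    assert result.flagged is True
    assert result.categories == {"unsafe": True}
```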
Tested against a Llama Stack server with the following configuration:
Example requests
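For context, requests of roughly this shape produced the responses below. The base URL, port, API key, and the run-shield path are assumptions about a local Llama Stack deployment, not values taken from this PR:

```python
# Illustrative client-side requests; paths and addresses are assumptions.
import requests
from openai import OpenAI

BASE_URL = "http://localhost:8321"  # assumed Llama Stack server address

# Shield check; the exact path and payload shape may differ by stack version.
shield_response = requests.post(
    f"{BASE_URL}/v1/safety/run-shield",
    json={
        "shield_id": "nemo-guardrail",
        "messages": [{"role": "user", "content": "hello"}],
    },
)
print(shield_response.json())  # e.g. {"violation": null} for unblocked input

# Moderation via the OpenAI-compatible moderations API this PR enables.
client = OpenAI(base_url=f"{BASE_URL}/v1", api_key="not-needed")
moderation = client.moderations.create(model="nemo-guardrail", input="hello")
print(moderation.results[0].flagged)  # False for unblocked input
```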
`run-shield` with unblocked input:

```json
{ "violation": null }
```

`moderations` with unblocked input:

```json
{
  "id": "modr-f5ba0dc7-d5a6-47ce-b1cf-e0879f171183",
  "model": "nemo-guardrail",
  "results": [
    {
      "flagged": false,
      "categories": {},
      "category_applied_input_types": {},
      "category_scores": {},
      "user_message": null,
      "metadata": {}
    }
  ]
}
```

`run-shield` with blocked input:

```json
{
  "violation": {
    "violation_level": "error",
    "user_message": "Sorry I cannot do this.",
    "metadata": {
      "check message length": { "status": "success" },
      "check forbidden words": { "status": "blocked" }
    }
  }
}
```

`moderations` with blocked input:

```json
{
  "id": "modr-4d0772c0-54a8-4790-bd2c-24feaf6b3b55",
  "model": "nemo-guardrail",
  "results": [
    {
      "flagged": true,
      "categories": { "unsafe": true },
      "category_applied_input_types": { "unsafe": [ "text" ] },
      "category_scores": { "unsafe": 1.0 },
      "user_message": "Sorry I cannot do this.",
      "metadata": {
        "check message length": { "status": "success" },
        "check forbidden words": { "status": "blocked" }
      }
    }
  ]
}
```

The local NeMo server was started with
```
nemoguardrails server --config configs --port 8001 --default-config-id nemo
```

using this configuration from this branch; this assumes access to an LLM with OpenAI-compatible endpoints.

cc @nathan-weinberg @cdoern @raghotham @leseb @franciscojavierarceo