Conversation

@r-bit-rry
Contributor

This PR fixes issue #4189, where the NVIDIA safety provider was calling an incorrect API endpoint when communicating with the NeMo Guardrails service.

Problem

The NVIDIA safety provider implementation was calling /v1/guardrail/checks, which does not exist in the NeMo Guardrails API. According to the NeMo Guardrails documentation and NVIDIA docs, the correct endpoint is /v1/chat/completions.

This caused:

  • 500 Internal Server Error when using NVIDIA safety shields
  • Complete failure of safety filtering functionality
  • Inability to use NeMo Guardrails for content moderation

Solution

1. Fixed Endpoint (nvidia.py:144)

Before:

response = await self._guardrails_post(path="/v1/guardrail/checks", data=request_data)

After:

response = await self._guardrails_post(path="/v1/chat/completions", data=request_data)

2. Simplified Request Format (nvidia.py:140-143)

Before:

request_data = {
    "model": self.model,
    "messages": [{"role": message.role, "content": message.content} for message in messages],
    "temperature": self.temperature,
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "max_tokens": 160,
    "stream": False,
    "guardrails": {
        "config_id": self.config_id,
    },
}

After:

request_data = {
    "config_id": self.config_id,
    "messages": [{"role": message.role, "content": message.content} for message in messages],
}

The simplified format matches the NeMo Guardrails API specification and removes unnecessary inference parameters that were meant for LLM completion, not safety checks.
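To illustrate the change, here is a minimal sketch of the simplified payload construction. The `build_guardrails_request` helper is hypothetical (the actual provider builds the dict inline in nvidia.py); the field names mirror the "After" snippet above.

```python
# Hypothetical helper mirroring the simplified request built in nvidia.py.
# Only config_id and messages are sent; inference parameters such as
# temperature and max_tokens are deliberately omitted for safety checks.
def build_guardrails_request(config_id: str, messages: list[dict]) -> dict:
    """Build the minimal payload NeMo Guardrails expects at /v1/chat/completions."""
    return {
        "config_id": config_id,
        "messages": [{"role": m["role"], "content": m["content"]} for m in messages],
    }

request_data = build_guardrails_request(
    "demo-config",
    [{"role": "user", "content": "Hello!"}],
)
```

The same dict is then passed to the provider's `_guardrails_post` call shown above.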

Testing

Test Results

✅ All 10 tests passing:

Unit Tests (8/8):
  ✓ test_register_shield_with_valid_id
  ✓ test_register_shield_without_id
  ✓ test_run_shield_allowed
  ✓ test_run_shield_blocked
  ✓ test_run_shield_not_found
  ✓ test_run_shield_http_error
  ✓ test_init_nemo_guardrails
  ✓ test_init_nemo_guardrails_invalid_temperature

E2E Tests (2/2) (not part of this PR):
  ✓ test_nvidia_safety_with_correct_endpoint
  ✓ test_nemo_guardrails_api_endpoint_documentation

Manual Verification

The reproduction script demonstrates the fix:

$ python tests/integration/safety/reproduce_issue_4189.py

✓ SUCCESS: Issue #4189 has been FIXED!
  Response received without errors
  Violation detected: True
  Violation level: ViolationLevel.ERROR
  User message: Sorry I cannot do this.

✓ All tests passing!
  - Blocked messages are detected
  - Safe messages are allowed
  - No 500 errors from wrong endpoint

Validation Against NeMo Guardrails API

This fix aligns with the official NeMo Guardrails API specification:

Endpoint: POST /v1/chat/completions

Request Format:

{
  "config_id": "demo-config",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}

Response Format:

{
  "role": "assistant",
  "content": "Response text",
  "status": "allowed|blocked",
  "rails_status": {
    "reason": "...",
    "triggered_rails": [...]
  }
}
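As a sketch of how a caller might interpret this response, the snippet below checks the "status" field shown above. This is illustrative only; the real provider maps a blocked status onto a SafetyViolation, and the field names are taken from the response format in this PR description.

```python
# Illustrative check of the "status" field in the NeMo Guardrails
# /v1/chat/completions response; "blocked" means a rail was triggered.
def is_blocked(response_json: dict) -> bool:
    """Return True when the guardrails response reports a blocked message."""
    return response_json.get("status") == "blocked"

resp = {
    "role": "assistant",
    "content": "Sorry I cannot do this.",
    "status": "blocked",
    "rails_status": {"reason": "content policy", "triggered_rails": []},
}
assert is_blocked(resp)                       # blocked content is flagged
assert not is_blocked({"status": "allowed"})  # allowed content passes through
```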

Breaking Changes

None. This is a bug fix that makes the implementation work as originally intended.

@meta-cla bot added the CLA Signed label (managed by the Meta Open Source bot) on Nov 20, 2025.
@ashwinb
Contributor

ashwinb commented Nov 20, 2025


where is the mentioned reproduction script? Either this could be included as a link to a Gist, or better still -- the PR summary should perhaps either be hand edited to look more like what a human engineer would write (just do not include menial details which are irrelevant for a change of such small magnitude) or the LLM could be guided with this precise instruction so the generated Summary feels like that.

@r-bit-rry r-bit-rry requested a review from cdoern as a code owner November 23, 2025 11:13
@r-bit-rry
Contributor Author

> where is the mentioned reproduction script? Either this could be included as a link to a Gist, or better still -- the PR summary should perhaps either be hand edited to look more like what a human engineer would write (just do not include menial details which are irrelevant for a change of such small magnitude) or the LLM could be guided with this precise instruction so the generated Summary feels like that.

Thanks for the reply. I went ahead and ran further simulations against local instances of NeMo and an Ollama backend to identify more issues with the guardrails calls and discrepancies against the original NVIDIA documentation.
I'll add several more changes and the testing scripts via Gist (they require setting up an environment with Docker and are not part of the general infrastructure of llama-stack).

@r-bit-rry
Contributor Author

@ashwinb
The /v1/guardrail/checks endpoint does not exist in the current NeMo Guardrails codebase.

The server API implementation only includes two main endpoints:

  • /v1/rails/configs: returns the list of available guardrails configurations (api.py:277-303)
  • /v1/chat/completions: the main endpoint for chat completions with guardrails applied (api.py:369-374)

This contradicts the official documentation I mentioned earlier.
I'm going to run a few more checks to verify that the API truly behaves as expected, but I don't like the discrepancy with the official documentation. I'll follow up with the original author of the issue to see whether he observes similar behavior on his instances.

@mattf
Collaborator

mattf commented Nov 24, 2025

@jiayin-nvidia @rmkraus can you shed some light on this?
