feat: Add Cisco AI Defense integration #1433


Open

rucpande wants to merge 3 commits into NVIDIA-NeMo:develop from rucpande:feature/cisco-ai-defense-integration-1420

+1,773 −0

rucpande commented Oct 1, 2025

Description

Add support for Cisco AI Defense Security, Privacy, and Safety guardrails as a third party API.

Features

Input Protection: Inspect user prompts before processing
Output Protection: Inspect bot responses before delivery
Configuration: Environment-based API configuration
Rails Exceptions: Support for enable_rails_exceptions mode
Logging: Logging for debugging and monitoring

Related Issue(s)

Fixes feature: Add support for Cisco AI Defense API as a guardrail provider for both input (prompt) and output (response) protection in NeMo Guardrails. #1420

Checklist

I've read the CONTRIBUTING guidelines.
I've updated the documentation if applicable.
I've added tests if applicable.
@cparisien thanks for reviewing!


          feat: Add Cisco AI Defense integration

87621aa

- Add AI Defense action for input/output protection
- Add documentation for setup and configuration
- Support for environment-based API key configuration

Fixes NVIDIA-NeMo#1420

Contributor

github-actions bot commented Oct 1, 2025

Documentation preview

https://nvidia-nemo.github.io/Guardrails/review/pr-1433

codecov-commenter commented Oct 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.


Pouyanpi requested a review from Copilot

October 2, 2025 13:44

Copilot AI reviewed


Copilot AI left a comment

Pull Request Overview

This PR adds Cisco AI Defense integration to NeMo Guardrails, providing security guardrails for input and output protection. The integration enables inspection of user prompts and bot responses through Cisco's AI Defense API to detect and block potentially harmful content.

Key changes:

Implementation of AI Defense inspection actions and flows for input/output protection
Configuration support through environment variables (API key and endpoint)
Comprehensive test coverage including unit, integration, and error handling tests

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.


| File | Description |
| --- | --- |
| `nemoguardrails/library/ai_defense/actions.py` | Core AI Defense inspection action with HTTP client for API calls |
| `nemoguardrails/library/ai_defense/flows.v1.co` | Colang v1.0 flow definitions for input/output protection |
| `nemoguardrails/library/ai_defense/flows.co` | Colang v2.0 flow definitions for input/output protection |
| `tests/test_ai_defense.py` | Comprehensive test suite covering unit, integration, and error scenarios |
| `docs/user-guides/community/ai-defense.md` | User documentation for setup and usage |
| `docs/user-guides/guardrails-library.md` | Integration into main guardrails library documentation |
| `examples/configs/ai_defense/config.yml` | Example configuration file |
| `examples/configs/ai_defense/README.md` | Example documentation |



cparisien requested review from Pouyanpi, erickgalinkin and tgasser-nv

October 2, 2025 18:19


          Address PR review comments:

0d9e2a4

- Remove placeholder comment in test_real_api_call_with_safe_output
- Remove debug print statements from test code
- Fix incorrect docstring in ai_defense_text_mapping function

tgasser-nv reviewed


Collaborator

tgasser-nv left a comment

Looks great! Mostly naming nits to address.

Could you also run a local integration test (pytest -m integration) with AI_DEFENSE_API_ENDPOINT and AI_DEFENSE_API_KEY set and copy the result into the description?

nemoguardrails/library/ai_defense/actions.py Outdated

		log = logging.getLogger(__name__)


		def ai_defense_text_mapping(result: dict) -> bool:

Collaborator

tgasser-nv Oct 6, 2025

Could you add a type-annotation here (would dict[str, Any] work with the response?)

nit: Maybe rename to indicate the polarity of the bool returned, i.e. is_ai_defense_text_blocked()? So a True means blocked and False is ok

Author

rucpande Oct 6, 2025

Thanks for the review! I've made the suggested changes.
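
For illustration, the rename and type annotation suggested in this thread could look roughly like the sketch below. It is not the code from this PR: the `is_blocked` key is taken from the docstring quoted further down in the review, and treating a missing key as blocked is an assumption.

```python
from typing import Any


def is_ai_defense_text_blocked(result: dict[str, Any]) -> bool:
    """Return True when the text should be blocked, False when it may pass."""
    # Assumption: a missing "is_blocked" key is treated as blocked (fail closed).
    return bool(result.get("is_blocked", True))
```

With this naming, a call site reads naturally: `if is_ai_defense_text_blocked(result): ...`.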

nemoguardrails/library/ai_defense/actions.py Outdated

+                  user_prompt: Optional[str] = None, bot_response: Optional[str] = None, **kwargs
+              ):
+                  api_key = os.environ.get("AI_DEFENSE_API_KEY")
+                  if api_key is None:

Collaborator

tgasser-nv Oct 6, 2025

nit: Maybe change to if not api_key to catch if the AI_DEFENSE_API_KEY is a falsy value like ""?

Author

rucpande Oct 6, 2025

good point
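
A minimal sketch of the difference between the two checks, assuming a hypothetical helper `read_api_key` (not part of the PR) and an illustrative error message. The `env` parameter exists only so the behavior can be exercised without touching the real environment.

```python
import os
from typing import Mapping


def read_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Hypothetical helper: return the API key or raise if unset or empty."""
    api_key = env.get("AI_DEFENSE_API_KEY")
    # `not api_key` rejects both None (unset) and "" (set but empty),
    # whereas `api_key is None` would let "" through.
    if not api_key:
        raise ValueError("AI_DEFENSE_API_KEY environment variable not set")
    return api_key
```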



          Address review comments. Add configurable timeout and fail_open settings.

36a77a6

Pouyanpi reviewed


nemoguardrails/library/ai_defense/actions.py

    
                  Expects result to be a dict with:

                    - "is_blocked": a boolean indicating if the prompt or response sent to AI Defense should be blocked.

                  Returns:

Collaborator

Pouyanpi Oct 7, 2025

is it the intent?

# default to not blocked (safe/fail-open) if is_blocked is missing

then shouldn't we change the default value to False?

is_blocked = result.get("is_blocked", False)

Author

rucpande Oct 7, 2025

Cleaning this up to remove is_blocked and keep just a single value, and have it default to fail closed.

Pouyanpi reviewed


nemoguardrails/library/ai_defense/actions.py

    
                  else:

                      msg = "Either user_prompt or bot_response must be provided"

                      log.error(msg)

                      raise ValueError(msg)

Collaborator

Pouyanpi Oct 7, 2025

No timeout is configured. If the AI Defense API hangs for any reason, this will block indefinitely.

Author

rucpande Oct 7, 2025

You may have started your review before my most recent changes where I added a timeout (and a fail open config). Please let me know if that's not the case and I'm still missing something.
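
The timeout and fail-open behavior discussed in this thread can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `inspect_with_timeout` and `call_api` are hypothetical names, `call_api` stands in for the real HTTP request, and the 30 second default is arbitrary.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def inspect_with_timeout(
    call_api: Callable[[], Awaitable[dict[str, Any]]],
    timeout: float = 30.0,
    fail_open: bool = False,
) -> dict[str, Any]:
    """Hypothetical wrapper: bound the API call and apply the fail_open policy."""
    try:
        # asyncio.wait_for cancels the call and raises a timeout error once
        # `timeout` seconds have elapsed, so a hung API cannot block forever.
        return await asyncio.wait_for(call_api(), timeout=timeout)
    except Exception:
        if fail_open:
            # Fail open: allow content when the API call fails or times out.
            return {"is_safe": True}
        # Fail closed: re-raise so the content is not let through.
        raise
```

Defaulting `fail_open` to `False` keeps the guardrail strict unless an operator explicitly opts into availability over safety.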

Pouyanpi reviewed


nemoguardrails/library/ai_defense/actions.py

    
                  payload: Dict[str, Any] = {"messages": messages}

                  if metadata:

                      payload["metadata"] = metadata

Collaborator

Pouyanpi Oct 7, 2025

The code assumes data.get("is_safe") exists but doesn't validate the response structure. If the API returns an unexpected format, this could fail silently.

Author

rucpande Oct 7, 2025

Same as above, added those checks in my commit from yesterday.
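
One way such a response check could be written is sketched below. The helper name `parse_is_safe` and the error messages are hypothetical, and the only field assumed in the response is the `is_safe` key mentioned in this thread. The actual checks in the PR may differ.

```python
from typing import Any


def parse_is_safe(data: Any) -> bool:
    """Hypothetical validator: return is_safe, or raise on a malformed response."""
    if not isinstance(data, dict):
        raise ValueError(f"Expected a JSON object, got {type(data).__name__}")
    is_safe = data.get("is_safe")
    # Require a real boolean so a missing key (None) or a string such as
    # "false" is rejected instead of being coerced silently.
    if not isinstance(is_safe, bool):
        raise ValueError("Response is missing a boolean 'is_safe' field")
    return is_safe
```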

Pouyanpi reviewed


nemoguardrails/library/ai_defense/actions.py

    
                          if fail_open:

                              # Fail open: allow content when API call fails

                              log.warning(

                                  "AI Defense API call failed, but fail_open=True, allowing content"

Collaborator

Pouyanpi Oct 7, 2025

Both is_blocked and is_safe are returned, which is redundant (one is just the negation of the other). Are you expecting these to evolve differently?

Author

rucpande Oct 7, 2025

Will clean it up so it only uses is_safe.

Pouyanpi reviewed


tests/test_ai_defense.py

    
                  # Action error should be handled by the runtime and surface as a generic error message

                  chat >> "Hello"

                  chat << "I'm sorry, an internal error has occurred."

Collaborator

Pouyanpi Oct 7, 2025

Nice test 👍🏻 I suggest adding a similar test for Colang 2.0.

I expect we will see some issues that are not necessarily related to your PR, but we might be able to resolve them.

You can see some examples of Colang 2.0 configs:
