Commit abb8bd0

added disclaimer
1 parent 9d8a931 commit abb8bd0

1 file changed: +8 -4 lines changed

docs/guardrails/prompt-injections.md

Lines changed: 8 additions & 4 deletions
@@ -26,18 +26,22 @@ Guardrails provides the functions `prompt_injection` and `unicode` to detect and
 ## prompt_injection <span class="detector-badge"/>
 ```python
 def prompt_injection(
-    data: str | list[str],
-    config: dict | None = None
+    data: str | list[str]
 ) -> bool
 ```
-Detects if a given piece of text contains a prompt injection attempt.
+Attempts to detect whether a given piece of text contains a prompt injection attempt, using a [classifier model](https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2).
+
+!!! danger "Important Disclaimer on Prompt Injection Detectors"
+
+    Classifier-based prompt injection detection is only a heuristic, and relying solely on the classifier is not sufficient to prevent the security vulnerabilities in your agent system.
+
+    Instead, please consider applying [data flow controls](./dataflow-rules.md) and precise [tool call scoping](./tool-calls.md), to secure your agent, even in the presence of potentially adversarial inputs. Classifier-based detectors can never be trusted to be 100% accurate, and should only be used as a first line of defense.
 
 **Parameters**
 
 | Name | Type | Description |
 |-------------|--------|----------------------------------------|
 | `data` | `str | list[str]` | A single message or a list of messages to detect prompt injections in. |
-| `entities` | `dict | None` | A list of [PII entity types](https://microsoft.github.io/presidio/supported_entities/) to detect. Defaults to detecting all types. |
 
 **Returns**
 
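The disclaimer added in this commit recommends treating a classifier only as a first line of defense and pairing it with tool call scoping. A minimal sketch of that layering is shown below. It does not use the Guardrails library or the linked classifier model: `toy_prompt_injection` is a hypothetical keyword check that only mimics the documented `(str | list[str]) -> bool` shape, and `ALLOWED_AFTER_UNTRUSTED_INPUT` and `may_call_tool` are illustrative names invented for this example.

```python
from typing import Callable, List, Union

Messages = Union[str, List[str]]


def toy_prompt_injection(data: Messages) -> bool:
    """Hypothetical stand-in for a detector; a keyword check, not a real classifier."""
    messages = [data] if isinstance(data, str) else data
    return any("ignore previous instructions" in m.lower() for m in messages)


# Hypothetical scoping rule: the only tools permitted once untrusted text
# has entered the agent's context.
ALLOWED_AFTER_UNTRUSTED_INPUT = {"search_docs"}


def may_call_tool(
    tool: str,
    untrusted: Messages,
    detector: Callable[[Messages], bool] = toy_prompt_injection,
) -> bool:
    # First line of defense: the heuristic detector rejects flagged input.
    if detector(untrusted):
        return False
    # Second line: the allow-list still applies when the detector misses an attack.
    return tool in ALLOWED_AFTER_UNTRUSTED_INPUT
```

In this sketch, a sensitive tool such as a hypothetical `send_email` stays blocked even when the detector reports nothing suspicious, which is the property the disclaimer asks for.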