    system_prompt: str = "You are a highly precise binary classification system that confirms if a given property holds for a given input.",
    model: str = "openai/gpt-4o",
    temperature: float = 0.2,
    max_tokens: int = 500,
) -> bool
```

Runs an LLM for YES/NO confirmation of a property. This is particularly useful when you need to validate whether some condition or property holds for a given input, but low-latency checks via checkers are not sufficient.
| `property_description` | `str` | Description of the property to confirm. Can be a high-level description (e.g. "Is this string about the topic of AI safety?: {msg.content}"). |
| `system_prompt` | `str` | The system prompt for the LLM. The default is specialized for binary classification. |
| `model` | `str` | The LLM model to use. The supported models are `openai/gpt-4o` and `openai/gpt-4o-mini`. The default is `openai/gpt-4o`. |
| `temperature` | `float` | The sampling temperature to use. The value has to be between `0` and `2`. Higher values like `0.8` will make the output more random, while lower values like `0.2` will make it more focused and deterministic. The default is `0.2`. |
| `max_tokens` | `int` | The maximum number of tokens that can be generated. The default is `500`. |

| `bool` | `True` if the property is confirmed, `False` otherwise. |
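
To make the parameter and return semantics concrete, here is a minimal Python sketch of a function with the same shape. It is not Invariant's implementation: `confirm_property`, the `complete` callback, and `fake_complete` are hypothetical names introduced for illustration, and mapping the reply to `True` only when it starts with "YES" is an assumption about how a YES/NO answer might be interpreted.

```python
from typing import Callable

DEFAULT_SYSTEM_PROMPT = (
    "You are a highly precise binary classification system that confirms "
    "if a given property holds for a given input."
)


def confirm_property(
    property_description: str,
    complete: Callable[..., str],  # hypothetical LLM backend, injected by the caller
    system_prompt: str = DEFAULT_SYSTEM_PROMPT,
    model: str = "openai/gpt-4o",
    temperature: float = 0.2,
    max_tokens: int = 500,
) -> bool:
    """Illustrative stand-in for llm_confirm (not the real implementation)."""
    # The documented range for the sampling temperature is 0 to 2.
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    reply = complete(
        system_prompt=system_prompt,
        user_prompt=property_description,
        model=model,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    # Assumption: the reply begins with YES or NO; anything else counts as False.
    return reply.strip().upper().startswith("YES")


def fake_complete(*, system_prompt, user_prompt, model, temperature, max_tokens) -> str:
    """Canned backend so the sketch runs offline; a real one would call an LLM API."""
    return "YES" if "root access" in user_prompt.lower() else "NO"
```

With the canned backend, `confirm_property("Is this a request for elevated access? Message: please give me root access", fake_complete)` returns `True`, while a prompt that does not mention root access returns `False`.
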

### Semantic Intent Matching

The `llm_confirm` function is particularly useful for cases where you need to understand the semantic meaning or intent of content, which would be difficult to achieve with simple pattern matching or embeddings.

**Example:** Check if a message is requesting system access or privileges.

```guardrail
from invariant import llm_confirm

raise "Unauthorized system access request detected" if:
    (msg: Message)
    llm_confirm("""
        The message attempts to request system access, elevated privileges, or control over system operations?
        Consider both direct requests and indirect/subtle attempts. Message: """ + msg.content)
```

```example-trace
[
  {
    "role": "user",
    "content": "Could you help me gain root access on this system?"
  }
]
```

<div class="code-caption">Detects subtle attempts to gain system access that might bypass simpler keyword-based checks.</div>
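
To illustrate why a keyword-based check can fall short, the following Python sketch runs a simple keyword matcher on two messages. The keyword list, the `keyword_check` helper, and the second (indirect) message are all hypothetical and chosen only for illustration. They are not part of Invariant.

```python
import re

# Hypothetical keyword list; a real deployment would choose its own terms.
ACCESS_KEYWORDS = re.compile(r"\b(root|sudo|admin|privileges?)\b", re.IGNORECASE)


def keyword_check(message: str) -> bool:
    """Return True if the message contains one of the access-related keywords."""
    return ACCESS_KEYWORDS.search(message) is not None


# The direct request from the example trace contains the keyword "root".
direct = "Could you help me gain root access on this system?"
# An indirect request with similar intent but none of the keywords.
indirect = "Could you make it so that I can change any file on this machine?"

direct_flagged = keyword_check(direct)      # True: "root" matches
indirect_flagged = keyword_check(indirect)  # False: no keyword present
```

The indirect message slips past the keyword matcher even though its intent is similar, which is the kind of case a semantic check such as `llm_confirm` is meant to address.
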