Improve security #396

Open
s-kostyaev wants to merge 29 commits into main from improve-security

@s-kostyaev
Owner

Make autonomous agent execution safer by adding safeguards.

Introduce a new `ellama-tools-use-srt` flag to enable running shell‑based tools
inside a sandbox runtime. Add configuration variables for the sandbox program
and its arguments, and implement helper functions `ellama-tools--command-argv`
and `ellama-tools--call-command-to-string` that wrap commands with `srt` when
enabled. Update `ellama-tools-shell-command-tool`, `ellama-tools-grep-tool`, and
`ellama-tools-grep-in-file-tool` to use the new helpers, ensuring consistent
sandbox behavior. Add comprehensive tests for the wrapper logic, error handling
when the sandbox is missing, and the updated grep functions. Also adjust test
setup to load the local `ellama-tools.el`.
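
The wrapping step described above can be pictured roughly as follows. This is only a minimal sketch, not the PR's code: the `demo-` variables and function are hypothetical stand-ins, and it assumes the sandbox is invoked as the sandbox program, then its arguments, then the original command.

```elisp
;; Hypothetical stand-ins for the real configuration variables.
(defvar demo-use-srt nil
  "Non-nil means wrap commands with the sandbox program.")
(defvar demo-srt-program "srt"
  "Assumed name of the sandbox executable.")
(defvar demo-srt-args nil
  "Assumed list of extra arguments passed to the sandbox.")

(defun demo-command-argv (program &rest args)
  "Return the argv list for PROGRAM and ARGS, wrapped when sandboxing is on.
Signal a `user-error' if sandboxing is on but the sandbox is missing."
  (if (not demo-use-srt)
      (cons program args)
    (unless (executable-find demo-srt-program)
      (user-error "Sandbox program not found: %s" demo-srt-program))
    ;; Sandbox first, then its arguments, then the original command.
    (append (list demo-srt-program) demo-srt-args (cons program args))))
```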

The `apply_patch` tool and its implementation were removed. Corresponding tests
were deleted, and the tools list in the coder role was updated to no longer
include `apply_patch`. This cleans up unused functionality and simplifies the
toolset.

Implemented local SRT filesystem policy parsing and enforcement for ellama
tools, including caching and path normalization helpers. Updated README.org to
document new srt settings behavior and error handling. Extended the Makefile
with `test-srt-integration`, Docker build, and Linux integration targets. Added
a comprehensive test suite `tests/test-ellama-tools-srt-integration.el` to
verify parity between local checks and the real `srt` runtime, and updated
existing tests to cover policy defaults, rule resolution, and error scenarios.

Implemented new helper functions `ellama-tools--srt-normalize-rule-literal-path`
and `ellama-tools--srt-normalize-nonexisting-target-path` to better handle
symlinked rules and non‑existent write targets. Updated target normalization
logic to use these helpers and improved directory‑prefix matching. Added an
extensive test suite covering:
- Nested write with mkdir
- Move operations (allowed, deny‑read/write, cross‑directory cases)
- Directory‑prefix edge cases
- Tilde expansion parity
- Symlink file and directory parity
- Write existing/nested targets with allow/deny rules
- Move‑file tool behavior with nested destinations
- Missing parent segment handling
- Darwin symlink rule parity

These changes enhance policy consistency, reduce drift, and provide robust
coverage of SRT integration scenarios.
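
To illustrate the directory‑prefix edge case mentioned in the list above: a plain string‑prefix test would wrongly treat `/tmp/workspace` as lying inside `/tmp/work`. The sketch below shows one way to avoid that. The function name is hypothetical, and it deliberately ignores symlinks, which the PR's helpers handle separately.

```elisp
(defun demo-path-under-p (path dir)
  "Return non-nil if PATH is DIR itself or lies below DIR.
Both are expanded first, so a leading tilde is resolved.  Symlinks are
not followed here; that would need `file-truename'."
  (let ((path (directory-file-name (expand-file-name path)))
        (dir (directory-file-name (expand-file-name dir))))
    (or (string= path dir)
        ;; Compare against DIR with a trailing slash so that
        ;; \"/tmp/work\" does not match \"/tmp/workspace\".
        (string-prefix-p (file-name-as-directory dir) path))))
```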

…ptions

Implemented a comprehensive SRT filesystem policy section in README.org,
detailing how non‑shell file tools perform local checks derived from the srt
settings file, supported rules, path matching, and failure conditions. Added
examples for srt config and Emacs setup, and introduced parity test references.
Updated ellama.info to include a new “SRT Filesystem Policy for Tools” node,
expanded the `ellama-tools-use-srt` description to mention local file checks, and
clarified `ellama-tools-srt-args` usage. Adjusted the tag table to reflect new
node positions and updated indices.

… user‑error

Updated ellama-tools to return descriptive error strings when SRT policies deny
access. Adjusted all file‑handling functions to propagate these messages via
`or` expressions. Updated documentation and tests to expect string output and
verify policy details. This improves error handling for tool callers and aligns
with new test expectations.

Configure cl-lib loading and load-path modifications for build and manual
targets to prevent conflicts with installed org packages.

Plan a comprehensive Data Loss Prevention (DLP) layer to scan and enforce policy on
ellama tool inputs and outputs, supporting regex patterns and exact secret
detection from environment variables. Add configurable monitor/enforce modes
with sanitization logging, output redaction, and blocking capabilities. Include
complete implementation plan, requirements specification, rollout guide, and
structured arguments handling plan.

Prevent recursion into closures by detecting function values during incident
sanitization. When a value is a function, return the symbol 'function or
'compiled-function to avoid copy-tree errors on Emacs 28. Added explanatory
comments to clarify the change.
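
A rough sketch of the idea, under the assumption that sanitization walks a nested list structure; `demo-sanitize` is a hypothetical name and the real implementation may differ in which predicates it uses:

```elisp
(defun demo-sanitize (value)
  "Return a copy of VALUE with function objects replaced by marker symbols."
  (cond
   ((byte-code-function-p value) 'compiled-function)
   ;; Skip symbols: `functionp' is also non-nil for names like `car'.
   ((and (functionp value) (not (symbolp value))) 'function)
   ;; Only plain conses are walked; closures never reach this branch.
   ((consp value)
    (cons (demo-sanitize (car value))
          (demo-sanitize (cdr value))))
   (t value)))
```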

Introduced default regex rules that detect sensitive environment variable
references (TOKEN, SECRET, KEY, PASS, PWD, AUTH, COOKIE, CRED, SESSION) in shell
commands. These rules automatically block or warn based on policy configuration.
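
As an illustration of what such a rule might look like, here is a minimal sketch. The pattern and names below are assumptions made for this example, not the rules shipped in the PR; it matches `$NAME` or `${NAME}` where `NAME` contains one of the listed keywords.

```elisp
(defconst demo-secret-env-regexp
  (concat "\\$[{]?[A-Za-z0-9_]*"
          (regexp-opt '("TOKEN" "SECRET" "KEY" "PASS" "PWD"
                        "AUTH" "COOKIE" "CRED" "SESSION"))
          "[A-Za-z0-9_]*[}]?")
  "Hypothetical pattern for shell references to sensitive-looking variables.")

(defun demo-secret-env-ref-p (command)
  "Return non-nil if shell COMMAND references a sensitive-looking variable."
  ;; Match case-sensitively: the keywords are upper-case by convention.
  (let ((case-fold-search nil))
    (and (string-match-p demo-secret-env-regexp command) t)))
```

With this pattern, a command mentioning `$API_TOKEN` is flagged while one that only uses `$HOME` is not. Note that a keyword such as `PWD` also matches the ordinary `$PWD` variable, which is one reason a warn mode is useful alongside blocking.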

Modified `ellama-tools-define-tool` to replace existing tools by name rather
than allowing duplicates. Added helper functions `ellama-tools--tool-name=` and
`ellama-tools--remove-tool-by-name` to support this behavior.
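
The replace‑by‑name behavior can be sketched like this. The registry shape (a list of plists with a `:name` key) and the `demo-` names are assumptions made for the example; the real tool objects may be structured differently.

```elisp
(require 'cl-lib)

(defvar demo-tools nil
  "Hypothetical registry: a list of plists, each with a :name string.")

(defun demo-define-tool (tool)
  "Register TOOL, replacing any existing entry with the same :name."
  (setq demo-tools
        (cons tool
              ;; Drop any earlier definition before adding the new one.
              (cl-remove-if
               (lambda (old)
                 (string= (plist-get old :name) (plist-get tool :name)))
               demo-tools))))
```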

Added comprehensive tests for DLP blocking behavior including shell environment
secret references, HTTP secret parameter references, and tool replacement
functionality.

Implemented a comprehensive set of regex rules for detecting prompt‑injection
patterns and integrated them into the DLP engine. Added a new customizable
variable `ellama-tools-dlp-output-warn-behavior` that controls how `warn`
verdicts are treated (allow, confirm, or block). New helper functions
provide confirmation prompts and redaction of tool output. Updated policy
logic to block prompt‑injection findings in tool output by default,
while still respecting explicit overrides. Extended the existing regex
rule set, refactored rule registration, and added extensive tests for
the new behavior and edge cases.

Add the ability to preview tool output with sensitive findings highlighted before
deciding whether to allow, redact, or block. Includes new
`ellama-tools--dlp-view-output-warning` function for displaying warnings with
highlighted findings, and updated `ellama-tools--dlp-output-warn-choice` to
support the (v)iew option.

Implemented a dedicated log buffer for Ellama tool calls, adding
`ellama-tools--call-log-buffer-name` and a helper `ellama-tools--log-call` that
records call status, function name, and arguments. Updated
`ellama-tools--confirm-call` to log auto‑accepted calls and to record user
decisions (accepted, rejected) after confirmation prompts. Added test utilities
to clear and read the log buffer, and new unit tests verifying that accepted,
auto‑accepted, and rejected calls are correctly logged.
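
A minimal sketch of this kind of call logging, assuming one line per call; the buffer name, line format, and `demo-` names are illustrative only and not taken from the PR:

```elisp
(defvar demo-call-log-buffer-name "*demo-tool-calls*"
  "Hypothetical name of the buffer that collects tool-call records.")

(defun demo-log-call (status function args)
  "Append a line with STATUS, FUNCTION and ARGS to the log buffer."
  (with-current-buffer (get-buffer-create demo-call-log-buffer-name)
    (goto-char (point-max))
    ;; %S prints ARGS in readable Lisp syntax, strings included.
    (insert (format "[%s] %s %S\n" status function args))))
```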

Removed redundant session buffer fallback logic from the `ellama-chat` function.
Added two new tests to verify that streaming defaults to the current buffer with
an active session and that chat writes to the session buffer correctly. Fixes #398

Implemented a new line budget system for tool outputs to prevent context
overflow and guide agents when truncation occurs. Added four configuration
variables (enabled, max-lines, max-line-length, save-overflow-file) and helper
functions to detect output source context. The system truncates excessive lines,
marks long lines with truncation notices, and saves full output to temp files
when source is unknown. Updated documentation in README.org and
dlp_rollout_guide.md with new settings and behavior descriptions. Added five new
test cases covering overflow file saving, source file detection, long line
truncation, and DLP scan precedence.
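
The truncation part of the line budget could look roughly like the sketch below. It is a simplified illustration with hypothetical names and notice texts; it leaves out the overflow‑file saving and source detection described above.

```elisp
(require 'seq)

(defun demo-apply-line-budget (text max-lines max-line-length)
  "Truncate TEXT to MAX-LINES lines of at most MAX-LINE-LENGTH characters.
Long lines and dropped lines are marked with short truncation notices."
  (let* ((lines (split-string text "\n"))
         (kept (seq-take lines max-lines))
         (dropped (- (length lines) (length kept)))
         (clipped
          (mapcar (lambda (line)
                    (if (> (length line) max-line-length)
                        (concat (substring line 0 max-line-length)
                                "...[line truncated]")
                      line))
                  kept)))
    (concat (mapconcat #'identity clipped "\n")
            (if (> dropped 0)
                (format "\n...[%d more lines truncated]" dropped)
              ""))))
```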

Added a new `test-integration` target for running integration tests and
introduced a `checkdocs` target to validate docstrings and style documentation
using the checkdoc package. Updated the AGENTS.md checklist to reflect the new
workflow steps.