fix(security): run unicode normalization before secret redaction#1178
Conversation
Site previewPreview: https://3c1015ff-site.fullsend-ai.workers.dev Commit: |
ReviewFindingsLow
Previous runReviewFindingsMedium
Low
Previous run (2)ReviewFindingsNo findings. The core fix correctly swaps hook ordering in Additional hardening — expanded zero-width regex (aligned between Go and Python), ECMA-48 compliant ANSI CSI regex, OSC escape handling, supplementary variation selector stripping, and post-NFKC escape re-scanning — is well-scoped and strengthens the normalizer against adjacent evasion vectors. Tests are thorough: Previous runReviewFindingsLow
Info
Previous run (3)ReviewFindingsNo findings. The core fix correctly swaps hook ordering in Additional hardening — expanded zero-width regex (aligned between Go and Python), ECMA-48 compliant ANSI CSI regex, OSC escape handling, supplementary variation selector stripping, and post-NFKC escape re-scanning — is well-scoped and strengthens the normalizer against adjacent evasion vectors. Tests are thorough: Previous run (4)ReviewFindingsLow
Info
|
ralphbean
left a comment
There was a problem hiding this comment.
LGTM. One minor note inline.
waynesun09
left a comment
There was a problem hiding this comment.
10-Agent Review Squad — #1178
Agents dispatched: 10 (3x claude-coder, 3x claude-researcher, 2x gemini-code-review, 2x cursor-code-review)
Verified findings: 8 MEDIUM+ (1 CRITICAL, 3 HIGH, 4 MEDIUM) — 4 false positives removed after code verification
Summary
The core reordering fix is correct and well-tested across all three execution layers (Go pipeline, sandbox hooks, CLI scanning). The hook ordering invariant is well-documented and the assert.Less index-comparison test pattern is robust. All direct NewSecretRedactor() callers have been migrated to OutputPipeline().
However, the zero-width character coverage has gaps (U+200E/F, U+180E, U+206A-206F) that allow the same class of attack to succeed with different invisible characters — this should be addressed before merge.
Additional finding not in diff
MEDIUM — post-comment and post-review commands skip output sanitization entirely
internal/cli/postcomment.go and internal/cli/postreview.go read agent output and post directly to the GitHub API without calling OutputPipeline(). While output files should have been scanned during fullsend run, if these commands are invoked standalone (as documented in their --help), they could post unsanitized content containing leaked secrets or zero-width obfuscated tokens. Pre-existing gap, but issue #444's scope explicitly calls for auditing all output paths.
Suggestion: File a follow-up issue to add OutputPipeline().Scan() to these commands before they post to the forge API.
Findings by severity
| Severity | Count | Key Theme |
|---|---|---|
| CRITICAL | 1 | Incomplete invisible char coverage (U+200E/F, U+180E, U+206A-206F bypass) |
| HIGH | 3 | Missing empty-Sanitized guard in run.go, Go/Python ANSI regex parity, missing post-NFKC rescan |
| MEDIUM | 4 | Stale doc/log in run.go, unscanned post-comment/post-review paths, test stderr discarded, supplementary variation selectors |
See inline comments for details and suggested fixes.
|
Addressed review feedback in 61b7351 + 674340d: Hook ordering (original #444 fix) — unchanged; Review follow-ups:
Deferred (follow-up): CI green on latest push. Ready for re-review. |
|
Replied to and resolved all inline threads addressed in 61b7351 / 674340d. Deferred to follow-up issues (from 10-agent review — out of scope for this PR):
These commands can post unsanitized content when invoked standalone outside |
waynesun09
left a comment
There was a problem hiding this comment.
8-Agent Review Squad — 3 verified MEDIUM findings
Agents: 2x claude-coder, 2x claude-researcher, 2x gemini-code-review, 2x cursor-code-review
The core security fix is correct — hook reordering closes the zero-width character bypass (#444), expanded Unicode coverage is comprehensive, and Go/Python regex parity is aligned. CI is green. No merge-blocking issues.
After verification, 7 false positives were removed (pre-existing issues outside PR scope, misidentified Python as Go, etc.). 3 MEDIUM findings remain as inline comments:
- Empty-string
Sanitizedbypass —Pipeline.Scanconflates""(sanitized-to-empty) with""(no-changes), and CLI fallback writes original text back. Low practical risk but a semantic bug. (6/8 agents agreed) stripTerminalEscapesdouble-scan —FindAllString+ReplaceAllStringruns each regex twice; single-pass withReplaceAllStringFunchalves the work on the hot path. (2/8 agents agreed)- Missing wrong-order negative test — No Go test proves
NewPipeline(SecretRedactor, UnicodeNormalizer)(wrong order) fails to catch obfuscated tokens. (2/8 agents agreed)
674340d to
4eefa31
Compare
|
Rebased onto latest |
4eefa31 to
3ac5b2c
Compare
waynesun09
left a comment
There was a problem hiding this comment.
Multi-agent review (4 agents: claude-coder, claude-researcher, gemini-code-review, cursor-code-review). 8 verified findings after dedup and false-positive removal. See inline comments for MEDIUM+ issues.
Zero-width characters interleaved in token-shaped strings bypassed secret regexes when redaction ran before unicode stripping in sandbox post-tool hooks. Reorder hooks to normalize first, extend OutputPipeline and output scanning paths, and add regression tests. Fixes fullsend-ai#444 Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Barak Korren <bkorren@redhat.com>
Expand invisible-character stripping (U+200E/F, U+180E, etc.), align Go ANSI/OSC handling with Python including post-NFKC rescan, add empty Sanitized guards and consistent scan output logging. Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Barak Korren <bkorren@redhat.com>
Only fall back to original text when the pipeline reports no findings, use single-pass terminal escape stripping, and add wrong-order pipeline test. Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Barak Korren <bkorren@redhat.com>
Remove dead Sanitized fallback guards, expand invisible-character coverage (U+061C, line/paragraph separators, Cf sweep), generalize ST-terminated escape stripping, and add wrong-order/fullwidth chain tests. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
3ac5b2c to
a4f164e
Compare
|
Rebased onto latest |
waynesun09
left a comment
There was a problem hiding this comment.
All 13 review items verified as substantively addressed across 3 fix rounds. Line-by-line review of final diff confirms:
- Hook ordering invariant enforced and tested (unicode before secret_redact)
- Go/Python regex parity aligned (zero_width, ANSI, ST-terminated)
- Empty-Sanitized edge case handled correctly (dead-code guard removed)
- Post-NFKC escape rescan added to Go (parity with Python)
- Cf category sweep and supplementary VS stripping in place
- 6 new regression tests covering ZWNJ, LTR mark, ALM, fullwidth, and wrong-order scenarios
CI green across all checks.
Summary
unicode_posttool.pyruns beforesecret_redact_posttool.py, preventing zero-width characters from evading secret regexes and reconstructing tokens in LLM context.UnicodeNormalizertoOutputPipeline()and use it in host output scanning (scan output,scanOutputFiles).Fixes #444
Test plan
go test ./internal/security/... ./internal/cli/...python3 -m unittest internal/security/hooks/posttool_chain_test.pypython3 -m unittest internal/security/hooks/unicode_posttool_test.pyMade with Cursor