fix(security): run unicode normalization before secret redaction by ifireball · Pull Request #1178 · fullsend-ai/fullsend

ifireball · 2026-05-19T12:57:18Z

Summary

Reorder sandbox post-tool hooks so unicode_posttool.py runs before secret_redact_posttool.py, preventing zero-width characters from evading secret regexes and reconstructing tokens in LLM context.
Add UnicodeNormalizer to OutputPipeline() and use it in host output scanning (scan output, scanOutputFiles).
Add regression tests for hook ordering, Go pipeline behavior, and Python hook chaining.
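The ordering problem described above can be reproduced in a few lines. The following is a minimal sketch, not the project's code: the PAT pattern, the zero-width character class, and the helper names (`redactFirst`, `normalizeFirst`, `obfuscate`, `fakeToken`) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Hypothetical patterns; the repository's real regexes may differ.
var (
	// "ghp_" followed by 36 alphanumerics, shaped like a classic GitHub PAT.
	rePAT = regexp.MustCompile(`ghp_[A-Za-z0-9]{36}`)
	// A small subset of zero-width characters: ZWSP, ZWNJ, ZWJ, BOM.
	reZeroWidth = regexp.MustCompile(`[\x{200B}-\x{200D}\x{FEFF}]`)
)

// fakeToken is a made-up, token-shaped string used only for the demo.
var fakeToken = "ghp_" + strings.Repeat("0", 36)

func stripZeroWidth(s string) string { return reZeroWidth.ReplaceAllString(s, "") }
func redact(s string) string         { return rePAT.ReplaceAllString(s, "[REDACTED]") }

// obfuscate appends a zero-width non-joiner after every rune of s.
func obfuscate(s string) string {
	var b strings.Builder
	for _, r := range s {
		b.WriteRune(r)
		b.WriteRune('\u200c')
	}
	return b.String()
}

// redactFirst is the buggy order: the regex never sees a contiguous token,
// and the later strip step reassembles it.
func redactFirst(s string) string { return stripZeroWidth(redact(s)) }

// normalizeFirst is the fixed order: strip invisible characters, then redact.
func normalizeFirst(s string) string { return redact(stripZeroWidth(s)) }

func main() {
	in := obfuscate(fakeToken)
	fmt.Println("redact first:   ", redactFirst(in))
	fmt.Println("normalize first:", normalizeFirst(in))
}
```

With the redact-first order the token comes out intact, which is the leak this PR closes.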

Fixes #444

Test plan

go test ./internal/security/... ./internal/cli/...
python3 -m unittest internal/security/hooks/posttool_chain_test.py
python3 -m unittest internal/security/hooks/unicode_posttool_test.py

Made with Cursor

Zero-width characters interleaved in token-shaped strings bypassed secret regexes when redaction ran before unicode stripping in sandbox post-tool hooks. Reorder hooks to normalize first, extend OutputPipeline and output scanning paths, and add regression tests.

Fixes fullsend-ai#444

Co-authored-by: Cursor <[email protected]>

github-actions · 2026-05-19T12:59:07Z

Site preview

Preview: https://696ce045-site.fullsend-ai.workers.dev

Commit: 674340d95392b8adec233f8f8102c3eab33dcc34

fullsend-ai-review · 2026-05-19T13:05:16Z

Review

Findings

No findings.

The core fix correctly swaps hook ordering in hooks.go so unicode_posttool.py runs before secret_redact_posttool.py, preventing zero-width characters from evading prefix regexes. The Go-side OutputPipeline() is updated to include UnicodeNormalizer before SecretRedactor, aligning host-side output scanning with the sandbox hook ordering invariant.

Additional hardening — expanded zero-width regex (aligned between Go and Python), ECMA-48 compliant ANSI CSI regex, OSC escape handling, supplementary variation selector stripping, and post-NFKC escape re-scanning — is well-scoped and strengthens the normalizer against adjacent evasion vectors.

Tests are thorough: TestGenerateClaudeSettings_PostToolSanitizeHookOrder asserts the ordering invariant structurally, TestPipeline covers zero-width and LTR-mark obfuscated PATs through the Go pipeline, and posttool_chain_test.py validates the Python hook chain end-to-end including a negative test proving redaction alone misses obfuscated tokens.
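A structural ordering check of the kind the review mentions can be sketched as an index comparison. This is an illustration only: it assumes the post-tool hooks can be flattened to a list of script names, which may not match how hooks.go actually represents them, and `indexOf` and `normalizesBeforeRedaction` are hypothetical helpers.

```go
package main

import "fmt"

// indexOf returns the position of name in hooks, or -1 if it is absent.
func indexOf(hooks []string, name string) int {
	for i, h := range hooks {
		if h == name {
			return i
		}
	}
	return -1
}

// normalizesBeforeRedaction reports whether both hooks are present and the
// unicode hook comes strictly before the secret redaction hook.
func normalizesBeforeRedaction(hooks []string) bool {
	u := indexOf(hooks, "unicode_posttool.py")
	s := indexOf(hooks, "secret_redact_posttool.py")
	return u >= 0 && s >= 0 && u < s
}

func main() {
	good := []string{"unicode_posttool.py", "secret_redact_posttool.py"}
	bad := []string{"secret_redact_posttool.py", "unicode_posttool.py"}
	fmt.Println(normalizesBeforeRedaction(good), normalizesBeforeRedaction(bad))
}
```

Checking that both hooks exist before comparing indices keeps a missing hook from passing the check by accident.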

Previous run

Review

Findings

Low

[correctness] internal/cli/run.go:1382 — scanOutputFiles writes result.Sanitized to disk without the defensive empty-string fallback that was added to scan.go. Both call sites now use OutputPipeline(), but only scan.go guards against the theoretical edge case where result.Sanitized is empty despite findings existing (which would truncate the file to zero bytes). Current scanners always populate Sanitized when they report findings, so this is not exploitable today, but the inconsistency is worth aligning for robustness.
Remediation: Add the same fallback pattern used in scan.go — if result.Sanitized is empty, fall back to the original content before writing.

Info

[style] internal/cli/scan.go:163-167 — The Long description for scan output ("Reads text from stdin and scans for API keys, tokens, credentials, and sensitive patterns") no longer reflects the full pipeline behavior, which now includes unicode normalization before secret redaction.

ralphbean

LGTM. One minor note inline.

waynesun09

10-Agent Review Squad — #1178

Agents dispatched: 10 (3x claude-coder, 3x claude-researcher, 2x gemini-code-review, 2x cursor-code-review)
Verified findings: 8 MEDIUM+ (1 CRITICAL, 3 HIGH, 4 MEDIUM) — 4 false positives removed after code verification

Summary

The core reordering fix is correct and well-tested across all three execution layers (Go pipeline, sandbox hooks, CLI scanning). The hook ordering invariant is well-documented and the assert.Less index-comparison test pattern is robust. All direct NewSecretRedactor() callers have been migrated to OutputPipeline().

However, the zero-width character coverage has gaps (U+200E/F, U+180E, U+206A-206F) that allow the same class of attack to succeed with different invisible characters — this should be addressed before merge.

Additional finding not in diff

MEDIUM — post-comment and post-review commands skip output sanitization entirely

internal/cli/postcomment.go and internal/cli/postreview.go read agent output and post directly to the GitHub API without calling OutputPipeline(). While output files should have been scanned during fullsend run, if these commands are invoked standalone (as documented in their --help), they could post unsanitized content containing leaked secrets or zero-width obfuscated tokens. Pre-existing gap, but issue #444's scope explicitly calls for auditing all output paths.

Suggestion: File a follow-up issue to add OutputPipeline().Scan() to these commands before they post to the forge API.
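One possible shape for that follow-up, as a sketch under stated assumptions: `postSanitized` is a hypothetical helper, and the `scan` and `post` callbacks stand in for the real pipeline scan and the forge API call, whose signatures are not shown in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// postSanitized runs scan over body and hands only the sanitized text to
// post, so the raw agent output never reaches the forge API.
func postSanitized(body string, scan func(string) string, post func(string) error) error {
	return post(scan(body))
}

func main() {
	// Stand-in scanner: the real commands would call the output pipeline.
	scan := func(s string) string { return strings.ReplaceAll(s, "hunter2", "[REDACTED]") }
	// Stand-in poster: the real commands would call the forge API.
	post := func(s string) error { fmt.Println("posting:", s); return nil }

	if err := postSanitized("password is hunter2", scan, post); err != nil {
		fmt.Println("post failed:", err)
	}
}
```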

Findings by severity

Severity	Count	Key Theme
CRITICAL	1	Incomplete invisible char coverage (U+200E/F, U+180E, U+206A-206F bypass)
HIGH	3	Missing empty-Sanitized guard in run.go, Go/Python ANSI regex parity, missing post-NFKC rescan
MEDIUM	4	Stale doc/log in run.go, unscanned post-comment/post-review paths, test stderr discarded, supplementary variation selectors

See inline comments for details and suggested fixes.

Expand invisible-character stripping (U+200E/F, U+180E, etc.), align Go ANSI/OSC handling with Python including post-NFKC rescan, add empty Sanitized guards and consistent scan output logging.

Co-authored-by: Cursor <[email protected]>


ifireball · 2026-05-20T06:15:46Z

Addressed review feedback in 61b7351 + 674340d:

Hook ordering (original #444 fix) — unchanged; unicode_posttool still runs before secret_redact_posttool.

Review follow-ups:

Expanded invisible-character coverage in Go UnicodeNormalizer and Python unicode_posttool.py (U+200E/F, U+180E, U+034F, U+206A–206F, U+FFF9–FFFB, supplementary VS U+E0100–E01EF)
Aligned Go ANSI/OSC regex with Python; added post-NFKC escape rescan in Go
Added empty-Sanitized guard in scanOutputFiles; aligned doc comments and log messages with scan output
Chain tests: LTR-mark obfuscation case, stderr in hook test errors
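The expanded coverage listed above can be expressed as a single character class. The sketch below is an assumption-laden illustration, not the code in UnicodeNormalizer: it combines the code points named in this comment with a baseline of U+200B to U+200D, U+2060, and U+FEFF that is assumed here, and `reInvisible` and `stripInvisible` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// reInvisible covers the ranges named in the review follow-up plus an
// assumed baseline (U+200B-U+200D, U+2060, U+FEFF).
var reInvisible = regexp.MustCompile(
	`[\x{034F}\x{180E}\x{200B}-\x{200F}\x{2060}\x{206A}-\x{206F}` +
		`\x{FEFF}\x{FFF9}-\x{FFFB}\x{E0100}-\x{E01EF}]`)

// stripInvisible removes every matching code point in one pass and
// returns the cleaned string together with the number of removals.
func stripInvisible(s string) (string, int) {
	n := 0
	out := reInvisible.ReplaceAllStringFunc(s, func(string) string {
		n++
		return ""
	})
	return out, n
}

func main() {
	// LTR mark, Mongolian vowel separator, and a supplementary variation selector.
	out, n := stripInvisible("gh\u200ep_\u180eabc\U000E0100")
	fmt.Printf("%q %d\n", out, n)
}
```

Keeping the Go and Python classes generated from one shared list of ranges would be one way to avoid the parity drift the reviewers flagged.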

Deferred (follow-up): post-comment / post-review standalone paths still skip OutputPipeline() — pre-existing gap, suggest separate issue.

CI green on latest push. Ready for re-review.

ifireball · 2026-05-20T06:39:30Z

Replied to and resolved all inline threads addressed in 61b7351 / 674340d.

Deferred to follow-up issues (from 10-agent review — out of scope for this PR):

security: run OutputPipeline on post-comment before forge API post (#1229)
security: run OutputPipeline on post-review before forge API post (#1230)

These commands can post unsanitized content when invoked standalone outside fullsend run.

waynesun09

8-Agent Review Squad — 3 verified MEDIUM findings

Agents: 2x claude-coder, 2x claude-researcher, 2x gemini-code-review, 2x cursor-code-review

The core security fix is correct — hook reordering closes the zero-width character bypass (#444), expanded Unicode coverage is comprehensive, and Go/Python regex parity is aligned. CI is green. No merge-blocking issues.

After verification, 7 false positives were removed (pre-existing issues outside PR scope, misidentified Python as Go, etc.). 3 MEDIUM findings remain as inline comments:

Empty-string Sanitized bypass — Pipeline.Scan conflates "" (sanitized-to-empty) with "" (no-changes), and CLI fallback writes original text back. Low practical risk but a semantic bug. (6/8 agents agreed)
stripTerminalEscapes double-scan — FindAllString + ReplaceAllString runs each regex twice; single-pass with ReplaceAllStringFunc halves the work on the hot path. (2/8 agents agreed)
Missing wrong-order negative test — No Go test proves NewPipeline(SecretRedactor, UnicodeNormalizer) (wrong order) fails to catch obfuscated tokens. (2/8 agents agreed)

waynesun09 · 2026-05-21T20:24:06Z

 			}
-			if writeErr := os.WriteFile(path, []byte(result.Sanitized), 0o644); writeErr != nil {
-				printer.StepWarn(fmt.Sprintf("Could not write redacted %s: %v", relPath, writeErr))
+			out := result.Sanitized
+			if out == "" {
+				out = text


[MEDIUM] Empty-string fallback can bypass sanitization (6/8 review agents flagged this)

When UnicodeNormalizer strips ALL characters from input (e.g., text consisting entirely of zero-width chars), it sets Sanitized = "". Pipeline.Scan interprets Sanitized == "" as "no changes" (scanner.go:69) and skips updating current, so the original text passes through unsanitized. This fallback then writes the original unsanitized text back.

The practical security impact is low (requires input with ONLY invisible characters — no secrets to leak), but it's a semantic bug in the Pipeline abstraction that this PR makes newly reachable through OutputPipeline.

Suggestion: Guard on findings count:

    out := result.Sanitized
    if out == "" && len(result.Findings) > 0 {
        // Pipeline stripped everything; don't revert to original
        printer.StepWarn(fmt.Sprintf("Sanitized output empty despite %d finding(s) in %s", len(result.Findings), relPath))
    }
    if out == "" && len(result.Findings) == 0 {
        out = text
    }

Or better, add a Modified bool field to ScanResult so Pipeline.Scan can distinguish "no changes" from "sanitized to empty string".
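The Modified-flag alternative could look roughly like this. It is a sketch only: the `ScanResult` fields beyond `Sanitized` and `Findings`, the `Scanner` function type, and `runPipeline` are assumptions and do not reflect the actual definitions in scanner.go.

```go
package main

import (
	"fmt"
	"strings"
)

// ScanResult carries an explicit Modified flag so that an empty Sanitized
// string can mean "sanitized to nothing" rather than "no changes".
type ScanResult struct {
	Sanitized string
	Modified  bool
	Findings  []string
}

// Scanner is a hypothetical single pipeline stage.
type Scanner func(text string) ScanResult

// runPipeline feeds each stage the previous stage's output and only
// replaces the current text when a stage reports a modification.
func runPipeline(text string, scanners ...Scanner) ScanResult {
	out := ScanResult{Sanitized: text}
	for _, scan := range scanners {
		r := scan(out.Sanitized)
		out.Findings = append(out.Findings, r.Findings...)
		if r.Modified {
			out.Sanitized = r.Sanitized
			out.Modified = true
		}
	}
	return out
}

// stripZWSP is a toy stage that removes zero-width spaces.
func stripZWSP(text string) ScanResult {
	clean := strings.ReplaceAll(text, "\u200b", "")
	if clean == text {
		return ScanResult{Sanitized: text}
	}
	return ScanResult{Sanitized: clean, Modified: true, Findings: []string{"zero_width"}}
}

func main() {
	r := runPipeline("\u200b\u200b", stripZWSP)
	fmt.Printf("%q modified=%v findings=%v\n", r.Sanitized, r.Modified, r.Findings)
}
```

With this shape the caller can write `result.Sanitized` unconditionally, and the empty-string fallback in the CLI becomes unnecessary.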

waynesun09 · 2026-05-21T20:24:09Z

+	if matches := reANSI.FindAllString(current, -1); len(matches) > 0 {
+		ansiCount = len(matches)
+		current = reANSI.ReplaceAllString(current, "")
+	}
+	if matches := reOSC.FindAllString(current, -1); len(matches) > 0 {
+		oscCount = len(matches)
+		current = reOSC.ReplaceAllString(current, "")
+	}


[MEDIUM] Double regex scan — FindAllString then ReplaceAllString (2/8 agents flagged)

Each regex is executed twice over the text: once to count matches, once to remove them. Since stripTerminalEscapes is called up to twice per Scan() invocation (pre- and post-NFKC), and runs on every Bash/Read/WebFetch result in the sandbox, this is 4-8 regex passes where 2-4 would suffice.

Suggestion: Single-pass with ReplaceAllStringFunc:

    func stripTerminalEscapes(text string) (string, int, int) {
        ansiCount := 0
        current := reANSI.ReplaceAllStringFunc(text, func(string) string { ansiCount++; return "" })
        oscCount := 0
        current = reOSC.ReplaceAllStringFunc(current, func(string) string { oscCount++; return "" })
        return current, ansiCount, oscCount
    }

waynesun09 · 2026-05-21T20:24:10Z

+
 	t.Run("clean text passes both", func(t *testing.T) {
 		p := InputPipeline()
 		r := p.Scan("Normal commit message fixing a null pointer bug.")


[MEDIUM] Missing wrong-order negative test (2/8 agents flagged)

These tests prove the correct pipeline order catches obfuscated tokens, but there's no test proving the wrong order (NewPipeline(NewSecretRedactor(), NewUnicodeNormalizer())) fails to catch them. A wrong-order test directly validates the ordering invariant documented in hooks.go:125-127 and would catch regressions if someone accidentally swaps the pipeline composition in OutputPipeline().

Suggestion:

    t.Run("wrong order leaks zero-width obfuscated PAT", func(t *testing.T) {
        p := NewPipeline(NewSecretRedactor(), NewUnicodeNormalizer())
        plain := "ghp_FAKEtesttoken000000000000000000000000"
        var obfuscated strings.Builder
        for _, r := range plain {
            obfuscated.WriteRune(r)
            obfuscated.WriteRune('\u200c') // zero-width non-joiner, written as an escape so it is visible
        }
        r := p.Scan(obfuscated.String())
        // Redactor runs first, sees obfuscated token, misses it
        assert.True(t, hasFinding(r, "zero_width"))
        assert.False(t, hasFinding(r, "github_pat"), "wrong order must NOT catch the obfuscated token")
    })

ralphbean

LGTM.

ifireball requested review from maruiz93, ralphbean, rh-hemartin and waynesun09 and removed request for waynesun09 May 19, 2026 12:57

ifireball self-assigned this May 19, 2026

ifireball marked this pull request as ready for review May 19, 2026 12:58

github-actions Bot deployed to site-preview May 19, 2026 12:59 View deployment

rh-hemartin approved these changes May 19, 2026


ralphbean approved these changes May 19, 2026


Comment thread internal/cli/run.go Outdated

waynesun09 requested changes May 19, 2026


Comment thread internal/security/scanner.go

Comment thread internal/security/scanner.go

Comment thread internal/cli/run.go Outdated

Comment thread internal/cli/run.go

Comment thread internal/security/hooks/posttool_chain_test.py Outdated

github-actions Bot deployed to site-preview May 20, 2026 06:11 View deployment

style: ruff-format unicode_posttool.py

674340d

Co-authored-by: Cursor <[email protected]>

github-actions Bot deployed to site-preview May 20, 2026 06:14 View deployment

fullsend-ai-review Bot approved these changes May 20, 2026


fullsend-ai-review Bot added the ready-for-merge All reviewers approved — ready to merge label May 20, 2026

This was referenced May 20, 2026

security: run OutputPipeline on post-comment before forge API post #1229

Open

security: run OutputPipeline on post-review before forge API post #1230

Open

ralphbean requested a review from waynesun09 May 21, 2026 19:36

waynesun09 reviewed May 21, 2026


ralphbean approved these changes May 21, 2026

