fix: grep false negatives, output mangling, and truncation annotations by BadassBison · Pull Request #791 · rtk-ai/rtk

BadassBison · 2026-03-23T15:50:47Z

Summary

Fixes three issues where RTK's output filtering causes AI agents (Claude Code) to burn extra tokens on retry loops, producing net-negative token impact during analysis-heavy workflows.

grep: add --no-ignore-vcs to rg — prevents false negatives in repos with .gitignore while still respecting .ignore and .rgignore
grep: passthrough for small results (<=50 matches) — preserves standard file:line:content format AI agents can parse
smart_truncate: clean truncation — removes synthetic // ... N lines omitted annotations that break AI parsing

Problem

Observed in a real session across a large Rails monorepo (~83K files, 1,633 RTK commands):

| Issue | Root Cause | Impact |
| --- | --- | --- |
| grep returns "0 matches" for existing files | `rg` respects `.gitignore` by default, `grep -r` doesn't | ~10 false negatives led to wrong analysis conclusions |
| grep output in `"217 matches in 1F:"` format | Always reformatted, even for 4 matches | AI agents can't parse it, retry 2-4 times each |
| `// ... 81 lines omitted` in file reads | `smart_truncate` inserts synthetic comment markers | AI treats annotations as code, retries with alternative commands |

Quantified impact: grep had the lowest savings rate (9.3%) but the highest retry cost. Estimated 200-500K tokens burned on retries across ~15 retry patterns, each requiring 2-4 extra tool calls.

Changes

1. `src/cmds/system/grep_cmd.rs` — `--no-ignore-vcs` flag

Added --no-ignore-vcs to the rg invocation so it doesn't skip files listed in .gitignore/.hgignore. This matches grep -r behavior and eliminates false negatives in repos where test files, build artifacts, or generated code live in gitignored directories. Using --no-ignore-vcs (not --no-ignore) so .ignore and .rgignore are still respected.
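As a rough illustration of the flag change, the sketch below builds an `rg` invocation with `--no-ignore-vcs`. The helper name `build_rg_command` and the other flags shown are assumptions for the example, not RTK's actual code; only `--no-ignore-vcs` comes from this PR.

```rust
use std::process::Command;

/// Hypothetical helper (not RTK's real function): build an `rg` command
/// that ignores VCS ignore files (.gitignore/.hgignore) but still honors
/// .ignore and .rgignore.
pub fn build_rg_command(pattern: &str, path: &str) -> Command {
    let mut cmd = Command::new("rg");
    cmd.arg("--line-number") // print the line number of each match
        .arg("--no-heading") // one match per line, no per-file headings
        .arg("--with-filename") // always prefix matches with the file path
        .arg("--no-ignore-vcs") // the flag this PR adds
        .arg("--") // end of flags, so a pattern starting with '-' is safe
        .arg(pattern)
        .arg(path);
    cmd
}
```

Calling `build_rg_command("fn run", "src/").output()` would then run the search, assuming `rg` is installed and on `PATH`.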

2. `src/cmds/system/grep_cmd.rs` — Passthrough for small results

Results with <=50 matches now output raw file:line:content format (standard grep output that AI agents already know how to parse). The grouped "X matches in Y files:" format is preserved only for >50 matches where token savings are meaningful.
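The threshold logic could look roughly like the following sketch. The `GrepMatch` struct, the `format_matches` function, the constant name, and the exact layout of the grouped output are all assumptions made for illustration; only the 50-match cutoff and the `file:line:content` passthrough format come from the description above.

```rust
use std::collections::BTreeMap;

/// Assumed constant mirroring the "<=50 matches" rule in this PR.
const PASSTHROUGH_MAX_MATCHES: usize = 50;

/// One parsed match; the field names are illustrative.
pub struct GrepMatch {
    pub file: String,
    pub line: usize,
    pub content: String,
}

/// Hypothetical formatter: raw passthrough for small result sets,
/// grouped summary for large ones.
pub fn format_matches(matches: &[GrepMatch]) -> String {
    if matches.len() <= PASSTHROUGH_MAX_MATCHES {
        // Small result set: emit standard file:line:content lines.
        return matches
            .iter()
            .map(|m| format!("{}:{}:{}", m.file, m.line, m.content))
            .collect::<Vec<_>>()
            .join("\n");
    }

    // Large result set: group by file so each path is printed only once.
    let mut by_file: BTreeMap<&str, Vec<&GrepMatch>> = BTreeMap::new();
    for m in matches {
        by_file.entry(m.file.as_str()).or_default().push(m);
    }

    let mut out = format!("{} matches in {} files:\n", matches.len(), by_file.len());
    for (file, hits) in &by_file {
        out.push_str(&format!("{} ({})\n", file, hits.len()));
        for m in hits {
            out.push_str(&format!("  {}: {}\n", m.line, m.content.trim()));
        }
    }
    out
}
```

Keeping the cutoff in a single constant would also make it straightforward to turn into a configuration option later, as the review notes suggest.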

3. `src/core/filter.rs` — Clean truncation in `smart_truncate`

Replaced the "smart" truncation logic that scattered " // ... N lines omitted" markers throughout file content with clean first-N-lines truncation. A single [X more lines] marker appears at the end only.
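A minimal sketch of first-N-lines truncation with a single trailing marker is shown below. The function name `clean_truncate` and its signature are assumptions for the example and may not match the actual `smart_truncate` in `src/core/filter.rs`.

```rust
/// Hypothetical stand-in for the truncation described above: keep the
/// first `max_lines` lines and append one `[X more lines]` marker.
pub fn clean_truncate(content: &str, max_lines: usize) -> String {
    let lines: Vec<&str> = content.lines().collect();

    // Content that fits within the limit is returned unchanged.
    if lines.len() <= max_lines {
        return content.to_string();
    }

    // Keep the head of the file verbatim, with no inline annotations.
    let mut out = lines[..max_lines].join("\n");
    out.push_str(&format!("\n[{} more lines]", lines.len() - max_lines));
    out
}
```

Because the marker appears only once, after the kept lines, nothing inside the retained content can be mistaken for source code.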

Tests

test_smart_truncate_no_annotations — verifies no // ... markers in output
test_smart_truncate_no_truncation_when_under_limit — no truncation when content fits
test_smart_truncate_exact_limit — edge case at exact line count
test_rg_no_ignore_vcs_flag_accepted — verifies rg accepts the new flag

Test plan

cargo fmt --all && cargo clippy --all-targets && cargo test --all
Manual: rtk grep "fn run" src/ with <=50 results outputs raw file:line:content format
Manual: rtk read src/main.rs --max-lines 5 shows clean truncation without // ... markers
Manual: verify grep finds files in .gitignored directories

CLAassistant · 2026-03-23T15:50:55Z

All committers have signed the CLA.

pszymkowiak · 2026-03-23T15:51:10Z

[w] wshm · Automated triage by AI

📊 Automated PR Analysis


🐛 Type	`bug-fix`
🟡 Risk	`medium`

Summary

Fixes three issues in grep and smart_truncate that caused AI agents to waste tokens on retry loops: adds --no-ignore to rg so gitignored files aren't silently skipped, passes through raw grep output for small result sets (<=50 matches) instead of a grouped format that confused AI parsers, and replaces synthetic '// ... N lines omitted' truncation markers with clean first-N-lines truncation plus a single '[X more lines]' suffix.

Review Checklist

Tests present
Breaking change
Docs updated

Analyzed automatically by wshm · This is an automated analysis, not a human review.

pszymkowiak

Thanks @BadassBison — good analysis on grep false negatives and AI retry loops.

Please retarget to develop — all PRs must target develop, not master.

Review notes:

--no-ignore is risky — this searches inside node_modules/, target/, etc. Consider --no-ignore-vcs instead (skips .gitignore but respects .ignore)
Passthrough <=50 — interesting idea but the threshold should be configurable, and it changes RTK's savings metrics
10 README files — doc changes should be separate from the code fix

Please retarget and address the --no-ignore concern. Thanks!

pszymkowiak · 2026-03-26T10:26:50Z

Hi! Two things needed before we can review:

Retarget to develop — this PR targets master, but all PRs should target develop. You can change the base branch in the PR settings (right sidebar).
Sign the CLA — if not already done, please sign at https://cla-assistant.io/rtk-ai/rtk

Thanks!

aeppling · 2026-03-26T18:39:13Z

Hey

We are cleaning up the codebase and improving the project structure for better onboarding. As part of this effort, PR #826 reorganizes src/ from a flat layout into subfolders.

No logic changes — only file moves and import path updates.

What you need to do

Rebase your branch on develop when receiving this comment:

git fetch origin && git rebase origin/develop

Git detects renames automatically. If you get import conflicts, update the paths:

use crate::git;        // now: use crate::cmds::git::git;
use crate::tracking;   // now: use crate::core::tracking;
use crate::config;     // now: use crate::core::config;
use crate::init;       // now: use crate::hooks::init;
use crate::gain;       // now: use crate::analytics::gain;

Need help rebasing? Tag @aeppling

BadassBison · 2026-03-26T21:55:11Z

@pszymkowiak @aeppling — addressed all feedback:

Retargeted to develop — base branch updated.
--no-ignore → --no-ignore-vcs — switched to the more targeted flag that only disables VCS ignore files (.gitignore/.hgignore) while still respecting .ignore and .rgignore. Updated the corresponding test.
Doc changes removed — the 10 README files have been dropped from this PR. The branch now contains only src/cmds/system/grep_cmd.rs and src/core/filter.rs.
Rebased on develop — applied changes to the new file paths after the #826 reorganization (feat(refacto-codebase-onboarding): partie 1 - folders and technical docs).
CLA is signed — already confirmed by the CLA assistant bot above.

Thanks for the thorough review!

Documents the changes from rtk-ai#791:
- grep now passes through raw output for <=50 matches (standard file:line:content)
- grep uses grouped format only for >50 matches where token savings are meaningful
- --no-ignore-vcs flag added to match grep -r behavior for .gitignore'd files
- savings range updated to 0-90% to reflect passthrough for small result sets

nicklloyd · 2026-03-26T22:13:02Z

also awaiting changes ;)

- grep: use --no-ignore-vcs so .gitignore'd files aren't silently skipped (matches grep -r behavior, avoids false negatives in large monorepos)
- grep: passthrough raw output for <=50 matches so AI agents can parse standard file:line:content format without retry loops
- filter: replace smart_truncate heuristic with clean first-N-lines truncation and a single [X more lines] suffix (eliminates synthetic // ... markers that AI agents misread as code, causing parsing confusion and retries)

BadassBison · 2026-03-26T22:25:27Z

@nicklloyd — all changes have been addressed! Retargeted to develop, switched to --no-ignore-vcs, doc updates moved to a separate PR (#871), rebased on the new src/ structure, and the cargo fmt CI failure is fixed. Should be good for another look. 🙏

nicklloyd · 2026-03-26T22:41:25Z

@BadassBison - just following from the sidelines as this one is a blocker. Looking forward to being able to try it out 🤘🏻

BadassBison · 2026-03-27T12:10:01Z

@pszymkowiak,
Everything is updated and awaiting review. Seems like this work is blocking others, anything else you need from me?

pszymkowiak added the bug (Something isn't working), effort-medium (1-2 days, a few files), and filter-quality (Filter produces incorrect/truncated signal) labels Mar 23, 2026

BadassBison mentioned this pull request Mar 25, 2026

Output filtering causes AI agents (Claude Code) to burn extra tokens on retry loops #831

Open

pszymkowiak requested changes Mar 26, 2026

pszymkowiak added the awaiting-changes label Mar 26, 2026

BadassBison force-pushed the fix/grep-false-negatives-and-truncation-annotations branch from c6c979a to 2cc8a19 Compare March 26, 2026 21:52

BadassBison changed the base branch from master to develop March 26, 2026 21:52

BadassBison requested a review from pszymkowiak March 26, 2026 21:53

BadassBison mentioned this pull request Mar 26, 2026

docs: update grep descriptions for passthrough behavior #871

Open

BadassBison force-pushed the fix/grep-false-negatives-and-truncation-annotations branch from baddd42 to 36041c5 Compare March 26, 2026 22:23

BadassBison wants to merge 1 commit into rtk-ai:develop from BadassBison:fix/grep-false-negatives-and-truncation-annotations

6 participants
