Fix fleet-auditor SkillBloat: measure frontmatter only, not full SKILL.md (fixes #16) #17

Merged: alexgreensh merged 1 commit into alexgreensh:main from eligrumman:fix/skill-bloat-frontmatter-only on Apr 8, 2026

@eligrumman
Contributor

Summary

Fixes #16.

fleet-auditor's SkillBloat detector was over-counting skill token overhead by ~19x. It measured the full SKILL.md file for each skill, but Claude Code loads only the YAML frontmatter (name + description) into the session at startup. SKILL.md bodies load on demand, when the user invokes the skill via the Skill tool.

On a 41-skill fleet this produced a false HIGH-severity $136/mo skill_bloat finding where the real figure is about $7/mo, drowning out real waste patterns (empty heartbeats, abandoned sessions).

Change

  • Add _estimate_skill_frontmatter_tokens(skill_md) helper that parses the file, extracts the YAML frontmatter block (content between the first two --- markers), and estimates tokens for just that block.
  • The helper falls back to 100 tokens (the documented average from the detector's own description) if the file has no frontmatter or can't be read.
  • Replace estimate_tokens_from_file(skill_md) with the new helper in ClaudeCodeAdapter.parse_config.
  • No API surface change.
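The helper described above could look roughly like the sketch below. This is not the merged code: `extract_frontmatter` is a hypothetical name, the local `estimate_tokens_from_text` is a stand-in for the function in shared.py (a rough 4-characters-per-token heuristic is assumed here), and treating an empty frontmatter block as "no frontmatter" is also an assumption.

```python
from pathlib import Path
from typing import Optional

FALLBACK_TOKENS = 100  # the "~100 tokens each" default named in the PR


def estimate_tokens_from_text(text: str) -> int:
    """Stand-in for shared.py's estimator (assumed: ~4 characters per token)."""
    return max(1, len(text) // 4)


def extract_frontmatter(text: str) -> Optional[str]:
    """Return the content between the first two '---' marker lines, or None."""
    lines = text.splitlines()
    markers = [i for i, line in enumerate(lines) if line.strip() == "---"]
    if len(markers) < 2:
        return None
    start, end = markers[0], markers[1]
    return "\n".join(lines[start + 1:end])


def estimate_skill_frontmatter_tokens(skill_md: Path) -> int:
    """Estimate tokens for the frontmatter only, ignoring the SKILL.md body."""
    try:
        text = skill_md.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError):
        return FALLBACK_TOKENS  # unreadable file: use the documented default
    frontmatter = extract_frontmatter(text)
    if frontmatter is None or not frontmatter.strip():
        return FALLBACK_TOKENS  # no usable frontmatter block
    return estimate_tokens_from_text(frontmatter)
```

The key property is that the estimate no longer grows with the size of the SKILL.md body, only with the name and description fields.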

Verification

Tested on a 41-skill Claude Code install. Cross-checked against /context output.

Measurement                        Before    After
Total tokens                       75,578    4,011
Avg/skill                          1,843     97
Top skill (linkedin)               n/a       169
Bottom skill (freqtrade-setup)     n/a       40

The post-fix distribution matches what Claude Code's own /context command reports (skills shown as 4-139 tokens each in the context breakdown). The 100-token fallback aligns with the detector's own description string ("~100 tokens each" at fleet.py:749).

Impact on skill_bloat findings

On the same 41-skill fleet:

  • Before: 41 skills loaded (75,578 tokens overhead per API call), monthly_waste_usd: 136.04
  • After: 41 skills loaded (4,011 tokens overhead per API call), monthly_waste_usd: ~7.22
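As a sanity check on these figures, the snippet below rescales the old dollar estimate by the token ratio. It assumes monthly waste is linear in per-call token overhead; the detector's actual cost formula is not shown in this PR, and `scale_monthly_waste` is a hypothetical name used only for illustration.

```python
def scale_monthly_waste(before_usd: float, before_tokens: int, after_tokens: int) -> float:
    """Rescale a monthly cost estimate, assuming cost is proportional to tokens."""
    return before_usd * after_tokens / before_tokens


before_tokens, after_tokens = 75_578, 4_011
ratio = before_tokens / after_tokens
after_usd = scale_monthly_waste(136.04, before_tokens, after_tokens)

print(f"over-count: {ratio:.1f}x, corrected waste: ${after_usd:.2f}/mo")
# prints: over-count: 18.8x, corrected waste: $7.22/mo
```

Under that linearity assumption, the ~19x over-count and the ~$7.22/mo figure are consistent with each other.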

This is the correct order of magnitude. Users will now see the real top waste source (empty heartbeats) instead of spending hours archiving skills for negligible savings.

Notes

  • Pyright emits Python 3.9 "union syntax" warnings across the file, but those are pre-existing and unrelated to this change.
  • estimate_tokens_from_text already existed in shared.py — I just added it to the existing import line.

🤖 Generated with Claude Code

SkillBloat was measuring the full SKILL.md file per skill, inflating the
reported skill overhead by ~19x and producing false high-severity
findings (e.g. $136/mo skill_bloat on a fleet where /context shows the
real overhead is ~$7/mo).

Claude Code only loads each skill's YAML frontmatter (name +
description) into the session at startup. SKILL.md bodies are loaded
on demand when the user invokes the skill via the Skill tool.

Fix: add `_estimate_skill_frontmatter_tokens()` that parses and measures
only the frontmatter block. Falls back to the documented 100-token
default if no frontmatter is present.

Verified on a 41-skill fleet:
- Before: 75,578 tokens total (avg 1,843/skill)
- After:  4,011 tokens total (avg 97/skill)
- Matches Claude Code's own /context output.

@alexgreensh merged commit 66003a9 into alexgreensh:main on Apr 8, 2026
1 check passed
@github-actions (bot) locked and limited conversation to collaborators on Apr 8, 2026

Development

Successfully merging this pull request may close these issues.

Bug: fleet-auditor SkillBloat over-counts skill tokens by measuring full SKILL.md body instead of frontmatter
