fix: skip media-only messages from summarization pipeline #130
Open
semiok wants to merge 1 commit into Martian-Engineering:main
Messages containing only media attachments (no meaningful text) produce near-empty source text that wastes summarizer API calls every compaction cycle. Filter out messages with fewer than 50 characters of actual text content after stripping MEDIA:/ file path references.

Closes Martian-Engineering#124

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
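As a rough illustration of the check this commit describes, here is a minimal TypeScript sketch. It is not the PR's actual diff. Only the helper name isMediaOnlyContent() and the 50-character threshold come from this PR; the regular expression for MEDIA:/ references (it assumes a path contains no whitespace), the constant names, and the optional minChars parameter are assumptions made for the example.

```typescript
// Threshold from the PR description: fewer than 50 characters of real text
// means the message is treated as media-only.
const MIN_TEXT_CHARS = 50;

// ASSUMPTION: a media reference is "MEDIA:/" followed by a run of
// non-whitespace characters. The real path format may differ.
const MEDIA_REF = /MEDIA:\/\S*/g;

// Returns true when, after removing MEDIA:/ references and trimming
// surrounding whitespace, fewer than `minChars` characters remain.
function isMediaOnlyContent(text: string, minChars: number = MIN_TEXT_CHARS): boolean {
  const remaining = text.replace(MEDIA_REF, "").trim();
  return remaining.length < minChars;
}
```

With this sketch, a string holding only a hypothetical path such as MEDIA:/tmp/example.png plus a short timestamp counts as media-only, while the same path followed by a full sentence of 50 or more characters does not.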
Summary
Messages containing only media attachments (e.g., an image with no accompanying text) produce near-empty source text (~28 tokens of MEDIA:/... file path metadata) that cannot be meaningfully compressed by the summarizer. These messages waste an API call every compaction cycle and can produce garbage summaries.

- isMediaOnlyContent() helper that strips MEDIA:/ file path references and checks if remaining text is below 50 characters
- leafPass() [...] null) with a warning log
- [...] messageContents array; no metadata is lost

Observed instance (from issue #124)
sum_66d17b5e39bd7460 (34 tokens): a media-only message (MEDIA:/...png + timestamp) with only 84 chars of text_content hit the fallback path and produced a garbage summary.

Test plan
- [...] "skipping leaf chunk: all N messages are media-only" when applicable

Closes #124
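The skip behaviour mentioned in the summary and test plan (drop media-only messages, and skip the whole leaf chunk with a warning when nothing is left) might be sketched as below. This is an illustration under assumptions, not the code in this PR: the function name selectSummarizable, the text field on messages, and the injected predicate and logger are all hypothetical. Only the null result and the warning text are taken from the description.

```typescript
// ASSUMPTION: each message exposes its text via a `text` field.
// The media-only predicate is passed in so this sketch stays independent
// of how that check is actually implemented.
function selectSummarizable<T extends { text: string }>(
  messages: T[],
  isMediaOnly: (text: string) => boolean,
  warn: (msg: string) => void = console.warn,
): T[] | null {
  // Keep only messages that carry enough real text to summarize.
  // The input array is not modified.
  const kept = messages.filter((m) => !isMediaOnly(m.text));

  if (kept.length === 0) {
    // Nothing worth sending to the summarizer: signal "skip" with null.
    if (messages.length > 0) {
      warn(`skipping leaf chunk: all ${messages.length} messages are media-only`);
    }
    return null;
  }
  return kept;
}
```

A caller would treat a null result as "do not call the summarizer for this chunk" and otherwise summarize only the returned messages.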
🤖 Generated with Claude Code