fix: prevent fallback truncation summaries in compaction#79

Open
cryptomaltese wants to merge 1 commit into Martian-Engineering:main from cryptomaltese:fix/fallback-truncation-repair

Conversation

@cryptomaltese
Contributor

Lossless-Claw PR: Fallback Summarization Fix

Overview

This PR addresses a critical bug in the lossless-claw plugin where LLM summarizer failures (403 errors, timeouts, etc.) trigger a "fallback truncation" that creates useless ~909-token garbage summaries. These summaries pollute the context DB and provide no value for future turns.

The fix: instead of truncating to garbage, skip compaction entirely and retry on the next turn. The system keeps full, meaningful context in place while it waits for the summarizer to recover.

Problem

Current (Broken) Behavior

When CompactionEngine.summarizeWithEscalation() encounters:

  1. Summarizer failure (exception) or
  2. Poor compression (aggressive mode still >= input tokens)

...it falls back to:

const truncated = sourceText.slice(0, FALLBACK_MAX_CHARS);  // 2048 chars
summaryText = `${truncated}\n[Truncated from ${inputTokens} tokens]`;
level = "fallback";

This creates:

  • Garbage summaries (~909 tokens each, an artifact of the FALLBACK_MAX_CHARS cut)
  • Loss of semantic content (just raw text prefix)
  • DB pollution (334+ of these identified in existing DB)
  • Canary marker ([Truncated from N tokens]) that flags the garbage

Impact

  • 99+ summaries created in the last 2 weeks are fallback truncations
  • They consume ~90k tokens in the context DB (~313 bytes each × 334 summaries)
  • Future expansions / repairs must skip or fix these
  • Cascading parent condensed summaries may also be corrupted

Solution

1. Add level Column to summaries Table

File: src/db/migration.ts

ALTER TABLE summaries ADD COLUMN level TEXT NOT NULL DEFAULT 'normal' 
  CHECK (level IN ('normal', 'aggressive', 'fallback'))

Values:

  • 'normal' (default) — summary created with normal mode (good compression)
  • 'aggressive' — summary created with aggressive mode (aggressive but good compression)
  • 'fallback' — deprecated escalation level; should never be set by new code

Backfill:

  1. Query message_parts with part_type='compaction' for level='fallback' events
  2. Extract createdSummaryIds from metadata JSON
  3. Update those summaries to level='fallback' in the DB
  4. Scan remaining summaries for the [Truncated from N tokens] canary and flag them as fallback (future work)

Migration flow:

runLcmMigrations()
  → ensureSummaryLevelColumn()
  → backfillSummaryLevels()      ← NEW
  → ensureSummaryMetadataColumns()
  → backfillSummaryDepths()
  → backfillSummaryMetadata()
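The column add itself stays idempotent by checking the table's existing columns first. A minimal sketch of that guard (the function name and the plain-string-list interface are assumptions for illustration; in the real migration the column list would come from `PRAGMA table_info(summaries)`):

```typescript
// Hypothetical guard: returns the ALTER TABLE statement to run, or null
// if the level column already exists (migration is a no-op on re-run).
function summaryLevelMigrationSql(existingColumns: string[]): string | null {
  if (existingColumns.includes("level")) {
    return null; // column already present: nothing to do
  }
  return (
    "ALTER TABLE summaries ADD COLUMN level TEXT NOT NULL DEFAULT 'normal' " +
    "CHECK (level IN ('normal', 'aggressive', 'fallback'))"
  );
}
```

Returning the SQL (rather than executing it) keeps the guard trivially unit-testable without a database handle.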

2. Fix Fallback Behavior in compaction.ts

File: src/compaction.ts

New summarizeWithEscalation() signature:

/**
 * Run two-level summarization escalation with explicit error handling:
 * normal → aggressive → fail (do NOT truncate to garbage).
 * 
 * Returns SummarizationAttempt with { failed: boolean, content?: string, level?: CompactionLevel }
 */
private async summarizeWithEscalation(params: {
  sourceText: string;
  summarize: CompactionSummarizeFn;
  options?: CompactionSummarizeOptions;
  logger?: { warn: (msg: string) => void };
}): Promise<SummarizationAttempt>

Behavior changes:

  1. Phase 1: Normal mode

    • Call summarize(sourceText, false, options)
    • On error: Return { failed: true, error: "..." } immediately
    • On success but poor compression: Proceed to Phase 2
  2. Phase 2: Aggressive mode (only if normal didn't compress well)

    • Call summarize(sourceText, true, options)
    • On error: Return { failed: true, error: "..." } immediately
    • On success but poor compression: Return { failed: true, error: "..." }
  3. No Phase 3 (no fallback truncation)

    • Delete the entire truncation path
    • If compaction can't improve, the caller skips the compaction and retries next turn
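Under these rules the escalation reduces to two guarded phases. A self-contained sketch of that control flow (the ~4 chars/token estimator, the injected summarize signature, and the error strings are assumptions, not the plugin's actual code):

```typescript
type CompactionLevel = "normal" | "aggressive";

interface SummarizationAttempt {
  failed: boolean;
  content?: string;
  level?: CompactionLevel;
  error?: string;
}

type SummarizeFn = (text: string, aggressive: boolean) => Promise<string>;

// Assumption: a ~4 chars/token estimate stands in for the real token counter.
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

async function summarizeWithEscalation(
  sourceText: string,
  summarize: SummarizeFn,
): Promise<SummarizationAttempt> {
  const inputTokens = estimateTokens(sourceText);

  // Phase 1: normal mode; a summarizer error bails immediately.
  let normal: string;
  try {
    normal = await summarize(sourceText, false);
  } catch (err) {
    return { failed: true, error: String(err) };
  }
  if (estimateTokens(normal) < inputTokens) {
    return { failed: false, content: normal, level: "normal" };
  }

  // Phase 2: aggressive mode, reached only on poor compression.
  let aggressive: string;
  try {
    aggressive = await summarize(sourceText, true);
  } catch (err) {
    return { failed: true, error: String(err) };
  }
  if (estimateTokens(aggressive) < inputTokens) {
    return { failed: false, content: aggressive, level: "aggressive" };
  }

  // No Phase 3: report failure so the caller skips compaction this turn.
  return { failed: true, error: "aggressive mode did not compress" };
}
```

Because the summarizer is injected, both failure paths (exception and poor compression) are easy to exercise with stub functions in unit tests.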

Caller changes:

Both leafPass() and condensedPass() must check the attempt result and return null on failure:

private async leafPass(...): Promise<{ ... } | null> {
  // ... build concatenated text ...
  
  const summary = await this.summarizeWithEscalation({ ... });
  
  // NEW: bail if summarization failed
  if (summary.failed || !summary.content) {
    return null;  // Caller sees: compaction incomplete, will retry
  }
  
  // ... persist summary ...
}

In compactLeaf() and compactFullSweep():

const leafResult = await this.leafPass(...);

// NEW: check for null
if (leafResult === null) {
  log.warn(`Leaf compaction failed; skipping and retrying next turn`);
  return {
    actionTaken: false,  // Signal: try again later
    tokensBefore,
    tokensAfter: tokensBefore,  // No progress
    condensed: false,
  };
}

3. Migration Script

File: src/db/migration.ts — function backfillSummaryLevels()

function backfillSummaryLevels(db: DatabaseSync): void {
  // Best-effort: errors are swallowed so the migration never blocks.
  try {
    // Extract fallback compaction events from message_parts.metadata
    const rows = db.prepare(
      `SELECT part_id, metadata FROM message_parts
       WHERE part_type = 'compaction' AND metadata IS NOT NULL`
    ).all() as Array<{ part_id: string; metadata: string }>;

    const mark = db.prepare(
      "UPDATE summaries SET level = 'fallback' WHERE summary_id = ?"
    );

    for (const row of rows) {
      // Parse the metadata JSON and flag summaries created by fallback events
      const meta = JSON.parse(row.metadata);
      if (meta.level !== "fallback") continue;
      for (const id of meta.createdSummaryIds ?? []) mark.run(id);
    }
  } catch {
    // Ignore: backfill is advisory; missed rows can be repaired later
  }
}

4. lcm repair Command

File: src/tools/lcm-repair-command.ts — class LcmRepairEngine

Responsibility:

  • Find summaries with level='fallback' or truncation canary
  • Re-summarize using the same prompts (from summarize.ts)
  • Update DB with new content, set level='normal'
  • Log repairs as compaction events
  • Cascade check: after fixing leaves, verify parents are still valid

Public interface:

async repair(
  input: LcmRepairInput,
  summarizeFn: LcmSummarizeFn
): Promise<LcmRepairResult>

Input:

{
  mode: "scan" | "repair",        // dry-run vs commit
  conversationId?: number,        // specific conversation
  maxSummaries?: number,          // batch size (default 10)
  verbose?: boolean               // detailed logs
}

Output:

{
  mode: "scan" | "repair",
  foundCount: number,             // total fallback summaries found
  repairedCount: number,          // successfully re-summarized
  failedCount: number,            // re-summarization failed
  skippedCount: number,           // scan mode: not attempted
  entries: RepairSummaryEntry[],  // candidates (up to maxSummaries)
  logs: string[],                 // detailed logs if verbose
  cascadeDepth: number            // parent re-condensing depth
}

Algorithm:

  1. Find Phase:

    • Query summaries WHERE level = 'fallback'
    • Query summaries WHERE content LIKE '%[Truncated from%'
    • Deduplicate by summary_id
    • Enrich with lineage (children, parents)
  2. Repair Phase (if mode='repair'):

    • For each leaf: get source messages → reconstruct input → re-summarize
    • For each condensed: get parents → reconstruct input → re-summarize
    • Update summaries with new content, set level='normal'
    • Log repair as compaction event in message_parts
  3. Cascade Phase (future):

    • After leaf repairs, find parent condensed summaries
    • Check if they still compress parents effectively
    • If not, re-condense (calls condensedPass() logic)
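The Find-phase deduplication can be a first-hit-wins merge keyed on summary_id. A sketch of that step (the candidate type and precedence rule are assumptions for illustration):

```typescript
// Hypothetical candidate record: which query surfaced the summary.
interface RepairCandidate {
  summaryId: string;
  source: "level" | "canary";
}

// First-hit-wins merge: a summary flagged by both the level query and the
// canary query is kept once, with the level-column hit taking precedence.
function dedupeCandidates(
  byLevel: RepairCandidate[],
  byCanary: RepairCandidate[],
): RepairCandidate[] {
  const seen = new Map<string, RepairCandidate>();
  for (const c of [...byLevel, ...byCanary]) {
    if (!seen.has(c.summaryId)) seen.set(c.summaryId, c);
  }
  return [...seen.values()];
}
```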

Integration:

Register as a tool in engine.ts:

tools.register({
  name: "lcm_repair",
  description: "Repair fallback truncation summaries",
  inputSchema: { ... },
  invoke: async (input: LcmRepairInput) => {
    const engine = new LcmRepairEngine(db, convStore, summStore, compEngine, config);
    return engine.repair(input, summarizeFn);
  }
});

Testing Checklist

Unit Tests

  • summarizeWithEscalation() returns { failed: true } on summarizer exception
  • summarizeWithEscalation() returns { failed: true } when aggressive mode fails
  • leafPass() returns null when summarization fails
  • condensedPass() returns null when summarization fails
  • compactLeaf() returns actionTaken: false when leafPass is null
  • compactFullSweep() skips compaction on leafPass/condensedPass failures

Integration Tests

  • Migration adds level column with CHECK constraint
  • Backfill finds fallback events in message_parts.metadata
  • Backfill updates level='fallback' on identified summaries
  • LcmRepairEngine finds 99+ fallback summaries in test DB
  • LcmRepairEngine re-summarizes leaf and condensed summaries
  • Repair sets level='normal' on fixed summaries
  • Repair logs compaction event with summary ID

Manual Testing

  1. Create test scenario:

    INSERT INTO summaries (summary_id, conversation_id, kind, depth, level, content, token_count, created_at)
    VALUES ('test_fallback_1', 1, 'leaf', 0, 'fallback', 
            'Lorem ipsum...[Truncated from 5000 tokens]', 1024, datetime('now'));
  2. Run repair in scan mode:

    lcm repair --mode scan --conversationId 1 --verbose
    

    Expected: finds 1+ summaries

  3. Run repair in repair mode:

    lcm repair --mode repair --conversationId 1 --maxSummaries 5
    

    Expected: re-summarizes and updates DB

  4. Verify DB:

    SELECT summary_id, level, content, token_count FROM summaries 
    WHERE summary_id = 'test_fallback_1';

    Expected: level='normal', shorter content, lower token count

Implementation Notes

Why Not "Lazy Fallback"?

Some might suggest: "Just mark it fallback and move on, repair later."

Problems:

  • Corrupts the context with useless garbage immediately
  • Wastes tokens during expansion/assembly
  • Requires background repair job (operational overhead)
  • Breaks the promise: "LCM never loses semantic content"

Our approach:

  • Skip compaction when it can't improve (better for immediate context)
  • Repair existing fallbacks via tool (explicit, auditable, batch-friendly)

Token Accounting

  • FALLBACK_MAX_CHARS = 512 * 4 = 2048 chars
  • At the ~4 chars/token heuristic, fallback summaries ≈ 2048 / 4 = 512 tokens
  • Actual counts vary with tokenizer density → ~300–900 tokens observed
  • level='fallback' plus the canary marker makes them detectable

Migration Safety

  • Add column with DEFAULT 'normal' → no existing rows affected
  • Backfill is best-effort (errors swallowed) → migration never blocks
  • ensureSummaryLevelColumn() checks for existing column → idempotent
  • Can run migration multiple times safely

Repair Cascading

Future work: after re-summarizing leaves, check parent condensed summaries:

  1. Get all condensed parents of repaired leaves
  2. Re-fetch their child summaries (now with new content)
  3. Check if condensation still effective
  4. If not, re-condense by calling condensedPass() logic

Currently stubbed as cascadeDepth: 0 placeholder.

Files Changed

  1. src/db/migration.ts

    • Add level column with CHECK constraint
    • Implement ensureSummaryLevelColumn()
    • Implement backfillSummaryLevels()
    • Call backfill in runLcmMigrations()
  2. src/compaction.ts

    • New SummarizationAttempt type
    • Refactor summarizeWithEscalation() with error handling
    • Delete truncation fallback path
    • Update leafPass() to return null on failure
    • Update condensedPass() to return null on failure
    • Update callers to handle null returns
  3. src/tools/lcm-repair-command.ts (NEW)

    • LcmRepairEngine class
    • findFallbackSummaries()
    • findTruncationCanaries()
    • resummarizeLeaf()
    • resummarizeCondensed()
    • repair() main flow
  4. src/engine.ts

    • Register lcm_repair tool (one-liner)

Backcompat & Migration

  • Old code: created level='fallback' summaries (existing rows still load fine; new ones are no longer produced)
  • New code: Never creates level='fallback' (skips compaction on failure)
  • Mixed environment: OK; old summaries just get level='fallback' during backfill
  • Rollback: Remove level column via down-migration (or just ignore it)

Performance Impact

  • No cost to compaction path (we skip it on failure, which is rare)
  • Repair tool: O(N) scan + O(K) re-summarization where K = maxSummaries
  • Migration: One-time scan of message_parts (fast)

Future Work

  1. Cascade repairs (implement cascadeDepth logic in LcmRepairEngine)
  2. Scheduled repair cron (background job runs nightly)
  3. Repair metrics (track # of fallbacks, success rate, tokens saved)
  4. Aggressive mode tuning (target ratios based on input size)

Status: Draft PR for review
Target: lossless-claw v0.3.1+
Reviewer: @maltese
Related: GitHub issue #334-fallback-summaries

@octalmage

I've run into this a few times; it makes it hard to judge if lcm is working at all.

@cryptomaltese force-pushed the fix/fallback-truncation-repair branch from 832133c to 6ac7145 on March 18, 2026 at 21:39
Replace the three-level summarization escalation (normal → aggressive →
deterministic truncation) with a two-level approach that returns null on
failure instead of creating garbage summaries.

When both normal and aggressive summarization fail to compress below the
input token count, the compaction engine now bails and retries on the
next turn. This prevents useless '[Truncated from N tokens]' summaries
from polluting the DAG — particularly for media-only messages where the
stored text content is just a file path (~28 tokens) that no LLM can
compress further.

Changes:
- Remove truncation fallback in summarizeWithEscalation() — return null
  on compression failure (callers already handle null since upstream)
- Wrap summarizer calls in try/catch to handle LLM errors gracefully
- Add 'level' column to summaries table (normal/aggressive/fallback)
  with migration and backfill from compaction event metadata
- Add SummaryRecord.level to store types
- Add lcm_repair tool to scan/re-summarize existing fallback summaries
- Register repair tool in plugin entry point