feat: add transcript GC maintenance for summarized tool results by jalehman · Pull Request #148 · Martian-Engineering/lossless-claw

jalehman · 2026-03-20T23:58:21Z

What

Add the first runtime-assisted transcript GC pass for summarized externalized tool results, and include the repo spec that explains the broader design and current implementation status.

Why

Lossless Claw already bounded model context growth, but long-lived tool-heavy sessions could still accumulate oversized inline tool results in the active transcript. This change starts shrinking the hot JSONL once those payloads are safely externalized and summarized, which reduces restart/bootstrap cost and repeated replay of giant tool output after crashes. The added spec documents what is already done versus what still remains.

Changes

Add conservative maintain() transcript rewrite flow
GC only summarized externalized tool results
Match transcript entries by unique toolCallId
Rebuild compact replacements from stored message parts
Add focused transcript-GC unit coverage
Add repo spec for externalization/bootstrap/GC design
Add changeset for release notes
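
To illustrate the "match transcript entries by unique toolCallId" item above, here is a minimal sketch of conservative alignment that skips anything ambiguous. The type names, field names, and the `alignByUniqueToolCallId` helper are hypothetical and exist only for illustration. They are not the actual lossless-claw or OpenClaw types.

```typescript
// Hypothetical shapes for illustration; the real transcript and candidate
// types in lossless-claw / OpenClaw may differ.
interface TranscriptEntry {
  entryId: string;
  role: string;
  toolCallId?: string;
}

interface GcCandidate {
  toolCallId: string;
  replacementText: string; // compact placeholder rebuilt elsewhere
}

interface RewriteRequest {
  entryId: string;
  replacementText: string;
}

function alignByUniqueToolCallId(
  entries: TranscriptEntry[],
  candidates: GcCandidate[],
): RewriteRequest[] {
  // Group tool-result entries by toolCallId so duplicates can be detected.
  const byToolCallId = new Map<string, TranscriptEntry[]>();
  for (const entry of entries) {
    if (entry.role !== "toolResult" || !entry.toolCallId) continue;
    const bucket = byToolCallId.get(entry.toolCallId) ?? [];
    bucket.push(entry);
    byToolCallId.set(entry.toolCallId, bucket);
  }

  const requests: RewriteRequest[] = [];
  for (const candidate of candidates) {
    const matches = byToolCallId.get(candidate.toolCallId) ?? [];
    // Skip missing or ambiguous matches instead of guessing.
    if (matches.length !== 1) continue;
    const [only] = matches;
    if (!only) continue;
    requests.push({
      entryId: only.entryId,
      replacementText: candidate.replacementText,
    });
  }
  return requests;
}
```

In this sketch a candidate whose toolCallId appears zero times or more than once in the transcript produces no rewrite request, which mirrors the "skip anything ambiguous" stance of this first pass.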

Testing

npx vitest run test/engine.test.ts -t "(lists summarized externalized tool results as transcript GC candidates|maintain\(\) requests transcript rewrites for summarized externalized tool results|externalizes oversized tool-result payloads into large_files|externalizes oversized plain-text tool-result blocks from live exec-style messages)"
npx tsc -p tsconfig.json --noEmit is still not a clean local gate because of pre-existing repo baseline issues and stale installed openclaw typings

Follow-ups

This is intentionally a first pass, not the full end state. The remaining work is tracked in the repo spec at specs/tool-result-externalization-and-incremental-bootstrap.md:

handle legacy inline oversized tool results that predate ingest-time externalization
strengthen transcript-entry alignment beyond unique toolCallId
tighten fresh-tail and eligibility rules for GC
add end-to-end coverage against the merged OpenClaw maintenance lifecycle
optionally add more preventive write-time hygiene so giant inline tool blobs are avoided earlier

…l results

Add a summarized-tool candidate query in SummaryStore and implement LcmContextEngine.maintain() for the conservative first transcript-GC pass.

This pass only rewrites tool-result transcript entries that were already externalized into large_files during ingest, are linked through summary_messages, and are no longer present as raw context items. Rebuild replacement toolResult messages from stored message_parts, align them to transcript entries by stable toolCallId, and request runtime-owned rewrites in small batches.

Also export the minimal assembler helpers needed for replacement reconstruction and add focused engine tests for candidate selection and maintain()-driven rewrite requests.

Regeneration-Prompt: |
Implement Phase 2 of the tool-result externalization spec now that upstream OpenClaw has merged the transcript maintenance hook and rewrite helper. Keep this first pass conservative and additive: do not redesign compaction or add new schema unless required.

Select transcript-GC candidates from LCM state only when a tool-result message was already externalized into large_files, is covered by summaries, and is no longer present as a raw context item. Rebuild the compact replacement message from stored message_parts so the placeholder content stays canonical, then align candidates to active transcript entries by stable toolCallId and ask the runtime to rewrite them in bounded batches. Skip anything ambiguous instead of trying to be clever.

Add focused tests that prove candidate discovery works and that maintain() requests the expected rewrite payload for a summarized externalized tool result.
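
The eligibility rule and bounded batching described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the `ToolResultRecord` shape, its boolean flags, and both helper functions are hypothetical stand-ins for whatever SummaryStore and maintain() actually use.

```typescript
// Hypothetical record shape; the boolean flags stand in for lookups against
// large_files, summary_messages, and raw context items.
interface ToolResultRecord {
  messageId: number;
  toolCallId: string;
  externalized: boolean; // payload was stored in large_files during ingest
  summarized: boolean; // message is linked through summary_messages
  inRawContext: boolean; // message is still present as a raw context item
}

// A record is a GC candidate only when all three conditions hold.
function selectGcCandidates(records: ToolResultRecord[]): ToolResultRecord[] {
  return records.filter(
    (r) => r.externalized && r.summarized && !r.inRawContext,
  );
}

// Split work into bounded batches so each rewrite request stays small.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

Keeping selection as a pure filter over already-known state makes the conservative rule easy to test in isolation, and the batch size is left as a parameter because the commit message does not state the value used.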

Document the current state of tool-result externalization, incremental bootstrap, and transcript GC in the repo spec. Add a changeset for the new runtime-assisted transcript GC behavior so release notes capture the user-visible impact.

Regeneration-Prompt: |
OpenClaw upstream landed the transcript rewrite maintenance API, and this branch already implements the first pass of transcript GC for summarized externalized tool results. Add the missing repo-side documentation so the PR is self-contained: a spec in specs/ that explains what is already implemented, why it matters operationally, and what still remains to finish the design.

Also add a changeset, because this changes user-visible runtime behavior by shrinking active transcripts after safe condensation. Do not pretend the implementation is complete; call out the remaining work explicitly, including legacy inline tool results, stronger transcript alignment, tighter eligibility/fresh-tail rules, and end-to-end integration coverage.

jalehman added 2 commits March 20, 2026 16:41

jalehman wants to merge 2 commits into main from codex/lossless-claw-71a-transcript-gc-candidates
