Add memory supersede: atomic consolidate of multiple memories into one #8

@doctatortot

The problem

During heavy use of omem I keep running into this workflow gap:

  1. I store a long context as multiple small memories because of embedding context limits — e.g. seagate-zombie-luks part 1/3, part 2/3, part 3/3 (all-minilm-512 had 512-token ctx, forced ~1.2KB chunking).
  2. Later I switch to a bigger embedder (nomic-embed-text, 2048 ctx) and re-save the same content as a single consolidated memory.
  3. The three part-X-of-N fragments are still in the index. They show up in search, dilute results, and split attention.

The Memory schema already has superseded_by: Option<String> and a state field, but there's no API to set them — and no search filter that respects them.

This isn't hypothetical: I hit it three separate times in one session (seagate recovery, asd config, central-syslog) and ended up with 8+ orphaned fragments cluttering recall.

What I'd like to add

An atomic supersede operation that:

  1. Creates a new memory (the consolidated one)
  2. Marks N old memories as superseded by it (state = "superseded", superseded_by = <new_id>, superseded_at = now())
  3. Default search/list filters exclude superseded entries
  4. memory_get(id) still returns superseded memories (history preserved)
  5. Power users can opt back in via include_superseded: bool on search/list
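
To make the intended all-or-nothing behaviour concrete, here is a minimal sketch that uses an in-memory HashMap as a stand-in for the real store. The names store_with_replaces, SupersedeError and the u64 timestamp are assumptions made for illustration, not existing omem code; only state, superseded_by and superseded_at come from the proposal.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum MemoryState {
    Active,
    Superseded,
}

// Hypothetical, trimmed-down Memory; the real schema has more fields.
#[derive(Debug, Clone)]
struct Memory {
    id: String,
    state: MemoryState,
    superseded_by: Option<String>,
    superseded_at: Option<u64>, // Unix seconds; the real timestamp type is an assumption
}

#[derive(Debug, PartialEq)]
enum SupersedeError {
    NotFound(String),
}

fn active(id: &str) -> Memory {
    Memory {
        id: id.to_string(),
        state: MemoryState::Active,
        superseded_by: None,
        superseded_at: None,
    }
}

/// Insert `new` and retire every id in `replaces`, or change nothing at all.
fn store_with_replaces(
    store: &mut HashMap<String, Memory>,
    new: Memory,
    replaces: &[&str],
    now: u64,
) -> Result<(), SupersedeError> {
    // Phase 1: validate every target before mutating anything.
    for id in replaces {
        if !store.contains_key(*id) {
            return Err(SupersedeError::NotFound((*id).to_string()));
        }
    }
    // Phase 2: apply. Nothing below can fail, so no partial state is visible.
    for id in replaces {
        let old = store.get_mut(*id).expect("validated in phase 1");
        old.state = MemoryState::Superseded;
        old.superseded_by = Some(new.id.clone());
        old.superseded_at = Some(now);
    }
    store.insert(new.id.clone(), new);
    Ok(())
}
```

A real implementation would get the same guarantee from the storage layer's transaction rather than from a validate-then-apply split; the sketch only shows the contract a caller should be able to rely on.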

Design questions for you before I code

Endpoint shape — A1 or A2?

A1. POST /v1/memories grows an optional replaces: ["id1", "id2", ...] field. One round-trip, atomic by design.

POST /v1/memories
{
  "content": "consolidated content...",
  "tags": ["seagate"],
  "replaces": ["part1_id", "part2_id", "part3_id"]
}

A2. Separate POST /v1/memories/:id/supersede (or batch variant). More orthogonal, but it takes two round-trips and leaves a window where the new memory exists but the old ones still show up in search.

My lean is A1 — matches how the workflow actually plays out, and atomicity comes free.

State + timestamp

  • state = "superseded" (new enum variant alongside active)
  • superseded_by = <new_id> (existing field)
  • superseded_at = now() (new — cheap, useful for audit)

Idempotency / chains

Superseding an already-superseded memory: my call is reject with 409. Force the caller to either re-target the chain head or explicitly opt in via a force_chain: bool flag. Quiet chain extension feels like a future-bug machine.
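
A rough sketch of that policy, plus a helper a caller could use to find the chain head before retrying. check_target, chain_head, ChainError and the links map (superseded id to replacing id) are hypothetical names for illustration; the 409 mapping would live in the handler.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq)]
enum MemoryState {
    Active,
    Superseded,
}

#[derive(Debug, PartialEq)]
enum ChainError {
    /// Target is already superseded and force_chain was not set (HTTP 409 in the proposal).
    AlreadySuperseded,
}

/// Decide whether a single target may be superseded.
fn check_target(state: MemoryState, force_chain: bool) -> Result<(), ChainError> {
    match (state, force_chain) {
        (MemoryState::Active, _) | (MemoryState::Superseded, true) => Ok(()),
        (MemoryState::Superseded, false) => Err(ChainError::AlreadySuperseded),
    }
}

/// Follow superseded_by links from `id` until reaching a memory that nothing replaced.
/// Returns None if the links form a cycle, which should never happen in a healthy index.
fn chain_head<'a>(links: &'a HashMap<String, String>, id: &'a str) -> Option<&'a str> {
    let mut seen = HashSet::new();
    let mut current = id;
    while let Some(next) = links.get(current) {
        if !seen.insert(current) {
            return None; // already visited: cycle
        }
        current = next.as_str();
    }
    Some(current)
}
```

With the default (no force_chain), a client that gets the conflict can resolve chain_head for the stale id and resubmit against the head instead.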

Filtering by default

  • memory_search / memory_list: filter state == "active" by default. Add include_superseded: bool (default false) to opt in.
  • memory_get(id): returns the memory regardless of state. The link is the link; you asked for it specifically.
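
In code, the default filter and the unfiltered get might look roughly like this. It is a sketch over a plain slice; list_memories and get_memory are illustrative names, and the real version would push the predicate down into the LanceDB query rather than filter in Rust.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum MemoryState {
    Active,
    Superseded,
}

// Hypothetical, trimmed-down Memory with only the fields the filter needs.
#[derive(Debug, Clone)]
struct Memory {
    id: String,
    state: MemoryState,
}

/// Search/list path: superseded entries are hidden unless the caller opts in.
fn list_memories(memories: &[Memory], include_superseded: bool) -> Vec<&Memory> {
    memories
        .iter()
        .filter(|m| include_superseded || m.state == MemoryState::Active)
        .collect()
}

/// Direct lookup: no state filter, so history stays reachable by id.
fn get_memory<'a>(memories: &'a [Memory], id: &str) -> Option<&'a Memory> {
    memories.iter().find(|m| m.id == id)
}
```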

Tag/source inheritance

My lean: pristine. The new memory's tags/source are whatever the caller passed; the replaces field is the only link to history. Caller can copy tags forward if they want.

Cross-tenant

Reject with 403. Obviously nonsensical, but worth being explicit in the validation.
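
A sketch of what that validation could look like, assuming each memory carries some tenant identifier. The tenant strings, check_same_tenant and ReplaceError are assumptions for illustration, not existing omem types.

```rust
#[derive(Debug, PartialEq)]
enum ReplaceError {
    /// A target in `replaces` belongs to another tenant (HTTP 403 in the proposal).
    CrossTenant { id: String },
}

impl ReplaceError {
    /// Status code the handler would return for this error.
    fn status(&self) -> u16 {
        match self {
            ReplaceError::CrossTenant { .. } => 403,
        }
    }
}

/// `targets` pairs each memory id from `replaces` with the tenant that owns it.
/// Returns the first offending id so the error message can name it.
fn check_same_tenant(caller_tenant: &str, targets: &[(&str, &str)]) -> Result<(), ReplaceError> {
    for (id, tenant) in targets {
        if *tenant != caller_tenant {
            return Err(ReplaceError::CrossTenant { id: (*id).to_string() });
        }
    }
    Ok(())
}
```

Running this check before any write keeps it inside the same validate-first phase as the other rejections.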

Plugin tool surface

  • memory_store grows replaces: [...] to mirror the API
  • New memory_supersede(superseded_by, ids) tool for the rare bare-op case (no new memory, just retire some)

What this would touch

  • domain/memory.rs — add superseded_at, possibly a MemoryState enum if not already
  • store/lancedb.rs: vector_search + fts_search get an include_superseded: bool filter; new method supersede_batch(new_id, ids)
  • api/handlers/memory.rs (or wherever POST /v1/memories lives) — accept replaces field, wrap creation + supersede in a single Lance transaction
  • retrieve/pipeline.rs — plumb the filter through SearchRequest
  • Plugin TS: memory_store schema gets replaces, new memory_supersede tool
  • Tests at each layer

Happy to do all of it. Wanted to surface the design first so we agree on shape before I sink time into a 250-line PR you'd send back for restructure.

Which direction on A1 vs A2? And anything I'm missing in the questions above?
