[Feature]: Expose a rollback action in apply_knowledge #492

@cjimti

Problem Statement

apply_knowledge records a changeset_id on every apply call and the response says the changeset is "recorded for rollback," but no action in the tool surface can actually perform the rollback. The action enum is bulk_review, review, synthesize, apply, approve, reject. There is no rollback / revert. An agent that makes a catalog change it needs to undo has no in-band way to do it and has to defer to an admin console action.

Motivation

Catalog writes through apply_knowledge are not always clean, and the tool's own response can overstate success. During a recent admin session:

  • An add_glossary_term apply against a dataset reported changes_applied: 2 when only one term actually persisted.
  • add_glossary_term turned out to have set / last-write-wins semantics on the glossaryTerms aspect rather than append, so the apply silently clobbered the dataset's existing canonical term.
  • Recovery was only possible by reading the entity back, reconstructing the prior state by hand, and re-applying it. The recorded changeset_id values were inert because nothing consumes them.

A rollback action would make the recorded changesets actually usable and close the loop on the "recorded for rollback" promise the tool already prints.

Proposed Solution

Add rollback to the apply_knowledge action enum.

  • Input: changeset_id (required). Optional: entity_urn, used to validate that the changeset belongs to the expected entity.
  • Effect: reverts the aspects mutated by that changeset to their pre-change state, using the before-image the changeset already captures.
  • Confirmation: require confirm: true, consistent with apply.
  • Idempotency / ordering: reject rollback of a changeset that has already been rolled back, and reject (or warn) when a later changeset has since mutated the same aspect, so a rollback cannot silently resurrect stale state on top of newer writes. Surface the conflict rather than guessing.
  • Insight lifecycle: if the changeset was produced by applying an insight, transition that insight from applied to rolled_back (the lifecycle docs already reference a rolled_back state, so this wires an existing state to a real transition).
  • Audit: record the rollback itself as its own auditable event with a reference back to the original changeset_id.
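
To make the proposed checks concrete, here is a minimal sketch of the rollback flow against an in-memory stand-in. Everything in it is an assumption for illustration: the `Store` and `Changeset` types, the `seq` ordering field, the `before` before-image layout, and the error messages are hypothetical and do not reflect the tool's actual storage model. A later changeset is treated as conflicting when it is still applied and its before-image covers one of the same aspects.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


class RollbackError(Exception):
    """Raised when a rollback request is rejected."""


@dataclass
class Changeset:
    changeset_id: str
    entity_urn: str
    before: dict[str, Any]        # aspect name -> value before the apply (None = absent)
    seq: int                      # hypothetical monotonically increasing apply order
    status: str = "applied"       # "applied" or "rolled_back"
    insight_id: Optional[str] = None


@dataclass
class Store:
    """Hypothetical in-memory stand-in for the catalog and the changeset log."""
    aspects: dict[tuple[str, str], Any] = field(default_factory=dict)
    changesets: dict[str, Changeset] = field(default_factory=dict)
    insights: dict[str, str] = field(default_factory=dict)
    audit: list[dict[str, Any]] = field(default_factory=list)


def rollback(store: Store, changeset_id: str, *, confirm: bool = False,
             entity_urn: Optional[str] = None) -> dict[str, Any]:
    if not confirm:
        raise RollbackError("rollback requires confirm: true")
    cs = store.changesets.get(changeset_id)
    if cs is None:
        raise RollbackError(f"unknown changeset {changeset_id}")
    if entity_urn is not None and entity_urn != cs.entity_urn:
        raise RollbackError(f"changeset {changeset_id} does not belong to {entity_urn}")
    if cs.status == "rolled_back":
        raise RollbackError(f"changeset {changeset_id} was already rolled back")

    # Surface conflicts instead of guessing: any later, still-applied changeset
    # on the same entity that touched one of the same aspects blocks the rollback.
    conflicts = sorted(
        other.changeset_id
        for other in store.changesets.values()
        if other.entity_urn == cs.entity_urn
        and other.seq > cs.seq
        and other.status == "applied"
        and set(other.before) & set(cs.before)
    )
    if conflicts:
        raise RollbackError(f"newer changesets touch the same aspects: {conflicts}")

    # Restore each aspect from the before-image; None means the aspect was absent.
    for aspect, value in cs.before.items():
        key = (cs.entity_urn, aspect)
        if value is None:
            store.aspects.pop(key, None)
        else:
            store.aspects[key] = value

    cs.status = "rolled_back"
    if cs.insight_id is not None:
        store.insights[cs.insight_id] = "rolled_back"
    # The rollback is its own audit event, pointing back at the original changeset.
    store.audit.append({"event": "rollback", "changeset_id": cs.changeset_id,
                        "entity_urn": cs.entity_urn})
    # Return the restored state so the caller can verify without a follow-up read.
    return {a: store.aspects.get((cs.entity_urn, a)) for a in cs.before}
```

This sketch rejects on conflict rather than warning. The proposal leaves that choice open; a warn-and-proceed variant would return the conflict list alongside the restored state instead of raising.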

Discovery surface

  • Add a way to list changesets for an entity (e.g. apply_knowledge action=list_changesets entity_urn=...) returning changeset_id, timestamp, actor, change types, and current applied / rolled_back status. Without this, an agent can only roll back a changeset whose id it happens to still hold from earlier in the same session.
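
As a rough illustration of what a listing action could return, the sketch below filters a changeset log by entity and returns the fields named above, newest first. The `ChangesetSummary` type, the field names, and the use of ISO 8601 UTC strings for timestamps are assumptions made for the example, not the tool's actual schema.

```python
from dataclasses import dataclass, asdict
from typing import Any


@dataclass(frozen=True)
class ChangesetSummary:
    """Hypothetical row shape for a list_changesets response."""
    changeset_id: str
    entity_urn: str
    timestamp: str                 # assumed ISO 8601 in UTC, so string order is time order
    actor: str
    change_types: tuple[str, ...]
    status: str                    # "applied" or "rolled_back"


def list_changesets(log: list[ChangesetSummary], entity_urn: str) -> list[dict[str, Any]]:
    """Return the changesets recorded for one entity, newest first."""
    matches = [c for c in log if c.entity_urn == entity_urn]
    matches.sort(key=lambda c: c.timestamp, reverse=True)
    return [asdict(c) for c in matches]
```

Returning the status with each row lets an agent skip changesets that are already rolled back before it attempts a rollback.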

Related hardening (separate but adjacent)

These came out of the same session and are worth tracking even if handled in their own issues:

  1. add_glossary_term is set, not add. Either make it genuinely additive (merge with existing terms) or rename it to reflect set semantics and document that the caller must pass the full intended term set. Today the verb is misleading and the failure mode is silent data loss.
  2. changes_applied can overcount. The returned count did not match what actually persisted. The count should reflect aspects that were verified written, or the response should include the resulting aspect state so callers can verify without a follow-up read.
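
The two hardening items can be illustrated with a small sketch, assuming glossary terms are represented as a list of term identifiers. `merge_terms` and `count_verified` are hypothetical helper names, not existing functions in the tool. The first shows additive semantics (existing terms survive), and the second counts only the requested terms that are present in the state read back after the write.

```python
def merge_terms(existing: list[str], new: list[str]) -> list[str]:
    """Additive merge: keep existing terms in order, append unseen new ones."""
    merged = list(existing)
    for term in new:
        if term not in merged:
            merged.append(term)
    return merged


def count_verified(requested: list[str], persisted: list[str]) -> int:
    """Count requested terms that are actually present in the persisted aspect."""
    return sum(1 for term in requested if term in persisted)


# Set semantics would replace ["Canonical"] with ["PII"]; the merge keeps both.
after_merge = merge_terms(["Canonical"], ["PII", "Canonical"])

# Two terms were requested, but only one shows up in the read-back state.
verified = count_verified(["PII", "Finance"], ["Canonical", "PII"])
```

Here `after_merge` is `["Canonical", "PII"]` and `verified` is `1`, which is the number a changes_applied field would report under the "verified written" interpretation.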

Acceptance criteria

  • apply_knowledge action=rollback changeset_id=... confirm=true reverts the changeset and returns the restored state.
  • Rolling back an already-rolled-back changeset is rejected with a clear message.
  • Rollback that would overwrite a newer change to the same aspect is blocked or clearly flagged, not applied silently.
  • Insights tied to a rolled-back changeset transition to rolled_back.
  • The rollback is itself audit-logged with a pointer to the original changeset.
  • Changesets for an entity are discoverable without prior knowledge of their ids.

Alternatives Considered

No response

Scope

MCP Tools

Additional Context

No response

Metadata

Labels: enhancement (New feature or request)
