iron-token-broker: a credential with missing 1Password items poisons the shared SDK client ("invalid client id"), breaking rotation for all credentials #176

@lyoungblood

Summary

iron-token-broker (built from cmd/iron-token-broker; image ironsh/iron-token-broker:0.0.1-rc.2) stops being able to read or rotate any credential once it is configured with one credential whose referenced 1Password items don't exist. The first 1Password operation succeeds, then every subsequent operation — for all credentials — fails with:

an internal error occurred: ... invalid client id

Removing the mis-provisioned credential restores the broker; re-adding it breaks it again. Confirmed empirically.

Impact

A single phantom / mis-provisioned credentials[] entry takes down token rotation for the whole broker, including healthy credentials in active use (the Items.Get / Items.Put write path in internal/broker/store/op.go). Since one broker is shared by multiple iron-proxy instances, this is a process-wide outage of refresh-token rotation.

Likely root cause

The read-side 1Password resolver keeps a single cached SDK client and reuses it for every credential's read-side fields (client_id, client_secret, token_endpoint_headers).

internal/transform/secrets/op_resolver.go:

// opClientCache lazily constructs a single SDK client. The 1P SDK loads a
// Wasm module on NewClient, so reusing the client across multiple secret
// entries is significantly cheaper than creating one per entry.
type opClientCache struct {
    mu     sync.Mutex
    client opClient
    ...
}

func (c *opClientCache) get(ctx context.Context) (opClient, error) {
    c.mu.Lock()
    defer c.mu.Unlock()
    if c.client != nil {
        return c.client, nil   // reused across every credential
    }
    ...
}

Resolution flows through opSDKClient.Resolve → client.Secrets().Resolve(ctx, ref).

When the broker resolves a credential whose item is absent (e.g. op://ai-agents/OPENAI_CODEX_CLIENT_ID/credential), that Secrets().Resolve call errors. After that error the cached client is left in a state where every later Resolve, including one for a healthy credential like CLAUDE_CODE_CLIENT_ID, returns ... invalid client id. The invalid client id wording suggests that the failed call invalidates the process-global Wasm core / client registry of the 1Password Go SDK (you maintain a fork: ironsh/onepassword-sdk-go). That would also explain why the per-handle store clients built in ensureClientLocked (internal/broker/store/op.go) are affected.

Reproduce

  1. Configure two credentials; for one, point client_id (and/or store) at 1Password items that don't exist (vault present, items absent).
  2. Start the broker; resolve / rotate tokens.
  3. The first op appears to succeed; every later op — including for the healthy credential — fails with an internal error occurred ... invalid client id.
  4. Remove the bad credential → broker healthy again. Re-add it → broken again.

Expected

A credential whose 1Password items can't be resolved should fail in isolation (that credential marked unauthenticated / errored) without invalidating the shared client used by every other credential.

Suggested fixes

  • Scope a resolve/Items failure to the offending credential: on detecting the invalid client id core state, discard and rebuild the cached client (opClientCache.client = nil) so healthy credentials recover, instead of reusing a poisoned client.
  • Investigate the underlying SDK core (ironsh/onepassword-sdk-go): a failed Secrets().Resolve / Items.* call should not invalidate the client handle for subsequent calls.
  • Optionally validate items at config load and surface a per-credential error while keeping the rest of the broker functional.

Context

Hit on a Paradigm Centaur deployment using ironsh/iron-token-broker:0.0.1-rc.2. Downstream mitigation that simply avoids handing the broker credentials for unprovisioned harnesses: paradigmxyz/centaur#398. Filing here because the robust fix is broker-/SDK-side.
