observability(ai): measure searchOtherThreads hit rate (then tune or disable) #206

@JeffOtano

Context

`convex/ai/coach.ts:86-93`:

```ts
searchOtherThreads: true,
searchOptions: {
  limit: 10,
  vectorSearch: true,
  textSearch: true,
  vectorScoreThreshold: 0.3,
  messageRange: { before: 2, after: 1 },
},
```

Up to 40 extra messages from OTHER threads on every call: 10 hits, each expanded by `messageRange` to the hit plus 2 messages before and 1 after. Vector embeddings are billed against `GOOGLE_GENERATIVE_AI_API_KEY` regardless of provider.
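To compare candidate settings before touching the config, here is a minimal sketch of the worst-case count. The helper name `worstCaseSearchMessages` is hypothetical, and it assumes no overlap or deduplication between expanded hits, so it is an upper bound only.

```ts
// Mirrors only the searchOptions fields that affect message count.
interface SearchOptions {
  limit: number;
  messageRange: { before: number; after: number };
}

// Hypothetical helper: each hit can bring `before` earlier and `after` later messages.
function worstCaseSearchMessages(opts: SearchOptions): number {
  const perHit = opts.messageRange.before + 1 + opts.messageRange.after;
  return opts.limit * perHit;
}

const currentCeiling = worstCaseSearchMessages({
  limit: 10,
  messageRange: { before: 2, after: 1 },
});
console.log(currentCeiling); // 40

// A tighter candidate: fewer hits, no surrounding messages.
const tighterCeiling = worstCaseSearchMessages({
  limit: 5,
  messageRange: { before: 0, after: 0 },
});
console.log(tighterCeiling); // 5
```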

We don't know the actual hit rate, so we can't tell whether this is paying for itself or just adding noise. Fix: instrument first, tune second.

Files

  • `convex/ai/coach.ts:171-198` — contextHandler (`args.search` gives the search results)
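A rough sketch of what the measurement inside contextHandler could look like. The types below are hypothetical stand-ins, not the agent library's real ones; the only thing taken from this issue is that `args.search` holds the search results. `measureSearch` and the `id` field are assumptions for illustration.

```ts
// Hypothetical minimal shapes; the real handler args carry more fields.
interface ContextMessage {
  id: string;
}
interface ContextArgs {
  search: ContextMessage[];
  recent: ContextMessage[];
}
interface SearchStats {
  searchHits: number;
  searchUsed: boolean;
}

// Compare the raw search results against the context that is actually returned.
function measureSearch(
  args: ContextArgs,
  finalContext: ContextMessage[],
): SearchStats {
  const finalIds = new Set(finalContext.map((m) => m.id));
  return {
    searchHits: args.search.length,
    // True only if at least one search result survived windowing.
    searchUsed: args.search.some((m) => finalIds.has(m.id)),
  };
}

const exampleArgs: ContextArgs = {
  search: [{ id: "s1" }, { id: "s2" }],
  recent: [{ id: "r1" }, { id: "r2" }, { id: "r3" }],
};
// Assumed windowing for the example: keep the last 3 messages, which drops both search results.
const windowed = [...exampleArgs.search, ...exampleArgs.recent].slice(-3);
const stats = measureSearch(exampleArgs, windowed);
console.log(stats); // { searchHits: 2, searchUsed: false }
```

Computing `searchUsed` from the returned context, rather than from `searchHits > 0`, is what separates "search returned something" from "search results actually reached the model".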

Acceptance

  • Log a `searchHits` count + a `searchUsed` boolean to `aiUsage` every call
  • Collect 1-2 weeks of data
  • Dashboard: % of calls where search returned anything, % where search results actually entered context after windowing
  • Decision: if <20% hit rate, tighten `vectorScoreThreshold` or disable for first N messages of a thread
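The dashboard numbers and the decision rule could be derived from the logged rows roughly as follows. This is a sketch under assumptions: `AiUsageRow` models only the two new fields, `summarizeSearchUsage` is a hypothetical name, and "hit rate" is read here as the share of calls where search returned anything. If the decision should instead key off the share that entered context, swap `hitRate` for `usedRate` in the threshold check.

```ts
// Only the two fields this issue proposes adding to aiUsage.
interface AiUsageRow {
  searchHits: number;
  searchUsed: boolean;
}
interface SearchReport {
  calls: number;
  hitRate: number; // share of calls where search returned anything
  usedRate: number; // share of calls where search results entered context
  belowThreshold: boolean;
}

function summarizeSearchUsage(
  rows: AiUsageRow[],
  threshold = 0.2,
): SearchReport {
  const calls = rows.length;
  const withHits = rows.filter((r) => r.searchHits > 0).length;
  const used = rows.filter((r) => r.searchUsed).length;
  const hitRate = calls === 0 ? 0 : withHits / calls;
  const usedRate = calls === 0 ? 0 : used / calls;
  // With no data there is nothing to decide, so never flag an empty sample.
  return { calls, hitRate, usedRate, belowThreshold: calls > 0 && hitRate < threshold };
}

const sample: AiUsageRow[] = [
  { searchHits: 3, searchUsed: true },
  { searchHits: 0, searchUsed: false },
  { searchHits: 0, searchUsed: false },
  { searchHits: 2, searchUsed: false },
];
const report = summarizeSearchUsage(sample);
console.log(report.hitRate, report.usedRate, report.belowThreshold); // 0.5 0.25 false
```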

Related

Part of the broader context-token audit. Pair with #11 (context management) findings.

Labels

  • `ai-agent`: Work on the AI coach agent system
  • `observability`: Telemetry, tracing, metrics
  • `phase: parking-lot`: Deferred until measurement or upstream work is ready
  • `priority: low`: Low priority work
