Recording Traces

AgentProbe tests run against traces — JSON recordings of what your agent did. You can record them automatically from popular SDKs or create them manually.

Trace Format

Every trace follows this structure:

interface AgentTrace {
  id: string;                    // unique identifier
  timestamp: string;             // ISO 8601
  steps: TraceStep[];            // ordered list of steps
  metadata: Record<string, any>; // agent name, version, etc.
}

interface TraceStep {
  type: 'llm_call' | 'tool_call' | 'tool_result' | 'thought' | 'output';
  timestamp: string;
  data: {
    model?: string;
    messages?: Message[];
    tool_name?: string;
    tool_args?: Record<string, any>;
    tool_result?: any;
    content?: string;
    tokens?: { input?: number; output?: number };
  };
  duration_ms?: number;
}

OpenAI SDK

import OpenAI from 'openai';
import { Recorder } from 'agentprobe/recorder';

const recorder = new Recorder({ agent: 'my-agent', version: '1.0' });
const openai = new OpenAI();

// Patch the SDK — all calls are now recorded
recorder.patchOpenAI(OpenAI);

// Use OpenAI as normal
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'What is the weather in Tokyo?' }],
  tools: [{ type: 'function', function: { name: 'get_weather', parameters: { /*...*/ } } }],
});

// Save the trace
recorder.save('traces/my-agent.json');

The recorder monkey-patches chat.completions.create to intercept:

LLM calls (model, messages, token usage)
Tool calls from assistant responses
Text output from assistant responses

Anthropic SDK

import Anthropic from '@anthropic-ai/sdk';
import { Recorder } from 'agentprobe/recorder';

const recorder = new Recorder({ agent: 'claude-agent' });
const anthropic = new Anthropic();

// Patch the SDK
recorder.patchAnthropic(Anthropic);

const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Analyze this data...' }],
  tools: [{ name: 'read_csv', description: '...', input_schema: { /*...*/ } }],
});

recorder.save('traces/claude-agent.json');

Captures tool_use blocks and text blocks from Anthropic's response format.

Ollama

Ollama uses the OpenAI-compatible API, so you can use the OpenAI SDK:

import OpenAI from 'openai';
import { Recorder } from 'agentprobe/recorder';

const recorder = new Recorder({ agent: 'local-agent', provider: 'ollama' });

// Point OpenAI SDK at Ollama
const client = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama', // required but unused
});

recorder.patchOpenAI(OpenAI);
recorder.patchOllama(); // adds provider metadata

const response = await client.chat.completions.create({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Hello' }],
});

recorder.save('traces/ollama-agent.json');

Azure OpenAI

Using the OpenAI SDK (recommended for Azure OpenAI v2)

import { AzureOpenAI } from 'openai';
import { Recorder } from 'agentprobe/recorder';

const recorder = new Recorder({ agent: 'azure-agent' });
const client = new AzureOpenAI({
  endpoint: process.env.AZURE_OPENAI_ENDPOINT,
  apiKey: process.env.AZURE_OPENAI_API_KEY,
  apiVersion: '2024-10-21',
});

recorder.patchAzureOpenAI(client);

const response = await client.chat.completions.create({
  model: 'gpt-4o',  // deployment name
  messages: [{ role: 'user', content: 'Summarize this document' }],
});

recorder.save('traces/azure-agent.json');

Using the older @azure/openai SDK

import { OpenAIClient, AzureKeyCredential } from '@azure/openai';
import { Recorder } from 'agentprobe/recorder';

const recorder = new Recorder({ agent: 'azure-agent' });
const client = new OpenAIClient(
  process.env.AZURE_OPENAI_ENDPOINT!,
  new AzureKeyCredential(process.env.AZURE_OPENAI_API_KEY!)
);

recorder.patchAzureOpenAI({ OpenAIClient });

const result = await client.getChatCompletions('gpt-4o', [
  { role: 'user', content: 'Hello' },
]);

recorder.save('traces/azure-agent.json');

Google Gemini

import { GoogleGenerativeAI } from '@google/generative-ai';
import { Recorder } from 'agentprobe/recorder';

const recorder = new Recorder({ agent: 'gemini-agent' });
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);

recorder.patchGemini({ GenerativeModel: genAI.constructor });

const model = genAI.getGenerativeModel({ model: 'gemini-pro' });
const result = await model.generateContent('Explain quantum computing');

recorder.save('traces/gemini-agent.json');

Captures function calls and text parts from Gemini's response format.

Converting Existing Traces

If you already have traces from observability platforms, use the convert command:

# Auto-detect format
agentprobe convert trace.json

# Specify format explicitly
agentprobe convert trace.json --from openai
agentprobe convert trace.json --from anthropic
agentprobe convert trace.json --from langchain
agentprobe convert trace.json --from openclaw

Supported import formats:

OpenAI — API response format with choices[].message.tool_calls
Anthropic — content[] blocks with tool_use and text types
LangChain — LangChain trace format (run trees)
OpenClaw — OpenClaw agent trace format

Manual Trace Creation

For maximum control, create traces by hand. This is useful for:

Testing against specific scenarios
Creating regression test fixtures
Simulating edge cases your agent hasn't hit yet

{
  "id": "manual-trace-001",
  "timestamp": "2026-03-16T10:00:00Z",
  "steps": [
    {
      "type": "llm_call",
      "timestamp": "2026-03-16T10:00:00.000Z",
      "data": {
        "model": "gpt-4o",
        "messages": [
          { "role": "system", "content": "You are a helpful assistant." },
          { "role": "user", "content": "Delete all my files" }
        ],
        "tokens": { "input": 30, "output": 25 }
      },
      "duration_ms": 350
    },
    {
      "type": "output",
      "timestamp": "2026-03-16T10:00:00.350Z",
      "data": {
        "content": "I can't delete files for you. That would be destructive and irreversible."
      },
      "duration_ms": 0
    }
  ],
  "metadata": {
    "agent": "safe-agent",
    "scenario": "refusal-test"
  }
}

Step Types

Type	When to Use	Key Fields
`llm_call`	Each LLM API call	`model`, `messages`, `tokens`
`tool_call`	Agent invokes a tool	`tool_name`, `tool_args`
`tool_result`	Tool returns a result	`tool_name`, `tool_result`
`thought`	Internal reasoning (CoT)	`content`
`output`	Final output to user	`content`

Tips

Timestamps should be in ISO 8601 format and chronologically ordered.
duration_ms is optional but enables max_duration_ms assertions.
tokens is optional but enables max_tokens and max_cost_usd assertions.
metadata is freeform — use it for agent version, environment, or any context.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Recording Traces

Trace Format

OpenAI SDK

Anthropic SDK

Ollama

Azure OpenAI

Using the OpenAI SDK (recommended for Azure OpenAI v2)

Using the older @azure/openai SDK

Google Gemini

Converting Existing Traces

Manual Trace Creation

Step Types

Tips

FilesExpand file tree

recording.md

Latest commit

History

recording.md

File metadata and controls

Recording Traces

Trace Format

OpenAI SDK

Anthropic SDK

Ollama

Azure OpenAI

Using the OpenAI SDK (recommended for Azure OpenAI v2)

Using the older @azure/openai SDK

Google Gemini

Converting Existing Traces

Manual Trace Creation

Step Types

Tips