Merged
17 changes: 13 additions & 4 deletions AGENTS.md
Original file line number Diff line number Diff line change
@@ -219,16 +219,18 @@ This ensures the agent can write to `/tmp` for tax document processing without r

### Environment Variables

**Agent Server (`packages/tax-processing/.env`):**
**Agent Server (`packages/tax-processing/.env`) - Local Development Only:**
```env
PORT=3001
FRONTEND_URL=http://localhost:3000
OPENAI_API_KEY=your-api-key-here

# Optional: Weights & Biases Weave for LLM tracing
# Optional: Weights & Biases Weave for LLM tracing (local dev only)
WANDB_API_KEY=your-wandb-api-key-here
```

**Note**: On Runloop devboxes, API keys must be configured as Runloop secrets, not local environment variables. A secret is injected into a devbox only when the devbox creation call lists it under `secrets`: `OPENAI_API_KEY` is always listed, and `WANDB_API_KEY` is listed wherever tracing is enabled.

**Frontend (`packages/frontend/.env.local`):**
```env
NEXT_PUBLIC_AGENT_URL=http://localhost:3001
@@ -244,7 +246,14 @@ GITHUB_TOKEN=your-token-here # Required for private repos

The tax agent integrates with [Weights & Biases Weave](https://docs.wandb.ai/weave/) for comprehensive LLM call tracing and monitoring.

**Setup:**
**Setup for Runloop Devboxes:**
1. Get your W&B API key from https://wandb.ai/authorize
2. Add `WANDB_API_KEY` to Runloop secrets:
- Option A: Use the Runloop Settings page at https://platform.runloop.ai/settings
- Option B: Run `pnpm step1_runloop_setup` - it will prompt you to add the secret
3. When the agent runs on a devbox, Weave initializes automatically if the secret is present

**Setup for Local Development:**
1. Get your W&B API key from https://wandb.ai/authorize
2. Add `WANDB_API_KEY=your-key` to `packages/tax-processing/.env`
3. Start the agent server - Weave initializes automatically
@@ -262,7 +271,7 @@ The tax agent integrates with [Weights & Biases Weave](https://docs.wandb.ai/wea
- Explore traces, latency distributions, and token usage

**Configuration:**
The CodexService automatically initializes Weave if `WANDB_API_KEY` is present. Look for:
The CodexService automatically initializes Weave if `WANDB_API_KEY` is present in the environment (from Runloop secrets on devboxes, or from local `.env` file). Look for:
- `[CodexService] Weave tracing initialized successfully` - Weave is active
- `[CodexService] Weave tracing disabled: WANDB_API_KEY not set` - Running without Weave
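
The gating described above can be sketched as a small pure function. This is a minimal illustration, not the actual `CodexService` code: the `weaveStatus` helper, the `Env` and `WeaveStatus` types, and the choice to treat a whitespace-only value as unset are all assumptions. Only the two log lines are taken from the text.

```typescript
// Hypothetical sketch: decide whether Weave tracing should start, based only
// on whether WANDB_API_KEY is present in the environment.
type Env = Record<string, string | undefined>;

interface WeaveStatus {
  enabled: boolean;
  logLine: string;
}

function weaveStatus(env: Env): WeaveStatus {
  // Assumption: a blank or whitespace-only value counts as "not set".
  const key = env.WANDB_API_KEY?.trim();
  if (key) {
    return {
      enabled: true,
      logLine: '[CodexService] Weave tracing initialized successfully',
    };
  }
  return {
    enabled: false,
    logLine: '[CodexService] Weave tracing disabled: WANDB_API_KEY not set',
  };
}

// The same check applies whether the variable came from a Runloop secret
// or from a local .env file.
const withKey = weaveStatus({ WANDB_API_KEY: 'example-key' });
const withoutKey = weaveStatus({});
```

Taking the environment as a parameter rather than reading `process.env` directly keeps the check easy to test in isolation.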

41 changes: 25 additions & 16 deletions README.md
@@ -64,7 +64,9 @@ best. We can perform quick experiments to measure performance after making chang

c. Create a Secret for the OpenAI key you just generated on the [settings](https://platform.runloop.ai/settings) page. Name the secret `OPENAI_API_KEY` and paste the key value from the OpenAI site.

d. Now configure your environment:
d. (Optional) Create a Secret for your W&B API key to enable Weave LLM tracing on Runloop devboxes. Get the key from [https://wandb.ai/authorize](https://wandb.ai/authorize), then create a secret named `WANDB_API_KEY` on the [settings](https://platform.runloop.ai/settings) page. Note: the `pnpm step1_runloop_setup` script also prompts you for this key.

e. Now configure your environment:

```bash
export RUNLOOP_API_KEY=<your_runloop_api_key_here>
@@ -73,7 +75,7 @@ best. We can perform quick experiments to measure performance after making chang

Open the `.env` file and update values where prompted.

e. Launch the environment to test the setup:
f. Launch the environment to test the setup:

```bash
pnpm dev
@@ -85,27 +87,34 @@ best. We can perform quick experiments to measure performance after making chang

3. **(Optional) Enable Weave Tracing**

To track and monitor LLM calls with Weights & Biases Weave:

a. Get your W&B API key from [https://wandb.ai/authorize](https://wandb.ai/authorize)

b. Add the API key to `packages/tax-processing/.env`:
The tax agent integrates with [Weights & Biases Weave](https://docs.wandb.ai/weave/) for comprehensive LLM call tracing and monitoring.

```bash
WANDB_API_KEY=your-wandb-api-key-here
```
**For Local Development:**
- Get your W&B API key from [https://wandb.ai/authorize](https://wandb.ai/authorize)
- Add the API key to `packages/tax-processing/.env`:
```bash
WANDB_API_KEY=your-wandb-api-key-here
```
- When the agent server starts locally, you'll see:
- Success: `[CodexService] Weave tracing initialized successfully`
- Disabled: `[CodexService] Weave tracing disabled: WANDB_API_KEY not set`

c. When the agent server starts, you'll see:
- Success: `[CodexService] Weave tracing initialized successfully`
- Disabled: `[CodexService] Weave tracing disabled: WANDB_API_KEY not set`
**For Runloop Devboxes:**
- The `WANDB_API_KEY` must be configured as a Runloop secret (see step 2d above)
- The `pnpm step1_runloop_setup` script will prompt you to add this secret
- Weave automatically initializes when the agent runs on a devbox if the secret is present

d. View traces in your Weave dashboard at [https://wandb.ai/](https://wandb.ai/)
**Viewing Traces:**
- Navigate to [https://wandb.ai/](https://wandb.ai/)
- Select project: `tax-preparation-agent`
- Explore traces, latency distributions, and token usage

Weave automatically captures:
- All OpenAI API calls made through the Codex SDK
- Input prompts and output responses
- Token usage and latency metrics
- Complete input prompts and output responses
- Token usage, latency, and cost metrics
- Error traces and debugging information
- Agent tool usage (taxctl commands)

**Note**: Weave is completely optional. The system works without it.

8 changes: 6 additions & 2 deletions packages/frontend/src/lib/tax-processing-service.ts
@@ -84,8 +84,12 @@ export class TaxProcessingService {
CODEX_SKIP_GIT_REPO_CHECK: 'true',
RUNLOOP_DEVBOX: '1',
},
// wire in the OpenAI key from the Runloop secret store
secrets: { OPENAI_API_KEY: 'OPENAI_API_KEY' },
secrets: {
// wire in the OpenAI key from the Runloop secret store
OPENAI_API_KEY: 'OPENAI_API_KEY',
// optionally wire in the WANDB key from the Runloop secret store
// WANDB_API_KEY: 'WANDB_API_KEY',
},
});

logger.log(`Devbox created: ${this.devbox.id}`);
1 change: 1 addition & 0 deletions packages/scripts/src/harness-lib-async.ts
@@ -297,6 +297,7 @@ export async function startBenchmarkRun(
],
secrets: {
OPENAI_API_KEY: 'OPENAI_API_KEY',
WANDB_API_KEY: 'WANDB_API_KEY',
},
} as any,
});
1 change: 1 addition & 0 deletions packages/scripts/src/harness-lib.ts
@@ -119,6 +119,7 @@ async function startScenarioRun(
],
secrets: {
OPENAI_API_KEY: 'OPENAI_API_KEY',
WANDB_API_KEY: 'WANDB_API_KEY',
},
} as any,
});
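
The `secrets` objects in these hunks appear to map an environment variable name inside the devbox (the key) to the name of a secret in the Runloop secret store (the value). Some call sites always include `WANDB_API_KEY`, while others leave it commented out. A minimal sketch of making that choice explicit follows; the `buildSecrets` helper and its `enableTracing` option are hypothetical and are not part of this PR or of the Runloop SDK.

```typescript
// Hypothetical helper: build the secrets map passed at devbox creation.
// Keys are env var names in the devbox; values are Runloop secret names.
type SecretMap = Record<string, string>;

function buildSecrets(options: { enableTracing: boolean }): SecretMap {
  // The OpenAI key is always wired in.
  const secrets: SecretMap = { OPENAI_API_KEY: 'OPENAI_API_KEY' };
  if (options.enableTracing) {
    // Only reference the W&B secret when tracing is wanted.
    secrets.WANDB_API_KEY = 'WANDB_API_KEY';
  }
  return secrets;
}

const benchmarkSecrets = buildSecrets({ enableTracing: true });
const interactiveSecrets = buildSecrets({ enableTracing: false });
```

A single helper like this would replace the commented-out lines with one switch per call site. Whether referencing a secret that has not been created causes devbox creation to fail is not stated in this diff, so it is worth verifying before enabling tracing everywhere.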
77 changes: 77 additions & 0 deletions packages/scripts/src/step1-runloop-setup.ts
@@ -99,6 +99,78 @@ async function installOpenaiKey(runloop: RunloopSDK): Promise<void> {
}
}

/**
* Install the W&B API key in the Runloop secret store (optional).
* This enables Weave LLM tracing on Runloop devboxes.
*/
async function installWandbKey(runloop: RunloopSDK): Promise<void> {
try {
// Check if secret already exists
const secretsList = await runloop.api.secrets.list();
const existingSecret = secretsList.secrets.find(
(s) => s.name === 'WANDB_API_KEY'
);

if (existingSecret) {
console.log(
' ✓ WANDB_API_KEY already exists in Runloop secret store (skipping)'
);
return;
}

// Get default value from environment
const defaultValue = process.env.WANDB_API_KEY;

// Construct prompt message
let promptMessage =
'🔑 WANDB_API_KEY not found in Runloop secret store.\n Optional: Provide your W&B API key for Weave LLM tracing';
promptMessage += '\n (Get your key from https://wandb.ai/authorize)';

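// Show only one Enter hint: with an env default, Enter uses it; without one, Enter skips
if (defaultValue) {
const last4 = defaultValue.slice(-4);
promptMessage += `\n (press Enter to use env var: ...${last4}): `;
} else {
promptMessage += '\n (press Enter to skip): ';
}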

// Prompt user interactively
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
});

const userInput = await new Promise<string>((resolve) => {
rl.question(promptMessage, (answer) => {
rl.close();
resolve(answer);
});
});

// Determine final value
const finalValue = userInput.trim() || defaultValue;

if (!finalValue) {
console.log(
' ⏭️ Skipping WANDB_API_KEY setup (Weave tracing will be disabled)'
);
return;
}

// Create secret in Runloop
await runloop.api.secrets.create({
name: 'WANDB_API_KEY',
value: finalValue,
});

console.log(' ✓ WANDB_API_KEY installed in Runloop secret store');
} catch (error) {
console.error(
`⚠️ Warning: Could not manage WANDB_API_KEY secret: ${error instanceof Error ? error.message : error}`
);
console.log(' Continuing without Weave tracing...');
// Don't exit - this is optional
}
}
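
The value-resolution step inside `installWandbKey` can be pulled out as pure functions, which makes the precedence easier to see: typed input wins, then the environment default, otherwise the step is skipped. The sketch below is illustrative only; `resolveKey`, `maskKey`, and the `KeyDecision` type are hypothetical names, not code from this PR.

```typescript
// Hypothetical sketch of the precedence used when choosing the secret value.
type KeyDecision =
  | { action: 'create'; value: string; source: 'typed' | 'env' }
  | { action: 'skip' };

function resolveKey(userInput: string, envDefault?: string): KeyDecision {
  const typed = userInput.trim();
  if (typed) {
    return { action: 'create', value: typed, source: 'typed' };
  }
  if (envDefault) {
    // Pressing Enter with an env default set uses the default.
    return { action: 'create', value: envDefault, source: 'env' };
  }
  // No input and no default: leave tracing disabled.
  return { action: 'skip' };
}

// Mirror the prompt's masking: show only the last four characters.
function maskKey(key: string): string {
  return `...${key.slice(-4)}`;
}

const fromEnv = resolveKey('   ', 'abcd1234');
const skipped = resolveKey('', undefined);
const masked = maskKey('abcd1234');
```

One consequence of this precedence is that a user with `WANDB_API_KEY` set locally cannot skip the step by pressing Enter, so the prompt text should not offer both options at once.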

async function main() {
console.log('🚀 Starting Step 1 Runloop Setup...\n');

@@ -145,6 +217,11 @@ async function main() {
await installOpenaiKey(runloop);
console.log('');

// Install W&B API key in secret store (optional)
console.log('🔑 Checking for WANDB_API_KEY in Runloop secret store...\n');
await installWandbKey(runloop);
console.log('');

// Check if resources already exist on Runloop
console.log('🔍 Checking for existing resources on Runloop...\n');

6 changes: 5 additions & 1 deletion packages/scripts/src/step1-start-devbox.ts
@@ -71,7 +71,11 @@ async function main() {
CODEX_SKIP_GIT_REPO_CHECK: 'true',
RUNLOOP_DEVBOX: '1', // Tell the agent it's running on a devbox
},
secrets: { OPENAI_API_KEY: 'OPENAI_API_KEY' },
secrets: {
OPENAI_API_KEY: 'OPENAI_API_KEY',
// optionally wire in the WANDB key from the Runloop secret store
// WANDB_API_KEY: 'WANDB_API_KEY',
},
launch_parameters: {
keep_alive_time_seconds: 86400,
// after_idle: { idle_time_seconds: 3600, on_idle: "suspend" } // exclusive with keep_alive_time_seconds
6 changes: 5 additions & 1 deletion packages/scripts/src/step1-start-prebuild-devbox.ts
@@ -70,7 +70,11 @@ async function main() {
CODEX_SKIP_GIT_REPO_CHECK: 'true',
RUNLOOP_DEVBOX: '1', // Tell the agent it's running on a devbox
},
secrets: { OPENAI_API_KEY: 'OPENAI_API_KEY' },
secrets: {
OPENAI_API_KEY: 'OPENAI_API_KEY',
// optionally wire in the WANDB key from the Runloop secret store
// WANDB_API_KEY: 'WANDB_API_KEY',
},
launch_parameters: {
keep_alive_time_seconds: 86400,
},
6 changes: 5 additions & 1 deletion packages/scripts/src/step3-generate-expected-outputs.ts
@@ -49,7 +49,11 @@ async function generateExpectedOutput(
CODEX_SKIP_GIT_REPO_CHECK: 'true',
RUNLOOP_DEVBOX: '1',
},
secrets: { OPENAI_API_KEY: 'OPENAI_API_KEY' },
secrets: {
OPENAI_API_KEY: 'OPENAI_API_KEY',
// By step 3, tracing is beneficial so we'll require the WANDB key
WANDB_API_KEY: 'WANDB_API_KEY',
},
});

try {