This repository was archived by the owner on Mar 25, 2026. It is now read-only.
 For TypeScript, create a \`scenario.config.mjs\` file:
 \`\`\`typescript
 // scenario.config.mjs
-import { defineConfig } from "@langwatch/scenario/config";
+import { defineConfig } from "@langwatch/scenario/integrations/vitest/config";
 import { openai } from "@ai-sdk/openai";
 
 export default defineConfig({
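The hunk above shows only the changed import and the opening of the config file. For illustration, a minimal sketch of how the file might be completed — the `defaultModel` key and the model choice are assumptions for this sketch, not confirmed by this diff; check the @langwatch/scenario documentation for the actual config shape:

```typescript
// scenario.config.mjs — hypothetical completion of the fragment above.
// The `defaultModel` key is an assumption, not shown in this diff.
import { defineConfig } from "@langwatch/scenario/integrations/vitest/config";
import { openai } from "@ai-sdk/openai";

export default defineConfig({
  defaultModel: {
    model: openai("gpt-4o"), // assumed: model used by the simulated user/judge
  },
});
```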
@@ -2479,78 +2433,11 @@ The MCP must be configured with your LangWatch API key.
 - Do NOT use \`fetch_scenario_docs\` for SDK documentation — that's for code-based testing
 - Write criteria as natural language descriptions, not regex patterns
 - Create focused scenarios — each should test one specific behavior
-- Always call \`discover_schema\` first to understand the scenario format
-`,
-
-platform_analytics: `You are helping me analyze my AI agent's performance using LangWatch.
+- Always call \`discover_schema\` first to understand the scenario format`,
 
-IMPORTANT: You will need my LangWatch API key. Ask me for it and direct me to https://app.langwatch.ai/authorize if I don't have one.
+recipe_debug_instrumentation: `Debug Your LangWatch Instrumentation
 
-## Setup
-
-Install the LangWatch MCP server:
-claude mcp add langwatch -- npx -y @langwatch/mcp-server --apiKey <API_KEY>
-
-## What to do
-
-1. Call discover_schema with category "all" to learn available metrics
-2. Call get_analytics to query:
-   - Total LLM cost (last 7 days)
-   - P95 latency trends
-   - Token usage over time
-   - Error rates
-3. Use search_traces to find traces with errors or high latency
-4. Present the findings clearly with key numbers and anomalies`,
-
-platform_scenarios: `You are helping me create scenario tests for my AI agent on the LangWatch platform.
-
-IMPORTANT: You will need my LangWatch API key. Ask me for it and direct me to https://app.langwatch.ai/authorize if I don't have one.
-
-## Setup
-
-Install the LangWatch MCP server:
-claude mcp add langwatch -- npx -y @langwatch/mcp-server --apiKey <API_KEY>
-
-## What to do
-
-1. Call discover_schema with category "scenarios" to understand the format
-2. Create scenarios using platform_create_scenario for:
-   - Happy path: normal, expected interactions
-   - Edge cases: unusual inputs, unclear requests
-   - Error handling: when things go wrong
-
-For each scenario, define:
-- name: A descriptive name for the test case
-- situation: The context and user behavior to simulate
-- criteria: What the agent should do (list of success criteria)
-- labels: Tags for organization (optional)
-
-3. Use platform_list_scenarios to review all scenarios
-4. Use platform_update_scenario to refine them
-
-Write criteria as natural language descriptions, not regex patterns.
-Each scenario should test one specific behavior.`,
-
-platform_evaluators: `You are helping me set up evaluators for my AI agent on the LangWatch platform.
-
-IMPORTANT: You will need my LangWatch API key. Ask me for it and direct me to https://app.langwatch.ai/authorize if I don't have one.
-
-## Setup
-
-Install the LangWatch MCP server:
-claude mcp add langwatch -- npx -y @langwatch/mcp-server --apiKey <API_KEY>
-
-## What to do
-
-1. Call discover_schema with category "evaluators" to see available types
-2. Use platform_list_evaluators to see existing evaluators
-3. Create evaluators using platform_create_evaluator:
-   - LLM-as-judge evaluators for quality assessment
-   - Specific evaluator types matching your use case
-4. Use platform_get_evaluator and platform_update_evaluator to review and refine
-5. Then go to https://app.langwatch.ai to set up monitors using these evaluators`,
-
-recipe_debug_instrumentation: `You are using LangWatch for your AI agent project. Follow these instructions.
+You are using LangWatch for your AI agent project. Follow these instructions.
 
 IMPORTANT: You will need a LangWatch API key. Check if LANGWATCH_API_KEY is already in the project's .env file. If not, ask the user for it — they can get one at https://app.langwatch.ai/authorize. If they have a LANGWATCH_ENDPOINT in .env, they are on a self-hosted instance — use that endpoint instead of app.langwatch.ai.
 First, try to install the LangWatch MCP server for access to documentation and platform tools. If installation fails, you can fetch docs directly via the URLs provided below.
@@ -2608,10 +2495,11 @@ After fixes, compare before/after:
 | Spans not connected to traces | Missing \`@langwatch.trace()\` on entry function | Add trace decorator to the main function |
 | No labels on traces | Labels not set in trace metadata | Add \`metadata={"labels": ["feature"]}\` to trace update |
 | Missing user_id | User ID not passed to trace | Add \`user_id\` to trace metadata |
-| Traces from different calls merged | Missing \`langwatch.setup()\` or trace context not propagated | Ensure \`langwatch.setup()\` called at startup |
-`,
+| Traces from different calls merged | Missing \`langwatch.setup()\` or trace context not propagated | Ensure \`langwatch.setup()\` called at startup |`,
 
-recipe_improve_setup: `You are using LangWatch for your AI agent project. Follow these instructions.
+recipe_improve_setup: `Improve Your LangWatch Setup
+
+You are using LangWatch for your AI agent project. Follow these instructions.
 
 IMPORTANT: You will need a LangWatch API key. Check if LANGWATCH_API_KEY is already in the project's .env file. If not, ask the user for it — they can get one at https://app.langwatch.ai/authorize. If they have a LANGWATCH_ENDPOINT in .env, they are on a self-hosted instance — use that endpoint instead of app.langwatch.ai.
 First, try to install the LangWatch MCP server for access to documentation and platform tools. If installation fails, you can fetch docs directly via the URLs provided below.
@@ -2680,10 +2568,11 @@ After each improvement:
 - Do NOT skip the audit — you can't suggest improvements without understanding the current state
 - Do NOT give generic advice — every suggestion must be specific to this codebase
 - Do NOT overwhelm with 10 suggestions — pick the top 2-3
-- Do NOT skip running/verifying improvements
-`,
+- Do NOT skip running/verifying improvements`,
+
+recipe_evaluate_multimodal: `Evaluate Your Multimodal Agent
 
-recipe_evaluate_multimodal: `You are using LangWatch for your AI agent project. Follow these instructions.
+You are using LangWatch for your AI agent project. Follow these instructions.
 
 IMPORTANT: You will need a LangWatch API key. Check if LANGWATCH_API_KEY is already in the project's .env file. If not, ask the user for it — they can get one at https://app.langwatch.ai/authorize. If they have a LANGWATCH_ENDPOINT in .env, they are on a self-hosted instance — use that endpoint instead of app.langwatch.ai.
 First, try to install the LangWatch MCP server for access to documentation and platform tools. If installation fails, you can fetch docs directly via the URLs provided below.
@@ -2773,10 +2662,11 @@ Run the evaluation, review results, fix issues, re-run until quality is acceptab
 - Do NOT evaluate multimodal agents with text-only metrics — use image-aware judges
 - Do NOT skip testing with real file formats — synthetic descriptions aren't enough
 - Do NOT forget to handle file loading errors in evaluations
-- Do NOT use generic test images — use domain-specific ones matching the agent's purpose
-`,
+- Do NOT use generic test images — use domain-specific ones matching the agent's purpose`,
 
-recipe_generate_rag_dataset: `You are using LangWatch for your AI agent project. Follow these instructions.
+recipe_generate_rag_dataset: `Generate a RAG Evaluation Dataset
+
+You are using LangWatch for your AI agent project. Follow these instructions.
 
 IMPORTANT: You will need a LangWatch API key. Check if LANGWATCH_API_KEY is already in the project's .env file. If not, ask the user for it — they can get one at https://app.langwatch.ai/authorize. If they have a LANGWATCH_ENDPOINT in .env, they are on a self-hosted instance — use that endpoint instead of app.langwatch.ai.
 First, try to install the LangWatch MCP server for access to documentation and platform tools. If installation fails, you can fetch docs directly via the URLs provided below.
@@ -2876,10 +2766,11 @@ Before using the dataset:
 - Do NOT skip negative cases — testing "I don't know" is crucial for RAG
 - Do NOT use the same question pattern for every entry — diversify types
 - Do NOT forget to include the relevant context per row
-- Do NOT generate expected outputs that aren't actually in the knowledge base
-`,
+- Do NOT generate expected outputs that aren't actually in the knowledge base`,
 
-recipe_test_compliance: `You are using LangWatch for your AI agent project. Follow these instructions.
+recipe_test_compliance: `Test Your Agent's Compliance Boundaries
+
+You are using LangWatch for your AI agent project. Follow these instructions.
 
 IMPORTANT: You will need a LangWatch API key. Check if LANGWATCH_API_KEY is already in the project's .env file. If not, ask the user for it — they can get one at https://app.langwatch.ai/authorize. If they have a LANGWATCH_ENDPOINT in .env, they are on a self-hosted instance — use that endpoint instead of app.langwatch.ai.
 First, try to install the LangWatch MCP server for access to documentation and platform tools. If installation fails, you can fetch docs directly via the URLs provided below.
@@ -3012,10 +2903,11 @@ Create reusable criteria for your domain:
 - Do NOT only test with polite, straightforward questions — adversarial probing is essential
 - Do NOT skip multi-turn escalation scenarios — single-turn tests miss persistence attacks
 - Do NOT use weak criteria like "agent is helpful" — be specific about what it must NOT do
-- Do NOT forget to test the "empathetic but firm" response — the agent should show care while maintaining boundaries
-`,
+- Do NOT forget to test the "empathetic but firm" response — the agent should show care while maintaining boundaries`,
 
-recipe_test_cli_usability: `You are using LangWatch for your AI agent project. Follow these instructions.
+recipe_test_cli_usability: `Test Your CLI's Agent Usability
+
+You are using LangWatch for your AI agent project. Follow these instructions.
 
 IMPORTANT: You will need a LangWatch API key. Check if LANGWATCH_API_KEY is already in the project's .env file. If not, ask the user for it — they can get one at https://app.langwatch.ai/authorize. If they have a LANGWATCH_ENDPOINT in .env, they are on a self-hosted instance — use that endpoint instead of app.langwatch.ai.
 First, try to install the LangWatch MCP server for access to documentation and platform tools. If installation fails, you can fetch docs directly via the URLs provided below.
@@ -3110,7 +3002,6 @@ Write scenarios where the agent makes a mistake and must recover:
 - Do NOT output errors without actionable guidance (the agent needs to know how to fix it)
 - DO make \`--help\` comprehensive on every subcommand
 - DO use non-zero exit codes for failures (agents check exit codes)
-- DO output structured information (the agent can parse it)
-`,
+- DO output structured information (the agent can parse it)`,
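The CLI-usability rules in this last hunk (structured output, non-zero exit codes, actionable errors) can be sketched in a few lines. This is a hypothetical standalone example — the `runDeploy` command, its `--env` flag, and the JSON shape are invented for illustration and are not part of LangWatch:

```typescript
// Hypothetical CLI handler illustrating agent-friendly conventions:
// structured output an agent can parse, non-zero exit codes on failure,
// and errors that say how to fix the problem.
type CliResult = { exitCode: number; output: string };

function runDeploy(args: string[]): CliResult {
  if (!args.includes("--env")) {
    return {
      exitCode: 1, // non-zero so agents can detect the failure
      output: JSON.stringify({
        // Actionable error: states what is wrong AND how to fix it.
        error: "missing required flag --env",
        hint: "re-run with --env staging or --env production",
      }),
    };
  }
  return { exitCode: 0, output: JSON.stringify({ status: "deployed" }) };
}
```

An agent driving this CLI can branch on `exitCode` and `JSON.parse` the output instead of scraping free-form text.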