These examples demonstrate how to use LaunchDarkly's judge functionality to evaluate AI responses for accuracy, relevance, and other metrics.
- Python 3.10 or higher
- Poetry installed
- A LaunchDarkly account with an AI Config created for chat functionality
- A Judge Config created for evaluation
- API keys for the provider you want to use (OpenAI, Bedrock, or Gemini)
- Create a `.env` file in this directory with the following variables:

  ```
  LAUNCHDARKLY_SDK_KEY=your-launchdarkly-sdk-key
  LAUNCHDARKLY_AI_CONFIG_KEY=sample-ai-config
  LAUNCHDARKLY_AI_JUDGE_KEY=sample-ai-judge-accuracy
  ```

  `LAUNCHDARKLY_AI_CONFIG_KEY` defaults to `sample-ai-config` if not set. `LAUNCHDARKLY_AI_JUDGE_KEY` defaults to `sample-ai-judge-accuracy` if not set.

  Add the API key for your chosen provider:

  ```
  OPENAI_API_KEY=your-openai-api-key
  ```

- Install the required dependencies:

  ```
  poetry install
  ```
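The default fallbacks described above can be expressed with `os.environ.get`. This is a minimal sketch of how an example script might read its configuration; the variable names come from this README, but the reading logic itself is illustrative rather than the examples' actual code:

```python
import os

# Defaults mirror the README: these two keys fall back to the
# sample values when they are not set in the environment.
ai_config_key = os.environ.get("LAUNCHDARKLY_AI_CONFIG_KEY", "sample-ai-config")
judge_key = os.environ.get("LAUNCHDARKLY_AI_JUDGE_KEY", "sample-ai-judge-accuracy")

# The SDK key has no default; the examples cannot run without it.
sdk_key_present = "LAUNCHDARKLY_SDK_KEY" in os.environ

print(ai_config_key, judge_key, sdk_key_present)
```

With an empty environment this prints the two sample defaults, which is why the examples run without setting `LAUNCHDARKLY_AI_CONFIG_KEY` or `LAUNCHDARKLY_AI_JUDGE_KEY` explicitly.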
- Chat judge example: uses the chat functionality, which automatically evaluates responses with any judges defined in the AI Config.

  ```
  poetry run chat-judge-example
  ```

- Direct judge example: evaluates specific input/output pairs using a judge configuration directly.

  ```
  poetry run direct-judge-example
  ```
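Conceptually, a judge evaluation maps an input/output pair to one or more named metric scores (accuracy, relevance, and so on). The sketch below illustrates that shape only; it is not the LaunchDarkly SDK API, and every name in it is hypothetical. A real judge would prompt an LLM with the pair and parse structured scores from its response:

```python
from dataclasses import dataclass

@dataclass
class JudgeResult:
    """Hypothetical container for one judge metric (not the SDK's type)."""
    metric: str    # e.g. "accuracy" or "relevance"
    score: float   # normalized to the range [0.0, 1.0]
    reasoning: str

def evaluate_pair(user_input: str, ai_output: str) -> list[JudgeResult]:
    """Stand-in for a judge call, using a trivial word-overlap score
    so the sketch runs offline without any LLM or API key."""
    shared = set(user_input.lower().split()) & set(ai_output.lower().split())
    score = min(1.0, len(shared) / max(1, len(user_input.split())))
    return [JudgeResult("relevance", score, f"{len(shared)} shared terms")]

results = evaluate_pair(
    "What is the capital of France?",
    "The capital of France is Paris.",
)
print(results[0].metric, round(results[0].score, 2))  # → relevance 0.67
```

The direct-judge example follows this same pattern at a high level: you hand it an input/output pair, and the judge configuration determines which metrics come back and how they are scored.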