# UNICEF Technical Documentation Agent

The UNICEF Technical Documentation Agent is a FastAPI-based intelligent middleware that orchestrates communication between the frontend interface and multiple Model Context Protocol (MCP) servers. It serves as the central hub for processing natural language queries and coordinating responses from various data sources.
## Key Features

- LLM Integration: Configurable language model via LiteLLM
- MCP Orchestration: Dynamic tool discovery and execution across data sources
- Authentication: JWT-based security with user management
- Streaming Responses: Real-time response delivery with tool-call visibility
- Observability: Langfuse integration for monitoring and feedback
## Technology Stack

- FastAPI: Modern Python web framework for API development
- LlamaIndex: LLM application framework for workflow orchestration
- LiteLLM: Unified interface for multiple LLM providers
- Langfuse: LLM observability and analytics platform
## Project Structure

```
agent/
├── agent.py            # Core agent workflow and LLM setup
├── server.py           # FastAPI application and endpoints
├── handlers.py         # Request processing and response streaming
├── auth.py             # Authentication and user management
├── initialize.py       # MCP client setup and tool initialization
├── config.py           # Configuration loading and validation
├── schemas.py          # Pydantic models and type definitions
├── prompts.yaml        # System prompts and instructions
├── config.yaml         # Application configuration
└── logging_config.py   # Logging setup and configuration
```
## Prerequisites

- Python 3.11 or higher
- uv for dependency management
- Access to the OpenAI API or a compatible LLM service
- Running MCP servers (datawarehouse, RAG, GEE)
## Installation

```bash
# Install dependencies using uv
uv sync
```
## Configuration

Edit `agent/config.yaml`:

```yaml
llm:
  model: "gpt-4o-mini"   # LLM model to use
  temperature: 0.5       # Response creativity (0-1)

mcp:
  datawarehouse_url: "http://datawarehouse_mcp:8001/sse"
  rag_url: "http://rag_mcp:8002/sse"
  geospatial_url: "http://geospatial_mcp:8003/sse"

server:
  host: "0.0.0.0"   # Server bind address
  port: 8000        # Server port
```
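For illustration, here is a minimal sketch of how these values could be validated after parsing. The dataclass layout, the `load_config` helper, and the temperature range check are assumptions for this example; the actual structure of `config.py` is not shown in this README.

```python
from dataclasses import dataclass

@dataclass
class LLMConfig:
    model: str
    temperature: float

    def __post_init__(self) -> None:
        # config.yaml documents temperature as a 0-1 range
        if not 0.0 <= self.temperature <= 1.0:
            raise ValueError("temperature must be in [0, 1]")

@dataclass
class MCPConfig:
    datawarehouse_url: str
    rag_url: str
    geospatial_url: str

@dataclass
class ServerConfig:
    host: str
    port: int

def load_config(raw: dict) -> tuple[LLMConfig, MCPConfig, ServerConfig]:
    """Validate a dict already parsed from config.yaml (e.g. via PyYAML)."""
    return (
        LLMConfig(**raw["llm"]),
        MCPConfig(**raw["mcp"]),
        ServerConfig(**raw["server"]),
    )

raw = {
    "llm": {"model": "gpt-4o-mini", "temperature": 0.5},
    "mcp": {
        "datawarehouse_url": "http://datawarehouse_mcp:8001/sse",
        "rag_url": "http://rag_mcp:8002/sse",
        "geospatial_url": "http://geospatial_mcp:8003/sse",
    },
    "server": {"host": "0.0.0.0", "port": 8000},
}
llm_cfg, mcp_cfg, server_cfg = load_config(raw)
print(llm_cfg.model)  # gpt-4o-mini
```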
## Running the Agent

1. Start the server:

   ```bash
   uv run agent/server.py
   ```

2. Access the API:

   - Server: http://localhost:8000
   - Health check: http://localhost:8000/
## Testing

```bash
# Run all tests
uv run pytest

# Run specific test file
uv run pytest tests/test_agent.py -v
```
## Environment Variables

The application loads secrets from Docker secret files or environment variables:
| Variable | Description | Required |
|----------|-------------|----------|
| `OPENAI_API_KEY` | OpenAI API key for LLM access | Yes |
| `LANGFUSE_HOST` | Langfuse server URL | Yes |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key | Yes |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key | Yes |
| `JWT_SECRET_KEY` | Secret for JWT token signing | Yes |
| `USERS` | JSON string with user credentials | Yes |
| `GOOGLE_APPLICATION_CREDENTIALS` | Path to Google service account JSON | Optional |
### User Configuration

Users are configured via JSON in the `USERS` environment variable:

```json
[
  {
    "username": "admin",
    "hashed_password": "hashed_password"  # pragma: allowlist secret
  }
]
```
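The Security Features section below mentions SHA256 password storage, so the `hashed_password` value can plausibly be generated as follows. This is a hedged sketch: the exact scheme (plain hex digest, no salt) and the `hash_password` helper name are assumptions, not the agent's actual code.

```python
import hashlib
import json

def hash_password(password: str) -> str:
    # Assumed scheme: unsalted SHA-256 hex digest of the UTF-8 password
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

users = [{"username": "admin", "hashed_password": hash_password("your_password")}]
# The JSON string below is the shape expected in the USERS variable
print(json.dumps(users))
```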
## API Endpoints

### POST /token

```bash
curl -X POST "http://localhost:8000/token" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "username=admin&password=your_password"
```

Response:

```json
{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9...",
  "token_type": "bearer",
  "username": "admin"
}
```
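The same exchange can be scripted from Python. This is an illustrative stdlib sketch; the `build_token_request` and `get_token` helpers are names invented for this example, not part of the agent's codebase.

```python
import json
import urllib.parse
import urllib.request

def build_token_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build the form-encoded POST /token request the endpoint expects."""
    body = urllib.parse.urlencode({"username": username, "password": password}).encode()
    return urllib.request.Request(
        f"{base_url}/token",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

def get_token(base_url: str, username: str, password: str) -> str:
    """Send the request and pull access_token out of the JSON response."""
    with urllib.request.urlopen(build_token_request(base_url, username, password)) as resp:
        return json.load(resp)["access_token"]

req = build_token_request("http://localhost:8000", "admin", "your_password")
print(req.full_url, req.data)
```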
POST /api/ask
curl -X POST "http://localhost:8000/api/ask" \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"chat_messages": [
{"content": "How many children are at risk of floods in Colombia?", "role": "user"}
],
"session_id": "unique-session-id"
}'
Streaming Response Format:

The agent will stream a response in the following format. Each response chunk follows the `ReturnChunk` schema with these fields:

- `trace_id` (string): Unique identifier for tracking the request throughout the processing pipeline
- `response` (string): The actual text content being streamed to the user (empty for non-text chunks)
- `is_thinking` (boolean): Indicates if this is part of the agent's reasoning process (true) or final response (false)
- `tool_call` (string): Details of backend tool operations being performed (empty when not calling tools)
- `is_finished` (boolean): Indicates if this is the final chunk in the stream
- `html_content` (string): HTML content for map visualizations or rich media (empty for text-only responses)
```json
{"response": "I'll help you find information about flood risks...", "is_thinking": true, "trace_id": "abc123", "is_finished": false}
{"tool_call": {"name": "get_dataset_image", "args": {...}}, "trace_id": "abc123"}
{"response": "Based on the analysis...", "is_thinking": false, "trace_id": "abc123", "is_finished": true}
{"html_content": "<html>...</html>", "trace_id": "abc123"}
```
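A client can consume this stream line by line. The sketch below shows one way to do that; the field names follow the `ReturnChunk` schema described above, but the dataclass itself and its defaults for absent fields are assumptions made for this example.

```python
import json
from dataclasses import dataclass

@dataclass
class ReturnChunk:
    trace_id: str
    response: str = ""
    is_thinking: bool = False
    tool_call: object = ""       # populated when the agent calls a backend tool
    is_finished: bool = False
    html_content: str = ""

def parse_stream(lines):
    """Yield one ReturnChunk per non-empty JSON line of the stream."""
    for line in lines:
        if line.strip():
            yield ReturnChunk(**json.loads(line))

# Two chunks shaped like the example stream above
stream = [
    '{"response": "", "is_thinking": true, "trace_id": "abc123", "is_finished": false}',
    '{"response": "Based on the analysis...", "is_thinking": false, "trace_id": "abc123", "is_finished": true}',
]
chunks = list(parse_stream(stream))
final = next(c for c in chunks if c.is_finished)
print(final.response)  # Based on the analysis...
```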
## MCP Tool Integration

The agent automatically discovers and loads tools from all configured MCP servers:

```python
# Tools are loaded asynchronously from each MCP server
datawarehouse_tools = await datawarehouse_mcp.to_tool_list_async()
rag_tools = await rag_mcp.to_tool_list_async()
geospatial_tools = await geospatial_mcp.to_tool_list_async()
```
### Available Tools

- Data Warehouse Tools:
  - `get_available_dataflows()`: List available statistical datasets
  - `get_all_indicators_for_dataflow(dataflow_id)`: Get indicators for a dataset
  - `get_data_for_dataflow(...)`: Query specific data with filters
- RAG Tools:
  - `get_ccri_relevant_information(question)`: Search technical documentation
- Geospatial Tools (12 tools):
  - Dataset and metadata operations
  - Image processing and analysis
  - Feature collection operations
  - Map generation and visualization
## Authentication Flow

1. User submits credentials to the `/token` endpoint
2. Server validates against configured users
3. JWT token issued with configurable expiration
4. Token required for all `/api/*` endpoints
5. Token validated on each request
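The issue/verify cycle behind this flow can be sketched with stdlib HMAC. This is a didactic illustration of HS256 JWTs, not the agent's implementation (which would normally use a JWT library); the helper names and the 30-minute default expiry are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def issue_token(username: str, secret: str, expires_in: int = 1800) -> str:
    """Sign a header.payload pair with HMAC-SHA256 (the HS256 algorithm)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": username, "exp": int(time.time()) + expires_in}).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = _b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"

def verify_token(token: str, secret: str) -> str:
    """Check the signature and expiry, returning the subject claim."""
    header, payload, signature = token.split(".")
    signing_input = f"{header}.{payload}".encode()
    expected = _b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(signature, expected):
        raise ValueError("invalid signature")
    claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    if claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims["sub"]

token = issue_token("admin", "example-secret")
print(verify_token(token, "example-secret"))  # admin
```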
## Security Features

- Password Storage: SHA256 hashing
- Token Security: JWT with secret-based signing
- Network Security: Internal-only MCP communication
- Input Validation: Pydantic schemas for all inputs
- Error Handling: No sensitive information in error responses
## Observability

- Trace Collection: All LLM interactions tracked
- Performance Monitoring: Response times and token usage
- User Feedback: Integration with frontend feedback system
- Error Tracking: Detailed error logs and metrics
## Contributing

- Code Style: Follow PEP 8 and use type hints
- Testing: Add tests for new functionality
- Documentation: Update the README and docstrings

To submit a change:

1. Create a feature branch
2. Add tests for new functionality
3. Ensure all tests pass: `uv run pytest`
4. Update documentation
5. Submit a pull request
## License

This project is licensed under the MIT License. See the LICENSE file for details.
## Support

- Issues: Submit issues on GitHub
- Security: Report security issues privately to maintainers