# Enterprise-Grade Serverless AI Agent Platform
Build production-ready AI agents with complete flexibility in LLM providers and tools, backed by a comprehensive management UI for enterprise operations.
The Step Functions AI Agent Framework consists of two integrated components:
The first is a serverless, highly flexible agent execution platform that provides:
- Any LLM Provider: Anthropic Claude, OpenAI GPT, Google Gemini, Amazon Bedrock, xAI Grok, DeepSeek
- Any Programming Language: Build tools in Python, TypeScript, Rust, Go, Java, or any language
- Serverless Scale: Automatic scaling with AWS Step Functions orchestration
- Complete Observability: Full tracing, metrics, and cost tracking built-in
The second is a comprehensive admin interface for enterprise operations:
- Agent Management: Configure agents, assign tools, update LLM models
- Tool Registry: Manage and test tools across all agents
- Execution Monitoring: Real-time execution history with filtering and search
- Cost Analytics: Track usage and costs by agent, model, and time period
- Enterprise Security: IAM-integrated access, secret management, audit logging
Agent framework:
- ✅ Multi-Provider LLM Support - Switch providers without code changes
- ✅ Unified Rust LLM Service - High-performance, provider-agnostic interface
- ✅ Language-Agnostic Tools - Build tools in any language
- ✅ Human-in-the-Loop - Built-in approval workflows
- ✅ Modular Architecture - Shared infrastructure, reusable tools
- ✅ Long Content Support - Handle extensive documents and conversations

Management UI:
- 📊 Execution Dashboard - Fast, indexed execution history with date/agent filtering
- 🔧 Agent Configuration - Dynamic system prompts, model selection, tool assignment
- 🧪 Integrated Testing - Test agents and tools directly from the UI
- 📈 Metrics & Analytics - CloudWatch integration, token usage, cost tracking
- 🔐 Enterprise Security - Cognito authentication, IAM permissions, Secrets Manager
- 🚀 Real-time Updates - EventBridge-powered execution tracking
The overall architecture:

```mermaid
graph TB
    subgraph UI["Management UI (Amplify)"]
        Console[Admin Console]
        ExecutionHistory[Execution History]
        Analytics[Analytics Dashboard]
    end

    subgraph Registry["Registries (DynamoDB)"]
        AgentReg[Agent Registry]
        ToolReg[Tool Registry]
        ModelReg[Model Registry]
    end

    subgraph Runtime["Agent Runtime"]
        StepFunctions[Step Functions]
        LLMService[LLM Service]
        Tools[Tool Lambdas]
    end

    Console --> AgentReg
    Console --> ToolReg
    StepFunctions --> LLMService
    StepFunctions --> Tools
    StepFunctions --> AgentReg
    StepFunctions --> ToolReg
    ExecutionHistory --> StepFunctions
```
The agent execution flow:

```mermaid
stateDiagram-v2
    [*] --> LoadConfig: Start Execution
    LoadConfig --> LoadTools: Load from Registry
    LoadTools --> CallLLM: Get Tool Definitions
    CallLLM --> UpdateMetrics: LLM Response
    UpdateMetrics --> CheckTools: Record Usage
    CheckTools --> ExecuteTools: Tools Requested
    CheckTools --> Success: No Tools Needed
    ExecuteTools --> CallLLM: Return Results
    Success --> [*]: Complete
```
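In Python terms, the loop the state machine implements looks roughly like this. It is a conceptual sketch only; the function names and message shapes are illustrative, not the framework's actual API:

```python
# Conceptual sketch of the execution loop above -- not the framework's API.
def run_agent(call_llm, execute_tool, system_prompt, user_message, max_turns=10):
    """Drive the CallLLM -> CheckTools -> ExecuteTools cycle to completion."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        response = call_llm(system_prompt, messages)          # CallLLM
        messages.append({"role": "assistant", "content": response["content"]})
        tool_uses = [b for b in response["content"] if b.get("type") == "tool_use"]
        if not tool_uses:                                     # CheckTools -> Success
            return messages
        tool_results = [execute_tool(t) for t in tool_uses]   # ExecuteTools
        messages.append({"role": "user", "content": tool_results})  # back to CallLLM
    raise RuntimeError("agent exceeded max_turns")
```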
- AWS Account with appropriate permissions
- Python 3.12+
- Node.js 18+ (for CDK and Amplify UI)
- AWS CDK CLI: `npm install -g aws-cdk`
- UV for Python: `pip install uv`
```bash
# Clone the repository
git clone https://github.com/your-org/step-functions-agent.git
cd step-functions-agent

# Install Python dependencies
uv pip install -r requirements.txt

# Bootstrap CDK (first time only)
cdk bootstrap

# Set environment
export ENVIRONMENT=prod
```
```bash
# 1. Deploy shared infrastructure (once per environment)
cdk deploy SharedInfrastructureStack-prod
cdk deploy AgentRegistryStack-prod

# 2. Deploy LLM service (choose one)
cdk deploy SharedUnifiedRustLLMStack-prod  # Recommended: high-performance unified service

# 3. Configure API keys in AWS Secrets Manager
aws secretsmanager create-secret \
  --name /ai-agent/llm-secrets/prod \
  --secret-string '{
    "ANTHROPIC_API_KEY": "sk-ant-...",
    "OPENAI_API_KEY": "sk-...",
    "GEMINI_API_KEY": "..."
  }'
```
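If you need to verify the secret or read it from your own code, a minimal boto3 sketch (the secret name matches the one created above):

```python
import json
import boto3

# Read the provider keys configured above at runtime.
secrets = boto3.client("secretsmanager")
resp = secrets.get_secret_value(SecretId="/ai-agent/llm-secrets/prod")
llm_keys = json.loads(resp["SecretString"])
anthropic_key = llm_keys["ANTHROPIC_API_KEY"]
```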
Next, deploy the Management UI:

```bash
cd ui_amplify

# Install dependencies
npm install

# Deploy to Amplify (creates hosted UI)
npx ampx sandbox                          # For development
# OR
npx ampx pipeline-deploy --branch main    # For production
```
The UI will be available at your Amplify app URL (e.g., https://main.xxxx.amplifyapp.com).
Create a new file `stacks/agents/my_agent_stack.py`:
```python
from aws_cdk import Fn
from stacks.agents.modular_base_agent_unified_llm_stack import ModularBaseAgentUnifiedLLMStack


class MyAgentStack(ModularBaseAgentUnifiedLLMStack):
    def __init__(self, scope, construct_id, env_name="prod", **kwargs):
        # Import required tools from the registry
        db_tool_arn = Fn.import_value(f"DBInterfaceToolLambdaArn-{env_name}")

        # Configure tools for this agent
        tool_configs = [
            {
                "tool_name": "query_database",
                "lambda_arn": db_tool_arn,
                "requires_activity": False
            }
        ]

        # Define agent behavior
        system_prompt = """You are a data analyst assistant.
Help users query and analyze database information.
Always explain your findings clearly."""

        # Initialize the agent with the Unified LLM service
        super().__init__(
            scope, construct_id,
            agent_name="data-analyst",
            unified_llm_arn=Fn.import_value(f"SharedUnifiedRustLLMLambdaArn-{env_name}"),
            tool_configs=tool_configs,
            env_name=env_name,
            system_prompt=system_prompt,
            **kwargs
        )
```
Add it to `app.py`:
```python
from stacks.agents.my_agent_stack import MyAgentStack

# Deploy your agent
MyAgentStack(app, "DataAnalystAgentStack-prod", env_name="prod")
```

Then deploy:

```bash
cdk deploy DataAnalystAgentStack-prod
```
The agent will automatically register in the Agent Registry and appear in the Management UI!
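Once deployed, you can invoke the agent by starting a Step Functions execution. A hedged sketch with boto3; the state machine ARN is a placeholder, and the input shape is an assumption — copy the real ARN and expected payload from the Step Functions console or the Management UI:

```python
import json
import boto3

# The ARN below is a placeholder -- substitute your deployed state machine's ARN.
sfn = boto3.client("stepfunctions")
execution = sfn.start_execution(
    stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:data-analyst-prod",
    input=json.dumps({
        "messages": [
            {"role": "user", "content": "Which products sold best last week?"}
        ]
    }),
)
print(execution["executionArn"])
```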
To build a custom tool, create the following directory structure:

```
lambda/tools/my-tool/
├── index.py               # Lambda handler
├── requirements.txt       # Dependencies
└── tool_definition.json   # Tool schema for the LLM
```
Implement the handler in `index.py`:

```python
def lambda_handler(event, context):
    """
    Standard tool interface compatible with all LLM providers.

    Args:
        event: {
            "name": "tool_name",
            "id": "unique_tool_use_id",
            "input": {
                # Tool-specific parameters
            }
        }

    Returns:
        {
            "type": "tool_result",
            "tool_use_id": event["id"],
            "name": event["name"],
            "content": "Result as string or JSON"
        }
    """
    tool_input = event["input"]

    # Implement your tool logic here (perform_action is a placeholder)
    result = perform_action(tool_input)

    return {
        "type": "tool_result",
        "tool_use_id": event["id"],
        "name": event["name"],
        "content": result
    }
```
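For a quick local smoke test before deploying, you can append a small driver to `index.py` (the event values are illustrative):

```python
# Quick local smoke test for the handler above.
if __name__ == "__main__":
    sample_event = {
        "name": "my_tool",
        "id": "toolu_example_001",
        "input": {"parameter1": "hello"},
    }
    print(lambda_handler(sample_event, None))
```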
Create `tool_definition.json`:
```json
{
  "name": "my_tool",
  "description": "Clear description of what the tool does for the LLM",
  "input_schema": {
    "type": "object",
    "properties": {
      "parameter1": {
        "type": "string",
        "description": "Description of parameter1"
      },
      "parameter2": {
        "type": "number",
        "description": "Description of parameter2"
      }
    },
    "required": ["parameter1"]
  }
}
```
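As an optional sanity check (this assumes the third-party `jsonschema` package, which is not part of the framework), you can verify that `input_schema` is itself a valid JSON Schema before deploying:

```python
import json
from jsonschema import Draft202012Validator

# Raises SchemaError if input_schema is not a valid JSON Schema.
with open("lambda/tools/my-tool/tool_definition.json") as f:
    definition = json.load(f)

Draft202012Validator.check_schema(definition["input_schema"])
print(f"{definition['name']}: schema OK")
```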
Create a CDK stack that deploys the Lambda and registers it in the Tool Registry:

```python
from aws_cdk import aws_lambda as lambda_, Duration
from constructs import Construct
from .base_tool_stack import BaseToolStack


class MyToolStack(BaseToolStack):
    def __init__(self, scope: Construct, construct_id: str, env_name: str = "prod", **kwargs):
        super().__init__(scope, construct_id, env_name=env_name, **kwargs)

        # Create the Lambda function
        tool_lambda = lambda_.Function(
            self, "MyToolFunction",
            runtime=lambda_.Runtime.PYTHON_3_12,
            handler="index.lambda_handler",
            code=lambda_.Code.from_asset("lambda/tools/my-tool"),
            timeout=Duration.seconds(30),
            environment={
                "LOG_LEVEL": "INFO"
            }
        )

        # Register in the Tool Registry
        self.register_tool(
            tool_name="my_tool",
            tool_lambda=tool_lambda,
            tool_definition_path="lambda/tools/my-tool/tool_definition.json"
        )
```
```bash
cdk deploy MyToolStack-prod
```
The tool is now available for any agent to use!
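You can smoke-test the deployed tool directly with boto3; the function name below is an assumption — use the name or ARN that `MyToolStack` actually deployed (visible in the Tool Registry or the CDK output):

```python
import json
import boto3

# Invoke the deployed tool Lambda with a sample tool-use event.
lambda_client = boto3.client("lambda")
resp = lambda_client.invoke(
    FunctionName="MyToolFunction",  # placeholder: use your deployed name/ARN
    Payload=json.dumps({
        "name": "my_tool",
        "id": "test-1",
        "input": {"parameter1": "hello"},
    }),
)
print(json.loads(resp["Payload"].read()))
```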
The framework includes production-ready tools you can use immediately:
- SQL Database Tool (`DBInterfaceToolStack`) - Query databases, execute SQL, analyze data
- GraphQL Tool (`GraphQLToolStack`) - Query GraphQL APIs with type safety
- Web Research Tool (`WebResearchToolStack`) - Web scraping and research
- Microsoft Graph Tool (`MicrosoftGraphToolStack`) - Office 365, Teams, SharePoint integration
- Google Maps Tool (`GoogleMapsToolStack`) - Location services, geocoding, directions
- Firecrawl Tool - Advanced web scraping with AI
- Code Execution Tool (`E2BToolStack`) - Safe Python/JavaScript code execution
- Batch Processor Tool - Process large datasets in parallel
- Local Agent Tool - Execute commands on remote machines securely
- CloudWatch Tool (`CloudWatchToolStack`) - AWS metrics, logs, and alarms
- SageMaker Tool - ML model deployment and inference
Deploy any tool:
```bash
cdk deploy DBInterfaceToolStack-prod
cdk deploy GoogleMapsToolStack-prod
```
Execution history:
- Fast Indexed Search - DynamoDB-backed execution index for instant queries (see the sketch after this list)
- Advanced Filtering - Filter by agent, status, date range (UTC-aware)
- Real-time Updates - EventBridge integration for live execution tracking
- Detailed Views - Full execution trace, token usage, cost breakdown
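A hedged sketch of what a filtered query against the execution index might look like; the table name and attribute names are assumptions — check the deployed index table for the actual schema:

```python
import boto3
from boto3.dynamodb.conditions import Key

# Query executions for one agent within a date range (names are assumptions).
table = boto3.resource("dynamodb").Table("ExecutionIndex-prod")
resp = table.query(
    KeyConditionExpression=Key("agent_name").eq("data-analyst")
    & Key("start_time").between("2025-01-01T00:00:00Z", "2025-01-31T23:59:59Z")
)
for item in resp["Items"]:
    print(item["execution_arn"], item.get("status"))
```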
Agent configuration:
- Dynamic Configuration - Update system prompts without redeployment
- Model Selection - Switch LLM providers and models on the fly
- Tool Assignment - Add/remove tools from agents via the UI
- Version Control - Track configuration changes over time

Testing:
- Agent Testing - Execute test prompts with custom inputs
- Tool Testing - Validate tool functionality independently
- Execution Replay - Re-run failed executions with the same inputs
- Health Checks - Automated validation of agent configurations

Metrics and analytics:
- Cost Tracking - Real-time cost estimates per execution
- Token Usage - Input/output token metrics by model
- Performance Metrics - Execution duration, error rates, trends
- CloudWatch Integration - Deep-dive into logs and traces
Security:
- ✅ IAM Integration - Fine-grained access control with AWS IAM
- ✅ Cognito Authentication - Secure user authentication for the UI
- ✅ Secrets Manager - Encrypted storage for API keys and credentials
- ✅ VPC Support - Deploy in private subnets with VPC endpoints
- ✅ Audit Logging - Complete audit trail via CloudWatch and CloudTrail
- ✅ Resource Tags - Automatic tagging for compliance and cost allocation

Observability:
- ✅ X-Ray Tracing - End-to-end distributed tracing
- ✅ CloudWatch Metrics - Custom metrics for all operations
- ✅ Structured Logging - JSON logs with correlation IDs
- ✅ Execution Index - Fast, searchable execution history
- ✅ Cost Attribution - Track costs by agent, model, and execution

Reliability:
- ✅ Automatic Retries - Built-in retry logic with exponential backoff
- ✅ Error Handling - Graceful degradation and error recovery
- ✅ Circuit Breakers - Protect downstream services
- ✅ Rate Limiting - Prevent API quota exhaustion
- ✅ Health Checks - Automated monitoring and alerting

Cost management:
- ✅ Token Tracking - Real-time token usage monitoring
- ✅ Cost Estimation - Predict execution costs before running (see the sketch after this list)
- ✅ Budget Alerts - CloudWatch alarms for cost thresholds
- ✅ Model Optimization - Automatic model selection for cost/quality trade-offs
- ✅ Execution Limits - Configurable limits per agent
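As a rough illustration of how a per-execution cost estimate follows from token counts (the per-1K-token prices below are placeholders; substitute current rates for your chosen model):

```python
# Prices are illustrative (USD per 1K input/output tokens); substitute real rates.
PRICES = {
    "claude-sonnet-4": (0.003, 0.015),
    "gpt-4o-mini": (0.00015, 0.0006),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate execution cost from token counts."""
    input_price, output_price = PRICES[model]
    return input_tokens / 1000 * input_price + output_tokens / 1000 * output_price

print(f"${estimate_cost('claude-sonnet-4', 12_000, 1_500):.4f}")  # $0.0585
```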
| Provider | Models | Best For | Pricing |
|---|---|---|---|
| Anthropic Claude | Sonnet 4, Opus 3.5 | Complex reasoning, long context | $$$ |
| OpenAI | GPT-4o, GPT-4o-mini | Versatile, code generation | $$$ |
| Google Gemini | 1.5 Pro, Flash | Multimodal, fast responses | $$ |
| Amazon Bedrock | Nova Pro, Nova Lite | AWS native, cost-effective | $$ |
| xAI | Grok 2, Grok 2 mini | Latest capabilities | $$ |
| DeepSeek | DeepSeek V3 | Specialized tasks | $ |
All providers are configured through the Unified Rust LLM Service or individual provider Lambdas. API keys are stored in AWS Secrets Manager.
Update API keys:
```bash
aws secretsmanager update-secret \
  --secret-id /ai-agent/llm-secrets/prod \
  --secret-string '{
    "ANTHROPIC_API_KEY": "sk-ant-new-key",
    "OPENAI_API_KEY": "sk-new-key"
  }'
```
Change models via the Management UI or the agent configuration:

```python
# In the agent stack
self.llm_provider = "anthropic"
self.llm_model = "claude-sonnet-4-20250514"

# Or via UI: Agent Management > Select Agent > Update Model
```
```bash
# Development environment
export ENVIRONMENT=dev
cdk deploy SharedInfrastructureStack-dev
cdk deploy MyAgentStack-dev

# Production environment
export ENVIRONMENT=prod
cdk deploy SharedInfrastructureStack-prod
cdk deploy MyAgentStack-prod
```
Deploy the stacks in this order:

1. Core Infrastructure (once per environment)

   ```bash
   cdk deploy SharedInfrastructureStack-prod
   cdk deploy AgentRegistryStack-prod
   ```

2. LLM Service (choose based on needs)

   ```bash
   # High-performance unified service (recommended)
   cdk deploy SharedUnifiedRustLLMStack-prod
   # OR traditional multi-provider
   cdk deploy SharedLLMStack-prod
   ```

3. Tools (deploy only what you need)

   ```bash
   cdk deploy DBInterfaceToolStack-prod
   cdk deploy GoogleMapsToolStack-prod
   cdk deploy WebResearchToolStack-prod
   ```

4. Agents (your custom agents)

   ```bash
   cdk deploy MyAgentStack-prod
   ```

5. Management UI (Amplify)

   ```bash
   cd ui_amplify
   npx ampx pipeline-deploy --branch main
   ```
Access pre-built dashboards:
- Execution Overview - All agent executions, success rates, duration
- Cost Analysis - Token usage and estimated costs by model
- Error Tracking - Failed executions, error patterns, retry metrics
Run custom queries in CloudWatch Logs Insights, for example:

```
# Cost analysis by agent
fields @timestamp, agent_name, model, input_tokens, output_tokens
| stats sum(input_tokens * 0.003 / 1000) as input_cost,
        sum(output_tokens * 0.015 / 1000) as output_cost
  by agent_name, model
```

```
# Execution performance
fields @timestamp, agent_name, duration
| stats avg(duration) as avg_duration,
        max(duration) as max_duration,
        count(*) as total_executions
  by agent_name
```
Configure CloudWatch Alarms:
- High error rate (>5% failures)
- Slow executions (>30s duration)
- High costs (>$100/day)
- Token limit warnings
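For example, a failed-execution alarm can be wired up in CDK. This is a sketch: the threshold is illustrative and the framework may already provision equivalents; `metric_failed` is the standard CDK metric on a Step Functions state machine:

```python
from aws_cdk import Duration, aws_cloudwatch as cloudwatch
from aws_cdk import aws_stepfunctions as sfn
from constructs import Construct

def add_failure_alarm(scope: Construct, state_machine: sfn.StateMachine) -> None:
    """Alarm when more than 5 executions fail within 5 minutes (example threshold)."""
    failed = state_machine.metric_failed(period=Duration.minutes(5))
    cloudwatch.Alarm(
        scope, "AgentFailureAlarm",
        metric=failed,
        threshold=5,
        evaluation_periods=1,
        alarm_description="More than 5 failed agent executions in 5 minutes",
    )
```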
- Deployment Guide - Complete deployment walkthrough
- Quick Start Tutorial - Build your first agent in 10 minutes
- Agent Development - Creating custom agents
- Tool Development - Building new tools
- Testing Guide - Testing strategies
- Modular Architecture - System design patterns
- Long Content Support - Handling large documents
- Human Approval Workflows - Adding approval steps
- Activity Testing - Testing remote activities
- Monitoring Guide - Observability setup
- Security Best Practices - Security hardening
- Cost Optimization - Reducing operational costs
- Troubleshooting - Common issues and solutions
- UI User Guide - Using the admin interface
- Execution Index - Fast execution queries
- Analytics Dashboard - Using metrics and analytics
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
```bash
# Create virtual environment
uv venv
source .venv/bin/activate

# Install dev dependencies
uv pip install -r requirements-dev.txt

# Run tests
pytest

# Format code
black .
ruff check .
```
```bash
cd ui_amplify

# Install dependencies
npm install

# Run local development server
npm run dev

# Run tests
npm test
```
```
step-functions-agent/
├── app.py                    # CDK app entry point
├── stacks/
│   ├── agents/               # Agent stack definitions
│   ├── tools/                # Tool stack definitions
│   ├── shared_llm/           # LLM service stacks
│   └── infrastructure/       # Core infrastructure
├── lambda/
│   ├── tools/                # Tool Lambda functions
│   │   ├── db-interface/
│   │   ├── google-maps/
│   │   └── web-research/
│   └── unified_llm/          # Unified LLM service (Rust)
├── ui_amplify/               # Management UI (Amplify Gen 2)
│   ├── amplify/              # Backend configuration
│   ├── src/                  # React frontend
│   └── scripts/              # Utility scripts
└── docs/                     # Documentation
```
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: docs.your-project.com
This project is licensed under the MIT License - see LICENSE for details.
- AWS Step Functions team for serverless orchestration
- Anthropic, OpenAI, Google, Amazon, xAI, and DeepSeek for LLM APIs
- AWS Amplify team for the Gen 2 framework
- Open-source community for tools and libraries
Built with ❤️ using AWS CDK, Step Functions, and Amplify