- The Problem: Stateless LLMs in Structured Conversations
- The Solution: Finite State Machines + LLMs
- Project Structure
- Installation
- Quick Start
- Core Components
- Examples
- Documentation
- Development
- Contributing
- License
## The Problem: Stateless LLMs in Structured Conversations

Large Language Models have revolutionized natural language processing with their remarkable generation capabilities. However, they have a fundamental limitation: they are inherently stateless. Each interaction is processed independently, with only the context provided in the prompt.
This statelessness creates significant challenges for building robust conversational applications:
- State Fragility: Without explicit tracking, conversations easily lose their place
- Context Limitations: As conversations grow, context windows fill up quickly
- Transition Ambiguity: Determining when to move to different conversation stages is difficult
- Information Extraction Inconsistency: Extracting structured data from free-form text is unreliable
- Validation Challenges: Ensuring required information is collected before proceeding is complex
Consider a flight booking scenario:
```
User:   I'd like to book a flight
System: Where would you like to fly to?
User:   I'm thinking maybe Hawaii
System: Great choice! And where will you be departing from?
User:   Actually, I'd prefer Bali instead of Hawaii
```
Without explicit state tracking, the system might miss the change in destination or maintain inconsistent information.
## The Solution: Finite State Machines + LLMs

LLM-FSM elegantly combines classical Finite State Machines with modern Large Language Models:

> "We keep the state as a JSON structure inside the system prompt of an LLM, describing transition nodes and conditions for that specific state, along with any emittance of symbols that the LLM might do."

The state and transitions are handled by Python, while language ambiguities are handled by the LLM.
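As an illustration, the JSON carried in the system prompt for the flight-booking dialogue above might resemble the sketch below (written as a Python dict; every field name is hypothetical, since the real schema is defined by the library's Pydantic models). Note how the destination survives the user's change of mind:

```python
# Hypothetical sketch of the state JSON embedded in the system prompt.
# Field names are illustrative only; the actual schema lives in the
# library's Pydantic models (definitions.py).
state_snapshot = {
    "current_state": "collect_destination",
    "context": {
        "destination": "Bali",  # updated when the user switched from Hawaii
    },
    "transitions": [
        {"target": "collect_origin", "condition": "a destination has been provided"},
        {"target": "collect_destination", "condition": "the destination is still missing"},
    ],
}
```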
This hybrid approach gives you the best of both worlds:
- ✅ Predictable conversation flows with clear rules and transitions
- ✅ Natural language understanding powered by state-of-the-art LLMs
- ✅ Persistent context across the entire conversation
- ✅ Dynamic adaptation to user inputs
- ✅ Expressive Logic for complex transitional decision-making using JsonLogic
- ✅ Extensible Workflows for building complex, multi-step automated processes
- ✅ Customizable Handlers for integrating external logic and side effects
Key features:

- 🚦 Structured Conversation Flows: Define states, transitions, and conditions in JSON.
- 🧠 LLM-Powered NLU: Leverage LLMs for understanding, entity extraction, and response generation.
- 🎣 Handler System: Integrate custom Python functions at various lifecycle points of FSM execution.
- 👤 Persona Support: Define a consistent tone and style for LLM responses.
- 📝 Persistent Context Management: Maintain information throughout the conversation.
- 🔄 Provider-Agnostic: Works with OpenAI, Anthropic, and other LLM providers via LiteLLM.
- 📊 Visualization & Validation: Built-in CLI tools to visualize FSMs as ASCII art and validate definitions.
- 🪵 Comprehensive Logging: Detailed logs via Loguru for debugging and monitoring.
- 🧪 Test-Friendly: Designed for easy unit testing and behavior verification.
- 🧮 JsonLogic Expressions: Powerful conditional logic for rule-based FSM transitions.
- 🧩 Workflow Engine: Build complex, automated processes on top of FSMs with the `llm_fsm_workflows` extension.
## Installation

To install the core LLM-FSM library:

```bash
pip install llm-fsm
```

To include the workflows extension (for building automated multi-step processes):

```bash
pip install llm-fsm[workflows]
```

To install all optional features:

```bash
pip install llm-fsm[all]
```
For development, install from source:

- Clone the repository:

  ```bash
  git clone https://github.com/nikolasmarkou/llm-fsm.git
  cd llm-fsm
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  # Install in editable mode with dev and workflows extras
  pip install -e .[dev,workflows]
  ```

- Set up environment variables:

  ```bash
  cp .env.example .env
  # Edit .env with your API keys (e.g., OPENAI_API_KEY) and default model
  ```
## Quick Start

Ensure you have your LLM API key set as an environment variable. For OpenAI, this is `OPENAI_API_KEY`. You can also specify a default model via `LLM_MODEL` in your `.env` file (e.g., `LLM_MODEL=gpt-4o-mini`).
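Because API keys are read from the environment (provider support comes via LiteLLM), a quick sanity check before starting saves a confusing failure mid-conversation. A minimal sketch:

```python
import os

# Fail fast if the provider key is missing from the environment.
if not os.environ.get("OPENAI_API_KEY"):
    raise RuntimeError(
        "OPENAI_API_KEY is not set. Export it or add it to your .env file."
    )
```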
The `LLM_FSM` class provides a high-level interface:
```python
from llm_fsm import LLM_FSM

# Ensure OPENAI_API_KEY is set in your environment:
# export OPENAI_API_KEY='your-key-here'

# Create the LLM-FSM instance from an FSM definition file
fsm = LLM_FSM.from_file(
    path="examples/basic/simple_greeting/fsm.json",
    model="gpt-4o-mini",  # Or your preferred model from .env
    # api_key="your-api-key"  # Can be omitted if OPENAI_API_KEY is set
)

# Start a conversation (empty message for initial greeting)
conversation_id, response = fsm.converse("")
print(f"System: {response}")

# Continue the conversation
while not fsm.is_conversation_ended(conversation_id):
    user_input = input("You: ")
    if user_input.lower() == "exit":
        break
    _, response = fsm.converse(user_input, conversation_id)
    print(f"System: {response}")

# Get collected data (if any)
data = fsm.get_data(conversation_id)
print(f"Collected data: {data}")

# End the conversation
fsm.end_conversation(conversation_id)
print("Conversation ended.")
```
LLM-FSM provides several command-line tools:
- Run a conversation:

  ```bash
  llm-fsm --fsm path/to/your/fsm.json
  ```

- Visualize an FSM definition:

  ```bash
  llm-fsm-visualize --fsm path/to/your/fsm.json
  # For different styles:
  llm-fsm-visualize --fsm path/to/your/fsm.json --style compact
  llm-fsm-visualize --fsm path/to/your/fsm.json --style minimal
  ```

- Validate an FSM definition:

  ```bash
  llm-fsm-validate --fsm path/to/your/fsm.json
  ```

- Run workflows (if the `workflows` extra is installed); details for the workflow CLI will be added as the feature matures:

  ```bash
  # Example: llm-fsm-workflow --workflow path/to/workflow.json --run
  ```
## Core Components

Located in `src/llm_fsm/`, this is the heart of the library:

- `FSMDefinition` (`definitions.py`): Pydantic models for defining FSMs (states, transitions, conditions).
- `FSMManager` (`fsm.py`): Manages FSM instances, state transitions, and context.
- `LLMInterface` (`llm.py`): Interface for LLM communication, with `LiteLLMInterface` for broad provider support.
- `PromptBuilder` (`prompts.py`): Constructs structured system prompts for the LLM.
- JsonLogic expressions (`expressions.py`): Evaluates complex conditions for transitions.
- `Validator` (`validator.py`): Validates FSM definition files.
- `Visualizer` (`visualizer.py`): Generates ASCII art for FSMs.
### Handler System

Provides a way to inject custom Python logic at various points in the FSM execution lifecycle (e.g., before/after processing, on context update, on state transition). See the FSM Handler Integration Guide.
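As a purely illustrative sketch (the function name and signature below are hypothetical; the actual attachment points and registration mechanism are documented in the handler guide), a state-transition handler might look like this:

```python
# Hypothetical handler sketch -- the real attachment points and signatures
# are described in the FSM Handler Integration Guide.
def on_state_transition(source_state: str, target_state: str, context: dict) -> None:
    """Example side effect: log every transition for monitoring."""
    print(f"Transition {source_state} -> {target_state}; context keys: {sorted(context)}")
```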
### JsonLogic Expressions

A powerful, JSON-based way to define complex conditions for state transitions. These expressions are evaluated against the current conversation context.
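JsonLogic conditions are ordinary JSON, shown here as a Python dict for consistency with the other examples. The operators below (`and`, `!!`, `<=`, `var`) are standard JsonLogic; how conversation-context keys map onto `var` paths is an assumption for illustration:

```python
# Standard JsonLogic operators; the context key names are illustrative.
# This condition passes only once a destination has been collected
# and the party size is within limits.
condition = {
    "and": [
        {"!!": {"var": "destination"}},         # destination is present and truthy
        {"<=": [{"var": "num_travelers"}, 6]},  # at most six travelers
    ]
}
```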
### Workflows Extension

Located in `src/llm_fsm_workflows/`, this extension builds upon the core FSM to enable more complex, automated processes:

- `WorkflowDefinition` (`definitions.py`): Defines a sequence of steps.
- `WorkflowStep` (`steps.py`): Abstract base class for various step types (API calls, conditions, LLM processing, etc.).
- `WorkflowEngine` (`engine.py`): Executes workflow instances, manages state, and handles events.
- DSL (`dsl.py`): A fluent API for programmatically creating workflow definitions, sketched below.
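The fluent style such a DSL typically uses can be sketched generically. This is the builder pattern in miniature, not the library's actual API (see `dsl.py` for the real one):

```python
from dataclasses import dataclass, field

# Generic miniature of a fluent workflow builder -- NOT the library's API.
@dataclass
class WorkflowSketch:
    name: str
    steps: list = field(default_factory=list)

    def step(self, step_name: str) -> "WorkflowSketch":
        self.steps.append(step_name)
        return self  # returning self is what enables method chaining

sketch = WorkflowSketch("order_processing").step("validate_order").step("charge_payment")
print(sketch.steps)  # ['validate_order', 'charge_payment']
```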
## Examples

The `examples/` directory contains various FSM definitions and run scripts:

- `simple_greeting`: A minimal FSM with greeting and farewell.
- `form_filling`: A step-by-step form for information collection.
- `book_recommendation`: A conversational loop for recommending books.
- `story_time`: An interactive storytelling FSM.
- `dialog_persona`: Demonstrates using a detailed persona for the LLM.
- `product_recommendation_system`: A decision-tree conversation for tech product recommendations.
- `yoga_instructions`: An FSM that adapts yoga instruction based on user engagement.

Each example typically includes:

- `fsm.json`: The FSM definition.
- `run.py`: A Python script to run the example.
- `README.md`: An explanation of the example.
## Development

- Testing: Run tests using `tox` or `make test` (which uses `pytest`):

  ```bash
  tox
  # or
  make test
  ```

- Linting & Formatting: Uses `flake8` and `black`, configured in `tox.ini` and `.pre-commit-config.yaml`:

  ```bash
  tox -e lint
  ```

- Building: Use `make build` to create wheel and sdist packages:

  ```bash
  make build
  ```

- Cleaning: Use `make clean` to remove build artifacts.
## Documentation

- LLM Reference (`LLM.md`): A detailed guide designed for LLMs to understand the framework's architecture, system prompt structure, and expected response formats.
## Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues. (A more formal `CONTRIBUTING.md` can be added later.)
## License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.