🤖 Universal AI Agent

A professional, autonomous AI assistant that works on any computer — Windows, macOS, or Linux

Transform natural language instructions into automated actions. This is your personal digital operator that can research, create documents, manage files, execute code, and automate complex workflows — all through a beautiful terminal interface.

✨ Highlights

🎨 Beautiful Terminal UI: Fancy interface inspired by modern CLI tools with rich formatting, colors, and animations
🌍 Cross-Platform: Works seamlessly on Windows, macOS, and Linux with Python ≥3.10
🧠 Multi-Model Support: OpenAI, Anthropic Claude, Google Gemini, or local models via Ollama
🔒 Safe by Default: Sandboxed workspace, tool permissions, and explicit approval for dangerous operations
🔧 Extensible Tools: Easily add new capabilities with a simple decorator pattern
💾 Persistent Memory: SQLite-based conversation history with automatic cleanup
🎯 Multiple Modes: Interactive chat, single-goal execution, task files, or daemon mode
📊 Smart Planning: Automatically breaks down complex goals into actionable steps

Quickstart

# 1) Create a virtual env (recommended) and install deps
python -m venv .venv && . .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install --upgrade -r requirements.txt

# 2) Configure a model
cp .env.example .env
# Edit .env and add your API key:
#   - OpenAI: OPENAI_API_KEY=sk-...
#   - Anthropic: ANTHROPIC_API_KEY=sk-ant-...
#   - Gemini: GEMINI_API_KEY=...
#   - Ollama: No API key needed (local)

# 3) Run the agent!
# Interactive mode (recommended for first time)
python -m agent.run --interactive

# Or execute a single goal
python -m agent.run "Find the top 5 universities in Germany for AI and create a report"

# Or run a YAML task file
python -m agent.run --task examples/sample_task.yaml

# Or start daemon mode to watch for task files
python -m agent.run --auto examples/inbox

Windows PowerShell one‑liner

python -m venv .venv; .\.venv\Scripts\Activate.ps1; pip install -r requirements.txt; cp .env.example .env

Note: By default, tools that can change your system (e.g., shell.run, files.write_file) require confirmation. You can pre-approve specific tools with ALLOW_TOOLS in .env, or per-run with --yes TOOL_NAME.

🎯 Usage Modes

1. Interactive Mode (Recommended)

python -m agent.run --interactive

Launches a beautiful interactive terminal where you can chat with the agent, see real-time progress, and execute multiple tasks in one session.

Commands in interactive mode:

help - Show available commands
tools - List all available tools
history - View conversation history
clear - Clear the screen
reset - Reset conversation
exit or quit - Exit the agent

2. Single Goal Mode

python -m agent.run "Your goal here"

Examples:

python -m agent.run "Search for recent news about quantum computing and summarize the top 3 articles"

python -m agent.run "Create a Python script that organizes files by extension"

python -m agent.run "Fetch https://example.com and create a markdown summary"

3. Task File Mode

Create a YAML task file:

goal: "Research AI universities and create a report"
steps:
  - "Search for top AI universities in Germany"
  - "Gather information about each university"
  - "Create a structured markdown report"

Run it:

python -m agent.run --task mytask.yaml

4. Daemon Mode (Auto-watch)

python -m agent.run --auto ./inbox

The agent watches a folder and automatically processes any .task.yaml files that appear.

🔧 Built-in Tools

The agent comes with powerful built-in tools:

🌐 Web Tools

web.browser_search - Search the web using DuckDuckGo
web.fetch_url - Fetch and extract text from URLs

📁 File Tools

files.read_file - Read files from workspace
files.write_file - Write files to workspace (requires approval)

📄 Document Tools

docs.create_markdown - Create structured markdown documents
docs.create_report - Generate professional reports
docs.create_list - Create formatted lists
docs.create_table - Generate markdown tables
docs.summarize_text - Summarize long text
docs.count_words - Count words, sentences, characters

💻 Code Tools

code.execute_python - Execute Python code safely (requires approval)
code.analyze - Analyze code for syntax errors
code.format - Format Python code
code.generate_template - Generate code templates (Flask, FastAPI, CLI, etc.)

🖥️ System Tools

system.info - Get system information
system.list_processes - List running processes
system.current_time - Get current date/time
system.list_directory - List directory contents
system.disk_usage - Check disk usage
system.network_info - Get network information

⚡ Shell Tools

shell.run - Execute shell commands (requires approval)

🎨 Project Structure

universal-agent-starter/
├── agent/
│   ├── run.py                # Main CLI entry point
│   ├── interactive.py        # Interactive mode with fancy UI
│   ├── engine.py             # Core agent engine with planning
│   ├── model.py              # Multi-model support (OpenAI/Anthropic/Gemini/Ollama)
│   ├── registry.py           # Tool registry and discovery
│   ├── schema.py             # Data models (Message, Action, TaskPlan)
│   ├── memory/
│   │   └── sqlite_memory.py  # Persistent conversation memory
│   ├── tools/                # Extensible tool system
│   │   ├── web.py            # Web search and fetching
│   │   ├── files.py          # File operations (sandboxed)
│   │   ├── documents.py      # Document generation
│   │   ├── code.py           # Code execution and generation
│   │   ├── system.py         # System information
│   │   └── shell.py          # Shell command execution
│   ├── agents/
│   │   └── planner.py        # Goal planning and decomposition
│   └── ui/
│       └── terminal.py       # Beautiful terminal UI components
├── examples/
│   ├── sample_task.yaml      # Example task file
│   └── inbox/                # Folder for daemon mode
├── workspace/                # Sandboxed workspace for file operations
├── requirements.txt          # Python dependencies
├── .env.example              # Configuration template
└── README.md                 # This file

🔒 Security & Permissions

Safety is a core principle. The agent implements multiple security layers:

Sandboxed Workspace

All file operations are restricted to the workspace/ folder
Files cannot be read or written outside this directory
Prevents accidental damage to your system
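
One common way to enforce this kind of restriction is to resolve every requested path and reject anything that lands outside the sandbox root. The sketch below shows that technique under assumptions: the helper name resolve_in_workspace and the WORKSPACE constant are hypothetical, and the agent's real check may differ.

```python
from pathlib import Path

WORKSPACE = Path("workspace").resolve()  # assumed sandbox root


def resolve_in_workspace(user_path: str, root: Path = WORKSPACE) -> Path:
    """Resolve `user_path` under `root`, rejecting anything that escapes it.

    `root` is expected to be an already-resolved absolute path.
    """
    # resolve() collapses ".." segments and follows symlinks, so the
    # comparison below sees the real target rather than the raw string.
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"{user_path!r} is outside the workspace")
    return candidate
```

With this check, "notes/a.md" resolves inside the workspace, while "../secret.txt" or an absolute path such as "/etc/passwd" raises PermissionError.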

Tool Permissions

Tools are categorized by risk level:

Safe (auto-approved):

Web searches and fetching
Reading files from workspace
Document generation
System information queries

Requires Approval:

Writing files
Executing code
Running shell commands

Pre-approval Options

Option 1: Environment variable

# In .env file
ALLOW_TOOLS=files.write_file,code.execute_python,shell.run

Option 2: Command-line flag

python -m agent.run --yes files.write_file --yes shell.run "Your goal"

Option 3: Interactive approval

The agent will ask for permission when needed during execution.

🔌 Extending with Custom Tools

Adding new capabilities is incredibly simple:

1. Create a new tool file

Create agent/tools/mytool.py:

from agent.registry import tool

@tool(name="math.add", desc="Add two numbers")
def add(a: float, b: float) -> float:
    """Add two numbers and return the result"""
    return a + b

@tool(name="math.fibonacci", desc="Calculate Fibonacci number at position n")
def fibonacci(n: int) -> int:
    """Calculate the nth Fibonacci number (F(0) = 0, F(1) = 1)"""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:
        return n
    a, b = 0, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return b

@tool(name="text.reverse", desc="Reverse a string", permission="allow")
def reverse_text(text: str) -> str:
    """Reverse the input text"""
    return text[::-1]

2. That's it!

The agent automatically discovers and registers all tools in the agent/tools/ directory.
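
Discovery of this kind usually works by importing every module in the tools folder so that the decorators run as a side effect. The sketch below shows one way to do that with the standard library. The function name load_tool_modules and the rule of skipping files that start with an underscore are assumptions, not the agent's documented behavior.

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def load_tool_modules(tools_dir: Path) -> list[ModuleType]:
    """Import every non-private .py file in `tools_dir` so its decorators run."""
    modules: list[ModuleType] = []
    for path in sorted(tools_dir.glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helpers (assumed rule)
        spec = importlib.util.spec_from_file_location(f"tools_{path.stem}", path)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the file, triggering @tool calls
        modules.append(module)
    return modules
```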

Tool Decorator Parameters

name (required): Unique identifier (use namespace.action format)
desc (required): Description for the LLM to understand when to use it
permission (optional): "allow" (auto-approved), "ask" (requires approval), "deny" (blocked)
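
To make the three parameters concrete, here is a simplified stand-in for a decorator-based registry. It is an illustration only: the real agent.registry.tool may be implemented differently, and ToolSpec, TOOLS, and the default of "ask" are assumptions.

```python
from dataclasses import dataclass
from typing import Callable

VALID_PERMISSIONS = {"allow", "ask", "deny"}


@dataclass
class ToolSpec:
    name: str
    desc: str
    permission: str
    func: Callable


TOOLS: dict[str, ToolSpec] = {}  # hypothetical global registry


def tool(name: str, desc: str, permission: str = "ask"):
    """Register the decorated function under `name` (default is assumed)."""
    if permission not in VALID_PERMISSIONS:
        raise ValueError(f"unknown permission: {permission}")

    def decorator(func: Callable) -> Callable:
        if name in TOOLS:
            raise ValueError(f"duplicate tool name: {name}")
        TOOLS[name] = ToolSpec(name, desc, permission, func)
        return func  # the function itself stays directly callable

    return decorator


@tool(name="text.upper", desc="Uppercase a string", permission="allow")
def upper(text: str) -> str:
    return text.upper()
```

Returning the original function from the decorator keeps tools easy to unit-test, since they can be called without going through the registry.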

Type Hints

The registry automatically infers parameter types from Python type hints:

str → string
int, float → number
bool → boolean
Parameters without defaults are required
Parameters with defaults are optional
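
The mapping above can be reproduced with the standard inspect and typing modules. This is a minimal sketch under assumptions: describe_parameters is a hypothetical name, and falling back to "string" for unlisted types is a choice made for this example rather than documented behavior.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Mirrors the mapping listed above.
TYPE_NAMES = {str: "string", int: "number", float: "number", bool: "boolean"}


def describe_parameters(func: Callable) -> dict[str, dict[str, Any]]:
    """Build a simple parameter description from a function's signature."""
    hints = get_type_hints(func)
    params: dict[str, dict[str, Any]] = {}
    for name, param in inspect.signature(func).parameters.items():
        params[name] = {
            # Unknown or missing hints fall back to "string" (assumption).
            "type": TYPE_NAMES.get(hints.get(name), "string"),
            # A parameter is required when it has no default value.
            "required": param.default is inspect.Parameter.empty,
        }
    return params
```

For a function such as def search(query: str, limit: int = 5), this reports query as a required string and limit as an optional number.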

🌟 Example Use Cases

Research & Documentation

python -m agent.run "Research the top 5 programming languages in 2024 and create a comparison report"

File Organization

python -m agent.run "Organize all files in the workspace by extension into separate folders"

Web Scraping & Analysis

python -m agent.run "Fetch the latest blog posts from https://example.com/blog and summarize them"

Code Generation

python -m agent.run "Create a Flask REST API with user authentication and CRUD operations"

Data Processing

python -m agent.run "Read data.csv, analyze it, and create a summary report with statistics"

Automated Workflows

python -m agent.run "Search for Python best practices, create a checklist, and save it as a markdown file"

🎨 Terminal UI Features

The agent features a beautiful, modern terminal interface:

Rich Formatting: Colors, tables, panels, and syntax highlighting
Progress Indicators: Spinners and progress bars for long operations
Structured Output: Clear sections for goals, steps, and results
Tool Visualization: See exactly what tools are being called and their results
Interactive Prompts: Auto-suggestions and command history
Error Handling: Clear, helpful error messages

🤝 Contributing

Contributions are welcome! Here are some ideas:

Add new tools (email, calendar, database, APIs)
Improve the planning algorithm
Add support for more LLM providers
Enhance the UI with more visualizations
Create example task files for common workflows
Write tests and improve documentation

📝 License

MIT License - do whatever you want, just be nice and safe!

🙏 Acknowledgments

Built with:

Rich - Beautiful terminal formatting
Typer - CLI framework
Pydantic - Data validation
Prompt Toolkit - Interactive prompts

Inspired by modern AI agents and autonomous systems.

To-do list

Add voice commands
Ask the user for permissions (for the coding assistant and terminal access)
Add more tools, such as opening VS Code, opening Microsoft apps to read, write, and edit documents, Google Meet, Google Drive, etc.
Improve the interactive terminal so that it writes output in real time
Make the search tool show the search by opening a browser, as ChatGPT does, rather than only returning results (probably the hardest part)

Esmail-ibraheem/Universal