thedaneeffect commented Dec 7, 2025

Add /switch command for agent handoff with context preservation

Summary

This PR introduces a new /switch command (with /sw alias) that enables switching between agents while preserving the full conversation context. Unlike the existing /agent command, which starts a fresh session, /switch transfers the complete message history to the new agent.

Changes Made

New Features

  • /switch <agent-name> command: Switch to a different agent while preserving conversation history
  • /sw alias: Short alias for quick agent switching
  • Context preservation: Full message history is transferred to the new agent
  • Tab completion support: Auto-complete agent names for both /switch and /sw
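
As a rough illustration of the tab completion feature, the sketch below filters agent names by the partial text typed after either trigger. It is plain Python and does not use prompt_toolkit or the PR's AgentCompleter; the names SWITCH_TRIGGERS and complete_agent_name are hypothetical.

```python
from typing import Iterable, List

# Hypothetical trigger list; the trailing space means the command word is complete.
SWITCH_TRIGGERS = ("/switch ", "/sw ")


def complete_agent_name(text: str, agent_names: Iterable[str]) -> List[str]:
    """Return agent names that extend the partial name typed after a trigger."""
    for trigger in SWITCH_TRIGGERS:
        if text.startswith(trigger):
            partial = text[len(trigger):]
            return sorted(name for name in agent_names if name.startswith(partial))
    return []  # not a switch command, so offer nothing
```

An empty partial name (for example "/switch " with nothing after it) matches every agent, which mirrors the usual behavior of listing all candidates on the first Tab press.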

Behavior

  • When invoked without arguments, displays current agent and lists all available agents
  • Validates agent name and provides helpful error messages for invalid selections
  • Prevents no-op switches when already using the target agent
  • Shows confirmation with message count after successful handoff
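
The four behaviors above can be sketched as a single function. This is a simplified illustration, not the PR's handle_switch_command(): the function name switch_sketch, the dictionary of per-agent histories, and the exact message wording are all assumptions.

```python
from typing import Dict, List, Tuple


def switch_sketch(
    command: str, current: str, histories: Dict[str, List[str]]
) -> Tuple[str, str]:
    """Return (agent now active, message to show the user)."""
    tokens = command.split()
    listing = ", ".join(sorted(histories))
    if len(tokens) == 1:
        # No argument: report the current agent and the available ones.
        return current, f"Current agent: {current}. Available: {listing}"
    if len(tokens) > 2:
        return current, "Usage: /switch <agent-name>"
    target = tokens[1]
    if target not in histories:
        return current, f"Unknown agent '{target}'. Available: {listing}"
    if target == current:
        # Guard against a no-op switch.
        return current, f"Already using agent '{target}'"
    # Copy the list so the two agents do not share one mutable history.
    histories[target] = list(histories[current])
    return target, f"Switched to {target} with {len(histories[target])} messages"
```

Every failure path returns the unchanged current agent, so a rejected command leaves the session as it was.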

Technical Details

Implementation Approach

  1. Capture before switch: Message history is captured from the current agent before switching
  2. Transfer after switch: History is injected into the new agent using set_message_history()
  3. Agent reload: reload_code_generation_agent() is called to ensure the new agent is properly initialized

Code Pattern

# Look up the agent that is active before the switch
current_agent = get_current_agent()

# Capture history BEFORE switching (list() takes a copy)
message_history = list(current_agent.get_message_history())

# Switch to new agent
set_current_agent(agent_name)

# Transfer history to new agent
new_agent = get_current_agent()
new_agent.set_message_history(message_history)
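
The list(...) call in the pattern above takes a shallow copy of the history. A minimal sketch of why that can matter, using a hypothetical StubAgent rather than code_puppy's agent class, and assuming get_message_history() hands back the agent's live list:

```python
class StubAgent:
    """Hypothetical stand-in exposing only the two history methods used above."""

    def __init__(self, name):
        self.name = name
        self._history = []

    def get_message_history(self):
        return self._history  # assumption: returns the live list, not a copy

    def set_message_history(self, history):
        self._history = history


old = StubAgent("code-puppy")
old.get_message_history().extend(["user: fix the bug", "assistant: done"])
new = StubAgent("qa-expert")

# Shallow copy: the new agent gets its own list with the same messages.
snapshot = list(old.get_message_history())
new.set_message_history(snapshot)

# Appending for the new agent leaves the old agent's history untouched.
new.get_message_history().append("user: now write tests")
```

Without the copy, both agents would hold the same list object, and messages added after the handoff would also appear in the previous agent's history.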

Files Modified

File | Changes
code_puppy/command_line/core_commands.py | Added handle_switch_command() (~114 lines)
code_puppy/command_line/prompt_toolkit_completion.py | Added AgentCompleter triggers for /switch and /sw
tests/command_line/test_switch_command.py | 18 comprehensive tests for the new command

Testing

Automated Tests (18 tests)

Test Class | Count | Coverage
TestSwitchCommandNoArgs | 5 | Display current/available agents, markers, usage hints
TestSwitchCommandWithAgent | 7 | Successful switch, history transfer, error handling
TestSwitchCommandFailures | 3 | Invalid agent, too many args, switch failures
TestSwitchCommandHistoryTransfer | 3 | Empty/large history, copy behavior

pytest tests/command_line/test_switch_command.py -v  # All 18 pass ✅

Manual Testing

  • /switch without args shows available agents
  • /switch <valid-agent> successfully switches with context
  • /sw <agent> alias works correctly
  • ✅ Tab completion works for agent names

Breaking Changes

None. This is a purely additive feature.

Comparison with /agent

Command | Fresh Session | Preserves Context
/agent | Yes | No
/switch | No | Yes

mpfaffenberger and others added 30 commits October 19, 2025 15:07
- Add matrix strategy running tests on Ubuntu, Windows, and macOS
- Support Python 3.11, 3.12, 3.13 with strategic exclusions for CI efficiency
- Add Windows-specific dependencies (colorama for console output)
- Skip problematic pexpect interactive CLI tests on Windows
- Fail-fast disabled to see results from all platforms
- Maintain test gating before publishing

Now your code works everywhere! 🌍💻🍎

BREAKING CHANGE: Integration tests now FAIL FAST when env vars are missing!

- Remove all pytestmark skipif conditions for API keys from integration tests
- Require CEREBRAS_API_KEY, CONTEXT7_API_KEY, OPENAI_API_KEY, ANTHROPIC_API_KEY
- Update CLI harness to require env vars instead of falling back to fake keys
- Add CI fallbacks in GitHub workflow to prevent test failures when secrets unavailable
- Make integration tests explicit about their requirements - no silent skipping

Integration tests now have teeth! 🦷 If env vars are missing, tests will FAIL loudly instead of silently skipping. No more false confidence from skipped tests!

- Add debug step to show which secrets are actually set in CI
- Display secret presence and length without revealing actual values
- Help diagnose why CEREBRAS_API_KEY and CONTEXT7_API_KEY aren't being set
- Temporarily add logging to identify the secret configuration issue

This will help us figure out why GitHub secrets aren't making it to the environment!

- Remove duplicate integration test step (unit tests already run them)
- Move environment variable setup before the unified test step
- Keep debug step to diagnose secret configuration issues
- Run all tests (unit + integration) in single pytest command
- Environment variables now available for both unit and integration tests

Cleaner, faster CI with proper env var ordering!

- Remove slow pre-push pytest hook to speed up local development
- Tests now run exclusively in CI where they belong
- Keep pre-commit hooks for linting and formatting (fast, local feedback)
- Faster git pushes while maintaining quality gates in CI

No more waiting for tests on every push! 🚀

Changed ripgrep dependency specification from ">=14.1.0" to "==14.1.0" to ensure consistent behavior across all installations and prevent potential compatibility issues with future versions.

- Skip integration tests in CI workflow by adding --ignore=tests/integration
- Remove pexpect from dev-dependencies and dependency-groups.dev
- This will make CI faster and more reliable by only running unit tests
- Remove Windows builds from CI matrix (windows-latest removed)
- Remove Python 3.11 and 3.12 from matrix, keep only 3.13
- Remove Windows-specific colorama dependency step
- Update build-publish job to use Python 3.13
- This reduces CI from 7 jobs to just 2, much faster!
- Add pexpect>=4.9.0 installation step in CI workflow
- Keeps pexpect available for any integration tests while keeping it out of pyproject.toml
- Installed inline after main dependencies but before test execution
… add invoke_agent and list_agent collaboration capabilities to all reviewers (mpfaffenberger#59)

- Implement agent collaboration framework with domain-specific integration patterns and clear escalation protocols
- Enhance technical depth across all agents with specialized engineering sections covering modern frameworks, patterns, and best practices
- Integrate modern development tooling with specific command syntax and actionable parameters
- Add structured checklists with checkbox format for quality assurance, security validation, and performance optimization
- Implement comprehensive metrics and KPIs framework using industry standards (CVSS v4.0, OWASP ASVS) with quantifiable thresholds
- Standardize verdict terminology and wrap-up consistency while maintaining agent personality
- Expand security auditor with risk quantification, threat modeling, and compliance frameworks
- Enhance QA expert with advanced testing methodologies, mutation testing, and chaos engineering patterns
- Add future-looking technology considerations and enterprise-level expertise to all specializations
Ran automated linters and code formatters across the codebase:
- Standardized import ordering (stdlib, third-party, local imports)
- Fixed line length violations and improved line break positioning
- Normalized whitespace usage between functions, classes, and blocks
- Applied consistent formatting to test assertions and function calls
- Removed trailing whitespace from agent instruction files
- Improved readability of multi-line conditionals and function signatures
- Reorganized conftest.py for better structure and clarity
- Re-enable integration tests in publish workflow (removed --ignore flag)
- Add comprehensive test matrix (Ubuntu/macOS) to PR workflow
- Both workflows now run full test suite including integration tests
- Maintain separate quality checks job for linting/formatting
- Ensure cross-platform compatibility before merging and publishing

This catches breaking changes early and prevents regressions from reaching production!

- test_interactive_smoke: handle initial prompts that appear in CI
- test_mcp_context7_end_to_end: increase timeout for logs command
- Make tests more resilient to different startup behaviors
- Tests now handle both configured and fresh startup scenarios

This fixes the 2 failing integration tests that were timing out in GitHub Actions CI but passing locally.

- Skip test_file_operations_integration: depends on real LLM calls and timing
- Skip test_interactive_smoke: startup timing issues in CI environment
- Keep these tests for local development but exclude from CI pipeline
- Focus CI on stable unit tests and less fragile integration tests

This provides a more reliable CI pipeline while maintaining test coverage locally.

- Skip test_mcp_context7_end_to_end: depends on real MCP server calls and timing
- Last remaining flaky integration test that was failing in CI
- All other integration tests are now stable or properly skipped
- CI pipeline should now be reliable for both Ubuntu and macOS

This completes the CI stabilization effort - focus on stable unit and integration tests while keeping comprehensive tests for local development.

- Remove pytestmark skips from all three integration tests
- Add robust timeout handling with fallback checks
- Increase timeouts for CI environments (120→180s, 60→90s, etc.)
- Add conditional assertions for MCP tool calls in CI
- Make interactive smoke test more resilient to startup timing
- Add better error handling and logging throughout

Tests now run consistently without skips while maintaining reliability.

- Add explicit file creation verification with assertions
- Add small delay to ensure filesystem operations complete
- Add robust debugging info for test directory creation
- Make file listing assertion more flexible for CI environments
- Add fallback filesystem verification if agent reports empty directory
- Better error reporting for actual filesystem issues
- Fix missing os import

Should fix the Ubuntu CI test failure where agent reported empty test directory.

mpfaffenberger and others added 23 commits November 29, 2025 21:26
Introduce a comprehensive terminal UI for browsing and installing MCP servers with a split-panel interface that displays categories on the left and detailed information on the right. The new system provides an intuitive way to explore available servers, view their requirements, and install them with minimal configuration.

Key additions:
- Split-panel TUI with category browsing and server details preview
- Support for both catalog servers and custom server configurations
- Interactive form for adding custom MCP servers with JSON validation
- Paginated navigation for large server catalogs
- Real-time validation and helpful hints for environment variables
- Seamless integration with existing MCP manager and configuration persistence

The interface improves the user experience by replacing the previous wizard-style installation with a more visual and discoverable approach, making it easier to find and install the right MCP server for specific needs.
…-code support

- Add /mcp edit command with TUI form for modifying existing server configurations
- Create model_utils module to centralize claude-code model handling across all agents
- Implement MCP tool cache in BaseAgent for accurate context overhead token estimation
- Add estimate_context_overhead_tokens() method accounting for system prompts and tool definitions
- Refactor all agents to use prepare_prompt_for_model() utility for consistent claude-code handling
- Update token estimation to include MCP tools and refresh cache after server start/stop operations
- Fix spinner display artifacts by setting transient=True and reordering approval message flow
- Add comprehensive test coverage for model_utils and MCP tool cache functionality
- Enhance custom server form with syntax highlighting and support for edit mode
- Remove excessive logging from ManagedMCPServer to reduce noise

The autosave was missing the latest message because get_message_history()
relies on a history_processors callback that may not capture the final
assistant response before autosave triggers.

Fixed by explicitly updating message history with result.all_messages()
after each agent run, which contains the complete conversation including
the final response. Applied fix to both:
- Main interactive loop
- Initial command handling

- Changed risk field type to exclude None values, which was causing incorrect None assignments instead of "none" for low-risk commands
- Fixes buggy behavior where simple commands like pytest would receive None risk assessments instead of proper "none" risk levels
- Ensures ShellSafetyAssessment always returns a valid risk classification from the defined enum values

Co-authored-by: cellwebb <[email protected]>

- Implement new error_logging module to write detailed error information to ~/.code_puppy/logs/errors.log
- Add structured logging for exceptions with timestamps, context, and optional tracebacks
- Integrate error logging into BaseAgent exception handling for better debugging visibility
- Add monkey patches in main.py to handle Pydantic AI message history validation issues
- Include comprehensive test suite covering all error logging functionality
- Ensure graceful failure when logging system itself encounters errors
- Replace manual random suffixes with automatic SHA1 hash generation for session IDs
- Update session management to return full session IDs in responses for continuation
- Add inline validation and status messages for custom server form
- Refine documentation to clarify session creation vs continuation patterns
- Update tests to validate new hash-based session ID generation

This change simplifies session ID creation for users while ensuring uniqueness through timestamp-based hashing. Users now provide base names for new sessions and use the returned full session ID for continuation, reducing confusion and improving the overall agent invocation workflow.
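
One way the timestamp-based SHA1 scheme described in this commit message could look is sketched below. The function name make_session_id, the "base-hash" layout, and the 8-character truncation are assumptions for illustration, not the project's implementation.

```python
import hashlib
import time
from typing import Optional


def make_session_id(base_name: str, now: Optional[float] = None) -> str:
    """Append a short SHA1-derived suffix to a user-supplied base name."""
    # Use the current time unless a timestamp is injected (handy for tests).
    timestamp = time.time() if now is None else now
    digest = hashlib.sha1(f"{base_name}-{timestamp}".encode("utf-8")).hexdigest()
    # Assumption: 8 hex characters are enough to tell sessions apart.
    return f"{base_name}-{digest[:8]}"
```

A caller would pass only the base name when creating a session, then reuse the returned full ID to continue it.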
…open

- Refactor base agent edge case tests to mock pathlib.Path.read_text for file operations
- Add "cancel_agent_key" to expected config keys in test configuration
- Improve test accuracy by matching actual implementation that uses pathlib methods

Add /switch (alias /sw) command that transfers conversation history when
switching agents, unlike /agent which starts a fresh session.

- Captures message history from current agent before switching
- Transfers full history to new agent after switch
- New agent can see and build upon previous agent's work
- Tab completion support for /switch and /sw

Usage:
  /switch python-pro    # Switch with full context
  /sw qa-expert         # Short alias

thedaneeffect commented Dec 7, 2025

Partially addresses #74 — introduces manual agent handoff with full message history preservation via /switch command.

Remaining for full #74: automatic persistence across restarts, planning agent orchestration, and central session knowledge base.

thedaneeffect marked this pull request as draft December 7, 2025 03:34

Add 18 tests to TestHandleSwitchCommand in test_core_commands_extended.py
following the existing pattern of one test class per command.

Tests cover:
- Display current agent and available agents (5 tests)
- Successful switch with history transfer (7 tests)
- Error handling and edge cases (3 tests)
- History transfer behavior (3 tests)

All tests follow existing patterns:
- Single test file with class-per-command organization
- Module-level imports from core_commands
- Consistent mock/patch patterns
- Descriptive docstrings
@mpfaffenberger
Owner

I think for the remaining stuff in #74 I've largely tackled this already. The planning agent and other orchestrators (like if you make your own using the agent-creator, or just have code-puppy dispatch sub-agents) can invoke their sub-agents using a session ID optionally if they want to have a full multi-turn convo. Otherwise they can omit the session ID and do a fresh memory for the sub-agent.

We also have persistence across restarts with /resume - I don't think I had added that when @diegonix opened the issue. Let's see if that solves it for him.

As for turning into a knowledge base, that's more complex, and maybe should be considered against a number of different factors/aspects. I'm also thinking some major decoupling should be in place for any type of KB system to be integrated - I'd like it to live in a plugin.
