feat: add session logging with JSON configuration #135
Open
perdue
wants to merge
835
commits into
mpfaffenberger:main
from
perdue:perdue/feature/add_session_logging
+89,738
−2,729
- Re-enable integration tests in publish workflow (removed --ignore flag)
- Add comprehensive test matrix (Ubuntu/macOS) to PR workflow
- Both workflows now run full test suite including integration tests
- Maintain separate quality checks job for linting/formatting
- Ensure cross-platform compatibility before merging and publishing

This catches breaking changes early and prevents regressions from reaching production!

- test_interactive_smoke: handle initial prompts that appear in CI
- test_mcp_context7_end_to_end: increase timeout for logs command
- Make tests more resilient to different startup behaviors
- Tests now handle both configured and fresh startup scenarios

This fixes the 2 failing integration tests that were timing out in GitHub Actions CI but passing locally.

- Skip test_file_operations_integration: depends on real LLM calls and timing
- Skip test_interactive_smoke: startup timing issues in CI environment
- Keep these tests for local development but exclude from CI pipeline
- Focus CI on stable unit tests and less fragile integration tests

This provides a more reliable CI pipeline while maintaining test coverage locally.

- Skip test_mcp_context7_end_to_end: depends on real MCP server calls and timing
- Last remaining flaky integration test that was failing in CI
- All other integration tests are now stable or properly skipped
- CI pipeline should now be reliable for both Ubuntu and macOS

This completes the CI stabilization effort - focus on stable unit and integration tests while keeping comprehensive tests for local development.

- Remove pytestmark skips from all three integration tests
- Add robust timeout handling with fallback checks
- Increase timeouts for CI environments (120→180s, 60→90s, etc.)
- Add conditional assertions for MCP tool calls in CI
- Make interactive smoke test more resilient to startup timing
- Add better error handling and logging throughout

Tests now run consistently without skips while maintaining reliability.

- Add explicit file creation verification with assertions
- Add small delay to ensure filesystem operations complete
- Add robust debugging info for test directory creation
- Make file listing assertion more flexible for CI environments
- Add fallback filesystem verification if agent reports empty directory
- Better error reporting for actual filesystem issues
- Fix missing os import

Should fix the Ubuntu CI test failure where agent reported empty test directory.
* feat: Adding edit file permission feature
* add syntax highlighting to preview
* make file permission edit a plugin
* clean up
* unit test
* atomic messaging
* fix: more decoupling
* fix import
* add more explicit prompt for rejection
* fix the tests
* remove unnecessary args
* remove some misleading prompt
- Remove unused imports (Dict, Optional, emit_error) from file_permission_handler - Add newline at end of __init__.py to comply with POSIX standards - Fix trailing whitespace issues throughout test files - Improve line break consistency in test assertions and error messages - Reformat multi-line strings for better readability in test output
Add comprehensive diff colorization system with user-configurable colors and intelligent foreground/background pairing for maximum contrast. Users can now customize how code diffs are displayed through new `/diff` commands. - Implement `/diff` command with subcommands (style, additions, deletions, show) - Add intelligent color pair system that automatically selects optimal foreground colors based on background choice for accessibility - Support both 'text' mode (plain colored text) and 'highlighted' mode (background highlighting with contrast-optimized text) - Provide curated color recommendations with visual previews for additions (greens) and deletions (oranges/reds) - Add comprehensive color catalog showing all available Rich colors organized by category - Store user preferences in config with new getter/setter functions (get_diff_addition_color, get_diff_deletion_color, get_diff_highlight_style) - Apply colorization to all diff outputs in file_modifications.py - Include live preview of diff appearance when changing settings - Fix config directory initialization in command history setup - Add test retry mechanism to improve resilience of file operation integration tests
…ules - Standardize line breaks and whitespace in command_handler.py color display logic - Improve formatting of multi-line function calls and string formatting - Clean up diff colorization functions in file_modifications.py for better readability - Enhance consistency in config.py diff style example generation - Add better line breaks in test retry logic for improved clarity - Maintain consistent spacing around operators and function definitions - No functional changes, purely formatting improvements for maintainability
Replace automatic autosave restoration on startup with a new /autosave_load command that allows users to interactively load previous sessions on demand. This gives users more control over when and whether to restore autosaved sessions. - Add /autosave_load command with special async handler marker - Remove automatic restore_autosave_interactively call from startup - Handle async autosave loading through command system - Update help text to inform users about manual autosave loading - Fix minor formatting issues in /diff command help text
- Added system message to inform users about the /diff command - Helps users discover diff highlighting color configuration option - Improves user experience by making configuration features more visible
Updated both CLI autosave integration tests to properly wait for CLI readiness and manually trigger autosave loading with the /autosave_load command. This ensures the autosave session picker appears consistently rather than relying on automatic detection, which could be racy.
The changes make the tests more reliable by:
- Adding explicit wait_for_ready() calls before expecting autosave UI
- Using sendline("/autosave_load\r") to manually trigger the autosave picker
- Improving test stability by removing race conditions in autosave detection
- Update .python-version to pin to Python 3.13 - Add UV-managed Python installation instructions to README - Include permanent configuration options and verification commands - Ensure users get latest Python version instead of old system versions
- Remove .python-version file to let UV handle versions naturally - Update README with one-time setup: export UV_MANAGED_PYTHON=1 - Users can now just run 'uvx code-puppy -i' without special flags - UV automatically downloads latest compatible Python (3.13.3) when needed - Tested: works perfectly even with old system Python (3.9.6)
Changed requires-python from '>=3.11' to '>=3.11,<3.14' to prevent installation with Python 3.14+ when released. The uv.lock file was updated to reflect this dependency constraint change.
…calls (mpfaffenberger#60)

* fix: race condition between automatic summarization and pending tool calls

Implement comprehensive solution to prevent "Cannot provide a new user prompt when the message history contains unprocessed tool calls" errors during automatic message compaction.

Key changes:
- Add has_pending_tool_calls() method to detect incomplete tool call sequences
- Implement delayed compaction queuing system with request_delayed_compaction() and should_attempt_delayed_compaction()
- Add race condition protection in compact_messages() before initiating summarization
- Include automatic retry mechanism in run_agent_task() after tool calls complete
- Add get_pending_tool_call_count() for debugging and user feedback
- Implement global _delayed_compaction_requested flag for cross-call coordination
- Provide comprehensive error handling with informative warning messages

Features:
- Detects ToolCallParts without corresponding ToolReturnParts
- Queues compaction requests when tool calls are pending execution
- Automatically retries compaction after tool completion without user intervention
- Maintains 100% backward compatibility with existing configurations
- Provides clear user feedback during deferred compaction scenarios

Testing:
- Comprehensive test suite validates syntax, method implementation, race condition protection, integration points, error handling, and logic correctness
- All 6 test categories pass with 100% success rate
- Mock testing confirms accurate detection of 4 test scenarios including complete, pending, empty, and mixed tool call states

This fix eliminates crashes while preserving automatic summarization functionality and maintaining system stability during high-volume tool execution scenarios.

* style: fix E402 import order - move global variable after imports
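The detection rule in this commit (a tool call with no matching tool return means work is still pending) could be sketched roughly as follows. The two dataclasses are simplified stand-ins, not the project's real message types, and the `tool_call_id` field name is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToolCallPart:
    # Stand-in for the real message part; field name is assumed.
    tool_call_id: str


@dataclass
class ToolReturnPart:
    # Stand-in for the real message part; field name is assumed.
    tool_call_id: str


def pending_tool_call_ids(parts):
    """Return IDs of tool calls that have no matching tool return."""
    called = [p.tool_call_id for p in parts if isinstance(p, ToolCallPart)]
    returned = {p.tool_call_id for p in parts if isinstance(p, ToolReturnPart)}
    return [call_id for call_id in called if call_id not in returned]


def has_pending_tool_calls(parts):
    """True when at least one tool call is still waiting for its return."""
    return bool(pending_tool_call_ids(parts))
```

A compaction routine would check `has_pending_tool_calls(...)` first and defer summarization while it returns true.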
* add context diff line config * format
Introduce a new plugin system that allows users to create custom slash commands by placing markdown files in specific directories. This feature enables users to define reusable prompts and commands without modifying the core codebase. - Add customizable_commands plugin with automatic discovery and registration - Support command files in .claude/commands/, .github/prompts/, and .agents/commands/ directories - Implement MarkdownCommandResult class for seamless integration with existing command handler - Update command handler to process markdown command results as input - Add comprehensive documentation with usage examples in README - Include message limit configuration for agent tool invocations
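As a rough illustration of the discovery step, the sketch below scans the three directories named above for markdown files. It assumes the command name is the file stem and that the first directory to define a name wins; neither rule is confirmed by the commit message, and `discover_commands` is a hypothetical name.

```python
from pathlib import Path

# Directories listed in the commit message, relative to the project root.
COMMAND_DIRS = (".claude/commands", ".github/prompts", ".agents/commands")


def discover_commands(project_root):
    """Map command name -> prompt text for every markdown file found.

    Assumptions: the name is the file stem, and earlier directories
    take precedence when two files share a name.
    """
    commands = {}
    for rel in COMMAND_DIRS:
        directory = Path(project_root) / rel
        if not directory.is_dir():
            continue
        for md_file in sorted(directory.glob("*.md")):
            commands.setdefault(md_file.stem, md_file.read_text(encoding="utf-8"))
    return commands
```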
…broad exception handlers The autosave calls were being swallowed by overly broad try/except blocks in both main.py (interactive mode) and tui/app.py (TUI mode), causing autosaves to fail silently without user notification. - Move auto_save_session_if_enabled() call outside the broad exception handler in main.py - Remove silent exception swallowing in TUI mode autosave call - Ensures autosave errors are properly visible to users - Fixes regression caused by race condition fix that introduced overly broad exception handling
- Add new auto_save_session config option with default value "true" - Display auto_save_session status in command handler output - Ensure auto_save_session is set when creating/updating config files - Show current value as enabled/disabled in status information
- Replace manual random suffixes with automatic SHA1 hash generation for session IDs
- Update session management to return full session IDs in responses for continuation
- Add inline validation and status messages for custom server form
- Refine documentation to clarify session creation vs continuation patterns
- Update tests to validate new hash-based session ID generation

This change simplifies session ID creation for users while ensuring uniqueness through timestamp-based hashing. Users now provide base names for new sessions and use the returned full session ID for continuation, reducing confusion and improving the overall agent invocation workflow.
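One way to picture the timestamp-based hashing is the sketch below. The function name, the `-` separator, and the 8-character suffix length are all assumptions for illustration; the commit message only states that a SHA1 hash of a timestamp replaces the manual random suffix.

```python
import hashlib
import time


def make_session_id(base_name, timestamp=None, length=8):
    """Append a short SHA1-derived suffix to a user-supplied base name.

    The separator and suffix length are illustrative assumptions.
    """
    ts = time.time() if timestamp is None else timestamp
    digest = hashlib.sha1(f"{base_name}-{ts}".encode("utf-8")).hexdigest()
    return f"{base_name}-{digest[:length]}"
```

The caller would pass only the base name when starting a session and reuse the returned full ID to continue it.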
…open - Refactor base agent edge case tests to mock pathlib.Path.read_text for file operations - Add "cancel_agent_key" to expected config keys in test configuration - Improve test accuracy by matching actual implementation that uses pathlib methods
…berger#131) * feat: enhance command completions with model pinning status - Add display of pinned model in agent list and agent completion meta - Show agents pinned to each model in model completion meta - Support for both built-in and JSON agent pinning status display - Add utility functions to retrieve model-agent pinnings from config - Extend command completion to provide context-aware suggestions - Add comprehensive test coverage for new pinning utilities * fix: resolve ruff linting errors Remove unused import and prefix unused variable with underscore. * style: fix ruff formatting and import ordering
* feat: Add XDG Base Directory Specification support - Add XDG support with backwards compatibility for ~/.code_puppy - Separate files by purpose: config, data, cache, and state - Configuration files → XDG_CONFIG_HOME (~/.config/code_puppy/) - Data files → XDG_DATA_HOME (~/.local/share/code_puppy/) - Cache files → XDG_CACHE_HOME (~/.cache/code_puppy/) - State files → XDG_STATE_HOME (~/.local/state/code_puppy/) - Fallback to ~/.code_puppy if it exists for seamless migration - Update subagent sessions and terminal sessions to use appropriate XDG directories Fixes mpfaffenberger#126 * fix: only use legacy dir if puppy.cfg exists The backward compatibility check was triggering on empty ~/.code_puppy directories created by mkdir. Now it only uses the legacy directory if puppy.cfg actually exists there, preventing false detection of migrated configurations. * fix: Complete XDG directory migration for all remaining hardcoded paths - Update mcp_/registry.py to use XDG_DATA_HOME for mcp_registry.json - Update chatgpt_oauth plugin to use XDG directories - Update claude_code_oauth plugin to use XDG directories - Update browser_workflows to use XDG_DATA_HOME - Update camoufox_manager to use XDG_CACHE_HOME for browser profile - Update motd.py docstring to reference XDG paths This completes the XDG Base Directory Specification migration, ensuring no code creates ~/.code_puppy anymore. All data now properly stored in: - XDG_CONFIG_HOME/code_puppy (~/.config/code_puppy) - XDG_DATA_HOME/code_puppy (~/.local/share/code_puppy) - XDG_CACHE_HOME/code_puppy (~/.cache/code_puppy) - XDG_STATE_HOME/code_puppy (~/.local/state/code_puppy) * docs: Update docstrings to reflect XDG paths instead of ~/.code_puppy - Update error_logging.py to reference XDG_CONFIG_HOME - Update ServerRegistry class docstring to reference XDG_DATA_HOME - Update load_mcp_server_configs docstring to reference XDG_CONFIG_HOME No code changes, only documentation updates for accuracy. 
* test: Update test_config.py for XDG directory changes - Update mock_config_paths fixture to mock all XDG directories (DATA_DIR, CACHE_DIR, STATE_DIR) - Fix test_no_config_dir_or_file_prompts_and_creates to expect 4 makedirs calls - Fix test_config_dir_exists_file_does_not_prompts_and_creates for XDG loop The ensure_config_exists() function now creates 4 XDG directories instead of 1, so the tests need to be updated to handle the new behavior. * fix: XDG spec compliance - correct directory permissions and log location - Move logs from CONFIG_DIR to STATE_DIR per XDG spec (logs are state data) - Add mode=0o700 to all mkdir/makedirs calls per XDG spec requirement - Fix test assertions expecting '.code_puppy' in paths (now 'code_puppy') Files changed: - code_puppy/error_logging.py: STATE_DIR for logs, 0o700 perms - code_puppy/config.py: 0o700 perms for XDG directories - code_puppy/plugins/chatgpt_oauth/config.py: 0o700 perms - code_puppy/plugins/claude_code_oauth/config.py: 0o700 perms - code_puppy/plugins/chatgpt_oauth/test_plugin.py: fix path assertions - code_puppy/mcp_/registry.py: 0o700 perms - code_puppy/tools/agent_tools.py: 0o700 perms - code_puppy/tools/browser/browser_workflows.py: 0o700 perms - code_puppy/tools/browser/camoufox_manager.py: 0o700 perms
- Check CONFIG_DIR (~/.code_puppy/) for global AGENTS.md files
- Combine global and project AGENTS.md when both exist
- Global rules load first, project rules second (allowing override)
- Fix claude-code models to include AGENTS.md in first message prepend
- Maintains backward compatibility with project-only AGENTS.md
- Supports all variants: AGENTS.md, AGENT.md, agents.md, agent.md

This enables users to:
1. Define global coding standards in ~/.code_puppy/AGENTS.md
2. Override or extend with project-specific rules in ./AGENTS.md
3. Use either global-only or project-only configurations
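The load order described here (global rules first, project rules second) might look like the following sketch. The helper names and the blank-line separator are assumptions; only the file-name variants and the ordering come from the commit message.

```python
from pathlib import Path

# File-name variants listed in the commit message.
VARIANTS = ("AGENTS.md", "AGENT.md", "agents.md", "agent.md")


def read_first_rules(directory):
    """Return the text of the first variant found, or None if absent or empty."""
    for name in VARIANTS:
        path = Path(directory) / name
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            return text or None
    return None


def load_rules(global_dir, project_dir):
    """Combine global rules, then project rules; None when neither exists."""
    parts = [read_first_rules(global_dir), read_first_rules(project_dir)]
    parts = [p for p in parts if p]
    return "\n\n".join(parts) if parts else None
```

Placing project rules last lets them extend or override the global ones when the combined text is read top to bottom.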
- Fix load_puppy_rules to return None when files contain empty content, as empty strings are filtered out by list comprehension - Update test mocking in test_base_agent_reload to patch get_use_dbos in the correct module location - Correct OAuth integration tests to handle dynamically set token_storage field - Improve browser workflow and camoufox manager tests by patching config directly rather than mocking home directory - Add proper cleanup and restoration of config values in test teardown
- Relocate automatic max_tokens calculation from BaseAgent to make_model_settings function
- Update calculation to use 15% of context length (increased from 5%) with minimum of 2048 and maximum of 65536
- Simplify BaseAgent by removing manual max_tokens computation and hardcoded logic
- Improve test coverage by mocking model config instead of get_model_context_length method
- Make max_tokens parameter optional in make_model_settings with automatic fallback
- Enhance error handling with fallback context length for CI environments
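The clamp described above (15% of the context length, bounded to 2048..65536) can be written out as a small sketch. The function name `auto_max_tokens` is hypothetical, and integer arithmetic is an implementation choice made here to avoid float rounding.

```python
def auto_max_tokens(context_length, percent=15, floor=2048, ceiling=65536):
    """Take a percentage of the context window, clamped to [floor, ceiling]."""
    share = context_length * percent // 100
    return max(floor, min(share, ceiling))
```

For a 128,000-token context this yields 19,200; small contexts fall back to the 2048 floor and very large ones are capped at 65,536.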
The recent XDG support commit inverted the default behavior, causing tests to fail on fresh CI environments that don't have existing config.

Changes:
- Default to ~/.code_puppy for all file types (config, data, cache, state)
- XDG paths are now opt-in only when XDG env vars are explicitly set
- Updated outdated comments about XDG path behavior

Fixes failing tests:
- test_model_command_with_valid_argument
- test_model_command_m_alias
- test_model_command_with_unicode_model_name
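A minimal sketch of the opt-in behavior after this fix: each directory kind uses its XDG variable only when that variable is set, and otherwise falls back to ~/.code_puppy. The function name `resolve_dir` and its `env`/`home` parameters are hypothetical and exist only to make the sketch testable.

```python
import os
from pathlib import Path

# Standard XDG environment variable for each directory kind.
XDG_VARS = {
    "config": "XDG_CONFIG_HOME",
    "data": "XDG_DATA_HOME",
    "cache": "XDG_CACHE_HOME",
    "state": "XDG_STATE_HOME",
}


def resolve_dir(kind, env=None, home=None):
    """Return the directory for `kind`, using XDG only when explicitly set."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    base = env.get(XDG_VARS[kind])
    if base:
        return Path(base) / "code_puppy"
    return home / ".code_puppy"
```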
* Code cleanup and improvements - Updated AGENTS.md - Modified base_agent.py - Updated MCP commands (add, edit, install) - Updated messaging module and renderers - Updated command_runner.py - Removed tui_state.py - Updated pyproject.toml - Updated tests * Update pyproject.toml and uv.lock * chore: add uvicorn and fastapi back to dependencies * build: add FastAPI and Uvicorn dependencies - Added FastAPI (0.124.0) to dependencies to enable web API capabilities - Added Uvicorn (>=0.30.0) as ASGI server for running FastAPI applications - Updated uv.lock to reflect new package versions and transitive dependencies including annotated-doc
- Delete entire ACP module implementation including transport layer, handlers, and state management - Remove ACP-specific code from main entry point and command line argument parsing - Drop agent-client-protocol dependency from project configuration - Remove comprehensive test suite for ACP functionality - Clean up package imports and dependency lock file This change removes the experimental ACP integration that allowed Code Puppy to work with Zed editor and other ACP-compatible clients. The feature was removed to simplify the codebase and focus development resources on core functionality.
- Add comprehensive Gemini OAuth model support with token management - Implement user plugins directory (~/.code_puppy/plugins/) for extensibility - Prevent duplicate plugin loading and callback registration issues - Add centralized OAuth model file paths in config module - Improve plugin loading with separate builtin and user plugin handling - Add Gemini branding for OAuth success pages - Remove duplicate plugin loading code from command handler
- Remove deprecated mock patches for get_chatgpt_models_path and get_claude_models_path functions - Update load_claude_models_filtered import path to new plugin location - Replace simple Path.exists mocking with more sophisticated path resolution logic - Add proper path string matching to distinguish between different model files - Maintain test coverage while adapting to refactored plugin architecture
* Code cleanup and improvements
- Updated AGENTS.md
- Modified base_agent.py
- Updated MCP commands (add, edit, install)
- Updated messaging module and renderers
- Updated command_runner.py
- Removed tui_state.py
- Updated pyproject.toml
- Updated tests

* Update pyproject.toml and uv.lock

* chore: add uvicorn and fastapi back to dependencies

* build: add FastAPI and Uvicorn dependencies
- Added FastAPI (0.124.0) to dependencies to enable web API capabilities
- Added Uvicorn (>=0.30.0) as ASGI server for running FastAPI applications
- Updated uv.lock to reflect new package versions and transitive dependencies including annotated-doc

* feat: Complete messaging system refactor for UI decoupling
- Add structured message types (messages.py) with 15+ Pydantic models
- Add bidirectional command types (commands.py) for User→Agent communication
- Add MessageBus coordinator (bus.py) with request/response correlation
- Add RichConsoleRenderer (rich_renderer.py) for all presentation logic
- Migrate tools to emit structured messages:
  - file_operations.py: FileListingMessage, FileContentMessage, GrepResultMessage
  - command_runner.py: ShellOutputMessage, AgentReasoningMessage
  - file_modifications.py: DiffMessage with DiffLine objects
- Remove Rich markup from ALL emit_* calls across codebase:
  - version_checker.py, config.py, status_display.py, session_storage.py
  - main.py, tools/common.py, agents/base_agent.py
  - All browser tools (8 files)
  - All MCP commands (14+ files)
  - command_line modules
- Replace direct print() calls with emit_* or sys.stderr.write()
- Remove unused Console() instantiations
- Fix test_reload_puppy_rules_appended test
- Fix test_save_command_to_history_handles_error test

Architecture: Messages flow Agent→UI, Commands flow UI→Agent
All styling decisions now made by renderer, not by emitting code

* fix: Update RichConsoleRenderer to match old Rich-formatted output
- _render_file_listing: Add DIRECTORY LISTING header, recursive flag, tree structure with file icons, KB/MB/GB size format, summary section
- _render_file_content: Add READ FILE header with path and line range
- _render_grep_result: Add GREP header, verbose/concise modes, match highlighting, dividers, and summary with match counts
- _render_diff: Add EDIT FILE header, operation icons, proper colors
- _render_shell_output: Add SHELL COMMAND header with $ prefix, timing
- _render_agent_reasoning: Add AGENT REASONING header with purple bg, markdown rendering, dividers
- Add _get_file_icon helper with 60+ file type icons
- Update _format_size to use KB/MB/GB format like old output
- Add verbose field to GrepResultMessage
- Fix line length issues throughout messaging module

* fix: Use stdlib queue.Queue instead of asyncio.Queue in MessageBus

The MessageBus was using asyncio.Queue which requires a running event loop, but tools emit messages from sync context before any event loop exists. This caused messages to be silently dropped or buffered forever.

Changes:
- Replace asyncio.Queue with queue.Queue (stdlib) for both outgoing and incoming queues
- Update emit() to use put_nowait() directly on sync queue
- Update get_message_nowait() to use queue.Empty instead of asyncio
- Update provide_response() to use sync queue operations
- Remove _ensure_queues() since queues are now created in __init__
- Wrap sync queue in async-friendly loop for get_message()/get_command()

Also adds debug logging to _render_sync() to verify message flow.

* fix: Remove debug logging and fix syntax error in rich_renderer.py
- Removed the temporary debug stderr logging from _render_sync()
- Fixed missing closing parenthesis in error message print statement

* feat: Emit agent responses through MessageBus for markdown rendering

Replace emit_system_message() calls for AGENT RESPONSE with proper AgentResponseMessage emission through the MessageBus. This enables the RichConsoleRenderer to render agent responses with:
- Nice header: [bold white on purple] AGENT RESPONSE [/bold white on purple]
- Markdown rendering for the content
- Divider lines before and after

Changes:
- main.py: Replace 3 occurrences of emit_system_message(AGENT RESPONSE) with AgentResponseMessage emission via get_message_bus().emit()
- rich_renderer.py: Update _render_agent_response() to show header, render markdown, and add dividers
- Remove unused emit_system_message import from execute_single_prompt()

* fix: Combine DIRECTORY LISTING header onto single line

The header was split across two lines, now it's on one line:
[bold white on blue] DIRECTORY LISTING [/bold white on blue] 📂 path (recursive=True)

* fix: Consolidate tool headers to single print calls
- _render_file_content: Build line_info first, then print header in one call
- _render_diff: Combine EDIT FILE header with operation/path on single line
- _render_grep_result, _render_shell_output: Already correct (single calls)
- _render_agent_reasoning, _render_agent_response: Headers already single lines (dividers are intentionally separate)

* fix: READ FILE only shows header, not content

The file content is for the LLM only, not for display in the UI. This matches the old behavior where only the header was shown.

Changes:
- _render_file_content: Only print header, remove syntax highlighting
- Remove unused Syntax import from rich
- Remove unused _get_lexer_for_extension helper method

* feat: Add structured sub-agent messages with markdown rendering
- Add SubAgentInvocationMessage and SubAgentResponseMessage types
- Add rich renderers for sub-agent invocation/response display
- Emit structured messages via MessageBus instead of direct console prints
- Render agent prompts as markdown for proper formatting

* feat: Add ShellStartMessage for immediate command feedback

Show command, working directory, and timeout when shell execution begins, before output starts streaming.
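The queue.Queue change in this commit series (emit from synchronous code, consume from async code by polling) can be illustrated with a stripped-down sketch. `MiniBus` is a hypothetical stand-in, not the project's MessageBus, and the polling interval is an arbitrary choice; this version also treats `None` as "no message", so it cannot carry `None` payloads.

```python
import asyncio
import queue


class MiniBus:
    """Tiny illustration: a sync-safe queue with an async-friendly reader."""

    def __init__(self):
        # Created eagerly, so emit() works before any event loop exists.
        self._outgoing = queue.Queue()

    def emit(self, message):
        """Enqueue from any synchronous context."""
        self._outgoing.put_nowait(message)

    def get_message_nowait(self):
        """Return the next message, or None when the queue is empty."""
        try:
            return self._outgoing.get_nowait()
        except queue.Empty:
            return None

    async def get_message(self, poll_interval=0.01):
        """Poll the sync queue without blocking the event loop."""
        while True:
            message = self.get_message_nowait()
            if message is not None:
                return message
            await asyncio.sleep(poll_interval)
```

Polling trades a little latency for simplicity; the design keeps producers free of any event-loop dependency.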
- Add defensive checks for None/empty version strings in __init__.py - Handle None current_version gracefully in version_checker with fallback - Replace console.print calls with emit_* functions in tests for consistency - Remove brittle __import__ patching in camoufox test that affected imports - Fix message format in stdio collector test for accurate assertions - Add comprehensive test coverage for version edge cases and error handling - Improve test isolation by mocking emit functions directly instead of console
- Make version messages and system messages consistently dim across all renderers - Remove excessive divider lines from RichConsoleRenderer to reduce visual clutter - Add quiet flag to pip install command to reduce noise during dependency installation - Update test expectations for OAuth headers to include additional User-Agent and x-app headers - Consolidate styling logic for version messages in InteractiveRenderer and SynchronousInteractiveRenderer These changes streamline the user interface by making secondary information (version checks, system messages, dividers) less prominent while maintaining readability of primary content.
Add comprehensive session logging feature to record interactive Code Puppy
sessions in markdown or JSON format for auditing, debugging, and analysis.
**Core Features:**
- Configurable session logging via ~/.code_puppy/session_logging.json
- Auto-creates default config file on first run
- Records user prompts, agent reasoning, responses, tool calls, and outputs
- Support for both markdown and JSON log formats
- Runtime on/off toggle via /session_logging command (persists to config)
**Human-Readable Session IDs:**
- Format: {repo_name}.{YYYYMMDD.HHMMSS}
- Uses git repository name when available
- Falls back to current directory name for non-git projects
- Example log file names (session ID plus format extension): "code_puppy.20251211.140948.md", "my-scripts.20251211.153022.md"
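A rough sketch of how such an ID could be built, assuming the repository name is taken from the top-level directory reported by `git rev-parse --show-toplevel`. The helper names are hypothetical, and the PR's actual lookup may differ.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def repo_or_dir_name(cwd):
    """Use the git top-level directory name, else the current directory name."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            cwd=cwd,
            capture_output=True,
            text=True,
            check=True,
        )
        return Path(result.stdout.strip()).name
    except (subprocess.CalledProcessError, OSError):
        # Not a git repository, or git is not installed.
        return Path(cwd).resolve().name


def session_id(name, now):
    """Format as {repo_name}.{YYYYMMDD.HHMMSS}."""
    return f"{name}.{now:%Y%m%d.%H%M%S}"
```

Calling `session_id(repo_or_dir_name("."), datetime.now())` and appending `.md` or `.json` would give a log file name in the style shown above.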
**Context Clearing:**
- Adds visual separators when /clear is used (🔄 CONTEXT CLEARED)
- Maintains session continuity in single log file
- Clear markers between conversation boundaries
**New Components:**
- code_puppy/session_logging/ package (logger, formatters, config schema)
- docs/SESSION_LOGGING.md (comprehensive documentation)
- session_logging.example.json (configuration reference)
- read_session.py (utility for reading/analyzing session logs)
- Full test coverage (test_session_logging.py, test_session_logging_command.py)