@x22x22 x22x22 commented Oct 30, 2025

TLDR

This PR implements a comprehensive Stream JSON migration that enables structured JSON I/O for the CLI, supporting both standard JSON and streaming JSON formats. It introduces a new non-interactive mode framework with session management and extensive test coverage. Key additions include:

  • Stream JSON Support: New --output-format stream-json and --input-format stream-json options for bidirectional JSON communication
  • Non-Interactive Framework: Complete refactoring of non-interactive CLI with session management and protocol handling
  • JSON Adapters: Three-tier adapter architecture (BaseJsonOutputAdapter, JsonOutputAdapter, StreamJsonOutputAdapter) for flexible output handling
  • Enhanced Configuration: Improved interactive/non-interactive mode detection with priority-based logic

Note: Control plane infrastructure is included but marked as under construction (SDK-relevant). See the Control Plane Architecture section for details.

Important for reviewers: This PR focuses on JSON/stream-json output functionality and resolves all known issues. SDK code has been moved to a separate branch to keep this PR focused on CLI output format capabilities.

Known Issues (all resolved in this PR)

  • User message of subagent execution
  • Unexpected tool use failure messages
  • Subagent fails to execute
  • No subagent execution message emitted
  • Tools permission denial tracking (Note: tool call permission denial in subagent will not be tracked for now.)
  • is_error state of failing tool use
  • Calculation of correct usage info
  • message_delta event emission

Todo Actions

  • Type System Alignment: Refine the types in nonInteractive/types.ts and align them with claude-code-like types for compatibility
  • Entry Point Refactoring: Refactor gemini.tsx and nonInteractiveCli.ts to better support the argument matrix:
    • [input-format, output-format] × [text, json, stream-json]
    • Graceful shutdown
  • Comprehensive Testing: Cover all input/output format combinations
  • Documentation & Examples: Create docs and examples for common use cases
    • Basic query/response flows
    • JSON output format usage examples
  • Behavior Comparison: Document the behavior differences with *

Control Plane Architecture (Under Construction)

⚠️ Important Note: The control plane architecture (ControlContext, ControlDispatcher, ControlService, and controllers) is SDK-relevant content and is currently under construction. While the infrastructure has been implemented in this PR to support future SDK integration, the control plane APIs and protocols are not yet finalized and may change in future PRs.

The control plane components are included in this PR to provide the foundation for bidirectional communication between the CLI and external SDK clients. Reviewers should nonetheless concentrate on the JSON output format functionality, which is the main focus of this PR.

Control Plane Components

  • ControlContext (control/ControlContext.ts): Shared session state (Layer 1)
  • ControlDispatcher (control/ControlDispatcher.ts): Protocol-level routing and message handling (Layer 2)
  • ControlService (control/ControlService.ts): Programmatic API for internal CLI usage (Layer 3)
  • Controllers (control/controllers/): Modular controllers for different domains
    • SystemController: System-level operations
    • PermissionController: Tool permission handling
    • McpController: MCP server integration
    • HookController: Hook execution management

Dive Deeper

Background: Existing Implementation

The previous commits (task1-2-2 through task1-2-5) implemented initial stream-json support in packages/cli/src/streamJson/:

  • types.ts: Message envelopes and type definitions
  • writer.ts: StreamJsonWriter for output serialization
  • controller.ts: StreamJsonController for control request handling
  • input.ts: Input parsing and validation
  • session.ts: Session management (runStreamJsonSession)
  • io.ts: I/O utilities (extractUserMessageText, writeStreamJsonEnvelope)

What's New: Non-Interactive Framework Architecture

This PR introduces a comprehensive non-interactive mode framework in packages/cli/src/nonInteractive/:

1. Non-Interactive Mode Framework (packages/cli/src/nonInteractive/):

  • Session Management (session.ts): SessionManager handles multi-turn conversations in stream-json mode
    • State machine for session lifecycle (initializing, idle, processing_query, shutting_down)
    • FIFO user message queue for sequential processing
    • Message routing (control vs user messages)
    • Graceful shutdown handling
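
As a rough illustration of the lifecycle and queue described above, here is a minimal sketch. Only the four state names come from this PR; the event names, the transition table, and the helpers `transition`, `UserMessageQueue`, and `drainQueue` are hypothetical and are not the actual SessionManager API.

```typescript
// Hypothetical sketch: event names and the transition table are assumptions;
// only the four state names are taken from the PR description.
type SessionState = 'initializing' | 'idle' | 'processing_query' | 'shutting_down';
type SessionEvent = 'initialized' | 'query_started' | 'query_finished' | 'shutdown_requested';

// Allowed transitions; any pair not listed here is rejected.
const TRANSITIONS: Record<SessionState, Partial<Record<SessionEvent, SessionState>>> = {
  initializing: { initialized: 'idle', shutdown_requested: 'shutting_down' },
  idle: { query_started: 'processing_query', shutdown_requested: 'shutting_down' },
  processing_query: { query_finished: 'idle', shutdown_requested: 'shutting_down' },
  shutting_down: {},
};

function transition(state: SessionState, event: SessionEvent): SessionState {
  const next = TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error(`Invalid transition: ${state} + ${event}`);
  }
  return next;
}

// FIFO queue: user messages are handled strictly in arrival order.
class UserMessageQueue {
  private readonly items: string[] = [];

  enqueue(text: string): void {
    this.items.push(text);
  }

  dequeue(): string | undefined {
    return this.items.shift();
  }
}

// Processes queued messages one at a time, moving idle -> processing_query -> idle
// for each message, and returns the final state.
function drainQueue(
  state: SessionState,
  queue: UserMessageQueue,
  handle: (text: string) => void,
): SessionState {
  let current = state;
  let text = queue.dequeue();
  while (text !== undefined) {
    current = transition(current, 'query_started');
    handle(text);
    current = transition(current, 'query_finished');
    text = queue.dequeue();
  }
  return current;
}
```

Keeping the transitions in a single table makes illegal moves (for example, starting a query while shutting down) fail loudly instead of being silently ignored. The real implementation is asynchronous; this sketch is synchronous only to stay short.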

Note: Control plane components (ControlContext, ControlDispatcher, ControlService, and controllers) are SDK-relevant and under construction. See the Control Plane Architecture section above for details.

2. JSON Output Adapters (nonInteractive/io/):

  • BaseJsonOutputAdapter (BaseJsonOutputAdapter.ts): Core functionality for message serialization, block management, and state tracking (1,172 lines)

    • Shared logic for message building
    • State management for main agent and subagent messages
    • Permission denial tracking
    • Content block handling (text, thinking, tool_use)
  • JsonOutputAdapter (JsonOutputAdapter.ts): Standard JSON output for single-request responses

    • Collects all messages and emits them as a single JSON array at the end
    • Used with --output-format json
  • StreamJsonOutputAdapter (StreamJsonOutputAdapter.ts): Streaming JSON that emits messages incrementally

    • Emits messages immediately as they are completed
    • Optional partial message support via --include-partial-messages
    • Used with --output-format stream-json
  • StreamJsonInputReader (StreamJsonInputReader.ts): Reads JSON protocol messages from stdin in stream-json mode

    • Parses JSON Lines format
    • Validates message structure
    • Handles control and user messages
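
The difference between the two output adapters can be sketched as follows. This is a simplified illustration under stated assumptions: the class names `BaseAdapterSketch`, `BufferedJsonAdapterSketch`, and `StreamingJsonAdapterSketch`, the `OutputMessage` shape, and the `emit`/`finish` methods are all hypothetical and much smaller than the real adapters.

```typescript
// Hypothetical sketch of the buffered vs. streaming split; not the PR's real classes.
interface OutputMessage {
  type: string;
  [key: string]: unknown;
}

abstract class BaseAdapterSketch {
  protected readonly write: (chunk: string) => void;

  constructor(write: (chunk: string) => void) {
    this.write = write;
  }

  // Called once per completed message.
  abstract emit(message: OutputMessage): void;

  // Called once when the run ends.
  abstract finish(): void;
}

// Mirrors the idea behind --output-format json: buffer everything,
// then write one JSON array at the end.
class BufferedJsonAdapterSketch extends BaseAdapterSketch {
  private readonly messages: OutputMessage[] = [];

  emit(message: OutputMessage): void {
    this.messages.push(message);
  }

  finish(): void {
    this.write(JSON.stringify(this.messages) + '\n');
  }
}

// Mirrors the idea behind --output-format stream-json: write each message
// immediately as its own JSON line.
class StreamingJsonAdapterSketch extends BaseAdapterSketch {
  emit(message: OutputMessage): void {
    this.write(JSON.stringify(message) + '\n');
  }

  finish(): void {
    // Nothing is buffered, so there is nothing left to flush.
  }
}
```

Injecting the `write` callback (an assumption of this sketch) keeps both adapters testable without touching stdout; in a CLI it could simply wrap `process.stdout.write`.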

3. Protocol Types (nonInteractive/types.ts):

  • Comprehensive TypeScript types for all protocol messages
  • Control request/response types
  • Stream event types
  • Type guards and validation utilities
  • CLI message types (user, assistant, system, result, partial)
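
To illustrate what a type guard for these protocol messages might look like, here is a minimal sketch. The user message shape follows the example in the Reviewer Test Plan; the interface name `UserMessageSketch` and the helpers `isRecord` and `isUserMessage` are hypothetical and may not match what nonInteractive/types.ts exports.

```typescript
// Hypothetical sketch: shape taken from the stream-json input example in this PR,
// names invented for illustration.
interface UserMessageSketch {
  type: 'user';
  message: {
    role: 'user';
    content: string;
  };
}

// Narrows an unknown value to a plain (non-null, non-array) object.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Type guard: returns true only if the value has the full user message shape.
function isUserMessage(value: unknown): value is UserMessageSketch {
  if (!isRecord(value) || value['type'] !== 'user') {
    return false;
  }
  const message = value['message'];
  return (
    isRecord(message) &&
    message['role'] === 'user' &&
    typeof message['content'] === 'string'
  );
}
```

Because the guard accepts `unknown`, it can be applied directly to the result of `JSON.parse` before any field is accessed.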

4. Configuration Enhancements:

  • New CLI options: --input-format (text/stream-json), --include-partial-messages (boolean)
  • Enhanced mode detection with priority: explicit -i flag → JSON formats with query/prompt → TTY detection
  • Proper handling of input/output format combinations
  • Stream-json format configuration in packages/cli/src/config/config.ts
  • Configuration options in packages/core/src/config/config.ts
  • Stream-json output types in packages/core/src/output/types.ts
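
The priority order for mode detection could be expressed roughly as below. This is a sketch under assumptions: the `ModeInputs` field names and the `isInteractive` function are hypothetical, and the outcome of each step (JSON formats with a prompt meaning non-interactive, a TTY meaning interactive) is inferred from the description rather than confirmed by the code.

```typescript
// Hypothetical sketch of the priority-based detection; field names are invented.
interface ModeInputs {
  promptInteractive: boolean; // true when the explicit -i flag was passed
  outputFormat: 'text' | 'json' | 'stream-json';
  inputFormat: 'text' | 'stream-json';
  hasPrompt: boolean; // a query/prompt was supplied
  stdinIsTTY: boolean;
}

function isInteractive(inputs: ModeInputs): boolean {
  // 1. An explicit -i flag always wins.
  if (inputs.promptInteractive) {
    return true;
  }
  // 2. JSON formats combined with a query/prompt are assumed to run non-interactively.
  const usesJsonFormat =
    inputs.outputFormat !== 'text' || inputs.inputFormat === 'stream-json';
  if (usesJsonFormat && inputs.hasPrompt) {
    return false;
  }
  // 3. Otherwise fall back to TTY detection (assumed: a TTY means interactive).
  return inputs.stdinIsTTY;
}
```

Ordering the checks as early returns makes the precedence explicit: a later rule is only consulted when every earlier rule had nothing to say.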

5. Core Integration:

  • Updated coreToolScheduler to support permission denial tracking
  • Enhanced error handling for JSON output formats
  • Improved subagent tool call message handling
  • Telemetry integration updates

Key Features

  • Streaming JSON Protocol: Supports bidirectional JSON communication for SDK integration
  • Partial Messages: Optional inclusion of partial assistant messages during streaming
  • Permission Tracking: New CLIPermissionDenial interface for tracking denied tool calls
  • Multi-Turn Sessions: Session manager maintains conversation state across multiple turns
  • Type Safety: Comprehensive TypeScript types for all protocol messages and adapters

Testing

Extensive test coverage added:

  • BaseJsonOutputAdapter.test.ts
  • StreamJsonOutputAdapter.test.ts
  • JsonOutputAdapter.test.ts
  • ControlDispatcher.test.ts
  • session.test.ts
  • StreamJsonInputReader.test.ts
  • Plus updates to existing test suites

Reviewer Test Plan

Basic Functionality Tests

  1. Standard JSON Output:

    echo "Hello, how are you?" | qwen --output-format json

    Verify: Output is valid JSON with complete assistant message

  2. Stream JSON Output:

    echo "Write a short story" | qwen --output-format stream-json

    Verify: Multiple JSON lines emitted, each line is valid JSON

  3. Stream JSON with Partial Messages:

    echo "Count to 10" | qwen --output-format stream-json --include-partial-messages

    Verify: Partial messages are included in the stream

  4. Stream JSON Input Mode:

    # Pipe a single user message as one JSON line on stdin
    echo '{"type":"user","message":{"role":"user","content":"Hello"}}' | qwen --input-format stream-json --output-format stream-json

    Verify: CLI reads and processes JSON input correctly
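
The "each line is valid JSON" checks above can be automated with a small helper such as the one below. It is a sketch, not part of this PR: `checkJsonLines` is a hypothetical name, and it assumes the captured CLI output is already available as a string.

```typescript
// Hypothetical helper for reviewers: reports which lines of captured
// stream-json output fail to parse as JSON.
interface JsonLinesReport {
  total: number; // number of non-empty lines inspected
  invalidLines: number[]; // 1-based line numbers that are not valid JSON
}

function checkJsonLines(output: string): JsonLinesReport {
  const invalidLines: number[] = [];
  let total = 0;
  output.split('\n').forEach((line, index) => {
    // Skip blank lines such as the trailing newline at the end of the output.
    if (line.trim() === '') {
      return;
    }
    total += 1;
    try {
      JSON.parse(line);
    } catch {
      invalidLines.push(index + 1);
    }
  });
  return { total, invalidLines };
}
```

A run passes the check when `invalidLines` is empty and `total` is greater than one for the streaming cases.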

Advanced Scenarios

  1. Multi-Turn Session:

    • Use stream-json mode with multiple user messages
    • Verify session state is maintained correctly
    • Check that tool calls work across turns
  2. Permission Denial Tracking:

    • Test tool calls that require permission
    • Verify permission denials are tracked and reported in JSON output
    • Check error handling for denied tools
  3. Error Handling:

    • Test invalid JSON input
    • Test malformed protocol messages
    • Verify error messages are properly formatted in JSON output
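
The two input failure modes listed under Error Handling (invalid JSON versus a well-formed JSON value that is not a valid protocol message) can be told apart as in this sketch. The `ParseResult` type, the `parseInputLine` function, and the error object shape are hypothetical; the actual StreamJsonInputReader may report errors differently.

```typescript
// Hypothetical sketch: distinguishes "not JSON at all" from "JSON, but not a
// protocol message". The error shape is invented for illustration.
type ParseResult =
  | { ok: true; type: string; message: Record<string, unknown> }
  | {
      ok: false;
      error: {
        type: 'error';
        reason: 'invalid_json' | 'malformed_message';
        detail: string;
      };
    };

function malformed(detail: string): ParseResult {
  return { ok: false, error: { type: 'error', reason: 'malformed_message', detail } };
}

function parseInputLine(line: string): ParseResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    return { ok: false, error: { type: 'error', reason: 'invalid_json', detail } };
  }
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    return malformed('expected a JSON object');
  }
  const record = parsed as Record<string, unknown>;
  const type = record['type'];
  if (typeof type !== 'string') {
    return malformed('missing string "type" field');
  }
  return { ok: true, type, message: record };
}
```

Returning a result object instead of throwing lets the caller serialize the error as JSON and keep reading subsequent lines.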

Integration Tests

Run the existing test suite:

npm test

Focus on:

  • packages/cli/src/nonInteractive/io/*.test.ts: JSON adapter tests
  • packages/cli/src/nonInteractive/session.test.ts: Session management tests
  • packages/cli/src/nonInteractive/control/ControlDispatcher.test.ts: Control dispatcher tests
  • packages/cli/src/nonInteractiveCli.test.ts: Enhanced test coverage
  • packages/core/src/core/coreToolScheduler.test.ts: Permission tracking tests

Key Areas to Review

New Architecture Implementation:

  • nonInteractive/io/BaseJsonOutputAdapter.ts: Core message serialization and state management
  • nonInteractive/io/JsonOutputAdapter.ts: Standard JSON output implementation
  • nonInteractive/io/StreamJsonOutputAdapter.ts: Streaming JSON output implementation
  • nonInteractive/io/StreamJsonInputReader.ts: JSON input parsing
  • nonInteractive/session.ts: Session state machine and message routing
  • nonInteractive/types.ts: Protocol message types

Note: Control plane components (nonInteractive/control/) are SDK-relevant and under construction. Focus review on JSON output adapters and session management.

Integration Points:

  • Entry point: nonInteractiveCli.ts with format detection
  • Integration with existing CLI flow (gemini.tsx)
  • Configuration handling for input/output formats

Testing:

  • Comprehensive unit tests for all adapters
  • Session management tests
  • Integration tests for format combinations

Testing Matrix

            🍏    🪟    🐧
npm run
npx
Docker
Podman            -     -
Seatbelt          -     -

Note: Primary testing done on macOS with npm run and npx.

Linked issues / bugs

This PR implements the Stream JSON migration and non-interactive framework refactoring originally proposed in #810.

Development Context:

Phase 0 (commits task1-2-2 through task1-2-5): Initial stream-json implementation

  • Stream-json format support in packages/cli/src/streamJson/
  • StreamJsonWriter and controller
  • Session management and control requests

Phase 1 (this PR): Non-Interactive Framework Architecture

  • New nonInteractive/ directory with comprehensive framework
  • JSON adapter architecture (BaseJsonOutputAdapter, JsonOutputAdapter, StreamJsonOutputAdapter)
  • Session management with state machine
  • Control plane infrastructure (under construction, SDK-relevant)
  • Protocol type definitions in nonInteractive/types.ts
  • Extensive test coverage
  • All known issues resolved

Future Phases:

  • SDK development (moved to separate branch)
  • Documentation and examples
  • Behavior comparison documentation

feat: enhance build process and update .gitignore for Python caches
feat: add support for stream-json format and includePartialMessages flag in CLI arguments
feat: add StreamJsonWriter and associated types for structured JSON streaming
feat: implement stream-json session handling and control requests
Implement control request handling and refactor related functions

- Added `handleIncomingControlRequest` method to `StreamJsonController` for processing control requests.
- Created `input.test.ts` and `session.test.ts` to test control request handling.
- Refactored `runStreamJsonSession` to delegate control requests to the controller.
- Moved `extractUserMessageText` and `writeStreamJsonEnvelope` to a new `io.ts` file for better organization.
- Updated tests to ensure proper functionality of control responses and message extraction.
Add user envelope handling in runNonInteractive function
Add tests for runStreamJsonSession and enhance session handling

- Implement tests for runStreamJsonSession to validate user prompts and message handling.
- Improve session termination logic to ensure all active runs are awaited.
- Log user prompts with additional metadata for better tracking.
chore: update .gitignore to remove Python cache entries
@x22x22 x22x22 force-pushed the feature/stream-json-migration branch from ea0fb43 to 567b73e Compare October 30, 2025 10:03
@Mingholy Mingholy changed the title Feature/stream json migration Headless enhancement: add stream-json as input-format/output-format to support programmatically use Oct 30, 2025
@Mingholy Mingholy marked this pull request as draft October 30, 2025 11:05
@Mingholy Mingholy marked this pull request as ready for review November 6, 2025 08:12
@Mingholy Mingholy requested a review from tanzhenxin November 6, 2025 09:36