@jrepp
Owner

@jrepp jrepp commented Nov 20, 2025

Why

The prism-admin gRPC control plane handles Raft consensus, state management, and cluster coordination. The browser-based administration UI has different concerns: user interactions, HTML rendering, and session management. Separating the two allows:

  • Independent scaling: Control plane scales based on cluster size; web console scales based on admin users
  • Clean architecture: No UI code polluting the control plane
  • Technology fit: Web console can use htmx/templ without affecting core gRPC APIs

How

Created cmd/prism-web-console/ as a standalone Go HTTP server using Gin that:

  • Connects to prism-admin via gRPC for backend operations
  • Serves web UI for namespace management, session monitoring, backend health
  • Integrates with prism-launcher for lifecycle management via --admin flag
  • Starts as part of the prismctl local start workflow

What Changed

  • cmd/prism-web-console/ - New Go web console with Gin HTTP server, handlers, middleware, static assets
  • cmd/prismctl/cmd/local.go - Added prism-web-console to local stack startup
  • pkg/launcher/local_stack.go - Launcher integration for web console
  • docs-cms/adr/adr-060-web-console-separation.md - Architecture decision documenting the separation
  • Taskfile.yml - Build target for prism-web-console
  • docusaurus/docs/changelog.md - Updated changelog

Test Plan

  • Build succeeds: task prism-web-console
  • Local stack starts with web console: prismctl local start
  • Web console health check responds: curl localhost:8000/health
  • Documentation validates: uv run tooling/validate_docs.py

Co-Authored-By: Claude [email protected]

jrepp and others added 11 commits October 18, 2025 17:19
User request: "let's work on backend modeling for the patterns, we want to create a flat set of backends that have unique names and are known to the admin and shared to everyone in the control plane, it will be necessary for the pattern runners to have access to the backend configuration on startup so they can map their slot implementation details to specific backend configurations"

Created comprehensive RFC for centralized backend configuration management:

## Key Design Decisions

**Flat Backend Registry**:
- All backends have globally unique names (e.g., kafka-prod, postgres-primary)
- Shared across all namespaces and patterns
- Eliminates config duplication and enables central management

**Admin-Managed with Raft**:
- Backends stored in admin FSM state (replicated via Raft)
- New admin commands: REGISTER_BACKEND, UPDATE_BACKEND, DELETE_BACKEND
- Synced to local storage (SQLite/PostgreSQL) on each admin node

**Pattern Slot Binding**:
- Patterns declare slot_bindings: {registry: "postgres-primary", messaging: "kafka-prod"}
- Pattern runners fetch backend configs from admin at startup
- SlotBinder utility creates type-specific slot implementations

**Type-Specific Configs**:
- BackendType enum: KAFKA, NATS, POSTGRES, REDIS, SQLITE, S3, MEMSTORE
- Structured configs per type (KafkaConfig, PostgresConfig, etc.)
- Credentials, connection pooling, timeouts all captured

## Example Flow

1. Operator registers backend:
   `prism-admin backend register kafka-prod --brokers kafka:9092`

2. Pattern references backend:
   ```yaml
   namespace: order-processing
   pattern: multicast-registry
   slot_bindings:
     registry: postgres-primary
     messaging: kafka-prod
   ```

3. Pattern runner binds slots:
   - Fetches kafka-prod config from admin
   - Creates KafkaMessagingSlot with connection details
   - Connects to Kafka and starts processing
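
The flow above can be sketched in Go as a flat, name-keyed registry plus a slot-binding lookup. This is only an illustration under stated assumptions: `Backend`, `Registry`, `Register`, and `BindSlots` are hypothetical names, the single `Address` field stands in for the RFC's type-specific configs, and `postgres:5432` is an invented address. Raft replication and the admin RPCs are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a hypothetical, simplified registry entry. The RFC describes
// structured per-type configs (KafkaConfig, PostgresConfig, ...); a single
// Address field stands in for them here.
type Backend struct {
	Name    string // globally unique, e.g. "kafka-prod"
	Type    string // e.g. "KAFKA", "POSTGRES"
	Address string // illustrative stand-in for the structured config
}

// Registry is a flat set of backends keyed by name and shared by all patterns.
type Registry struct {
	backends map[string]Backend
}

func NewRegistry() *Registry {
	return &Registry{backends: make(map[string]Backend)}
}

// Register rejects empty and duplicate names so backend names stay unique.
func (r *Registry) Register(b Backend) error {
	if b.Name == "" {
		return errors.New("backend name must not be empty")
	}
	if _, exists := r.backends[b.Name]; exists {
		return fmt.Errorf("backend %q already registered", b.Name)
	}
	r.backends[b.Name] = b
	return nil
}

// BindSlots resolves slot_bindings (slot name -> backend name) into backend
// entries and fails if any referenced backend is unknown.
func (r *Registry) BindSlots(bindings map[string]string) (map[string]Backend, error) {
	bound := make(map[string]Backend, len(bindings))
	for slot, name := range bindings {
		b, ok := r.backends[name]
		if !ok {
			return nil, fmt.Errorf("slot %q: unknown backend %q", slot, name)
		}
		bound[slot] = b
	}
	return bound, nil
}

func main() {
	reg := NewRegistry()
	_ = reg.Register(Backend{Name: "kafka-prod", Type: "KAFKA", Address: "kafka:9092"})
	_ = reg.Register(Backend{Name: "postgres-primary", Type: "POSTGRES", Address: "postgres:5432"})

	bound, err := reg.BindSlots(map[string]string{
		"registry":  "postgres-primary",
		"messaging": "kafka-prod",
	})
	if err != nil {
		fmt.Println("bind failed:", err)
		return
	}
	fmt.Println("messaging slot uses", bound["messaging"].Address)
}
```

Keeping the lookup keyed only by name is what lets one backend entry serve many patterns: a pattern config carries names, never connection details.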

## Benefits

- **DRY**: One backend config used by multiple patterns
- **Centralized ops**: Change Kafka URL once, all patterns update
- **Separation of concerns**: Pattern authors don't need connection details
- **Type safety**: Structured configs with validation
- **Observability**: Admin knows which patterns use which backends

## Implementation Plan

6-phase rollout over 4 weeks:
1. Protobuf definitions
2. Admin FSM integration
3. Admin API implementation
4. Pattern runner integration
5. Testing
6. Documentation

## Open Questions

- Secret management (proposed: integrate Vault/K8s Secrets)
- Backend versioning and hot-reload (proposed: require restart initially)
- Multi-region backends (proposed: separate entries per region)
- Health monitoring (proposed: Phase 2 feature)

Builds on RFC-014 (layered patterns), RFC-017 (multicast registry slots),
RFC-035 (pattern launcher), RFC-038 (admin raft), and MEMO-004 (backend guide).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
…ration

User request: "pull in slot configuration of pattern implementations - we also want to define a type of config similar to backend which is a frontend, a frontend is a type of interface binding on the proxy, the default is the grpc pattern interface, this happens by default but can be disabled, additional interfaces can be added based on a front end definition - we should use the openapi semantics so that if we define a rest based front end interface that maps for example the mcp rest interface we can then bind it to specific pattern interfaces with some route config that can be consumed by the proxy to map rest interfaces to patterns on the backend, a concrete example is to expose a registry pattern as a confluent schema registry api"

Major additions to RFC-039:

1. Frontend Interface Binding Model (parallel to Backend):
   - Frontend resource with globally unique names
   - FrontendType enum: REST, GraphQL, gRPC-Web, SSE, WebSocket
   - Type-specific configs (RestConfig, GraphQLConfig, etc.)
   - RouteMapping for OpenAPI-style REST → gRPC mapping
   - ParamMapping: path/query/header/body → protobuf field mapping
   - ResponseMapping: protobuf → HTTP response transformation

2. Admin State Integration:
   - FrontendEntry in AdminState (Raft-replicated)
   - Frontend management commands (Register, Update, Delete)
   - Frontend management RPCs in ControlPlane service
   - Storage sync to persist frontends

3. Concrete Example: Registry Pattern as Confluent Schema Registry API:
   - Complete route mappings for Confluent REST API
   - POST /subjects/{subject}/versions → RegisterSchema gRPC
   - GET /subjects/{subject}/versions/{version} → GetSchema gRPC
   - POST /compatibility/... → CheckCompatibility gRPC
   - DELETE /subjects/{subject}/versions/{version} → DeleteSchema gRPC
   - Full sequence diagram showing HTTP → gRPC translation
   - Python client example using Confluent SDK with Prism backend
   - Benefits: API compatibility, backend flexibility, protocol translation

4. Pattern Slot Schema Integration (MEMO-006):
   - Slot definitions with required/optional interfaces
   - Runtime validation: backend must implement required interfaces
   - Backend capability metadata (keyvalue_basic, pubsub_basic, etc.)
   - SlotBinder validates interface requirements at pattern startup
   - 45 thin interfaces across 10 data models (per MEMO-006)

5. Namespace Configuration Extensions:
   - FrontendBinding message for namespace opt-in
   - Default gRPC interface (can be disabled)
   - Multiple frontends per namespace
   - Namespace-specific overrides

6. Expanded Implementation Plan (8 phases, 5 weeks):
   - Phase 1: Protobuf definitions for both backend and frontend
   - Phase 2: Admin FSM integration for both registries
   - Phase 3: Admin API implementation for both
   - Phase 4: Pattern runner slot binding with schema validation
   - Phase 5: Proxy frontend integration (REST adapter, route matching)
   - Phase 6: Confluent Schema Registry concrete example
   - Phase 7: Comprehensive testing (backend + frontend)
   - Phase 8: Documentation for operators
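
The runtime validation described in item 4 (a backend must implement every interface a slot requires) could look roughly like the following. The capability names `keyvalue_basic` and `pubsub_basic` come from the text above; `SlotSchema` and `ValidateSlot` are hypothetical names chosen for this sketch, not the RFC's actual types.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// SlotSchema is a hypothetical description of what a pattern slot needs.
type SlotSchema struct {
	Name     string
	Required []string // interfaces the backend must implement
	Optional []string // interfaces the pattern can use if present
}

// ValidateSlot returns an error naming every required interface that is
// absent from the backend's advertised capabilities. Optional interfaces
// never cause a failure.
func ValidateSlot(slot SlotSchema, capabilities []string) error {
	have := make(map[string]bool, len(capabilities))
	for _, c := range capabilities {
		have[c] = true
	}
	var missing []string
	for _, req := range slot.Required {
		if !have[req] {
			missing = append(missing, req)
		}
	}
	if len(missing) == 0 {
		return nil
	}
	sort.Strings(missing) // deterministic error text
	return fmt.Errorf("slot %q: backend is missing required interfaces: %s",
		slot.Name, strings.Join(missing, ", "))
}

func main() {
	messaging := SlotSchema{Name: "messaging", Required: []string{"pubsub_basic"}}

	// A backend that only offers key-value operations cannot fill this slot.
	fmt.Println(ValidateSlot(messaging, []string{"keyvalue_basic"}))

	// A backend advertising pubsub_basic passes.
	fmt.Println(ValidateSlot(messaging, []string{"pubsub_basic", "keyvalue_basic"}))
}
```

Running the check at pattern startup, as the commit describes, turns a misconfigured binding into an immediate error rather than a failure on the first request.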

Key design principles:
- Parallel architecture: frontends mirror backends (same admin management)
- OpenAPI semantics for route mapping (not full codegen)
- Protocol translation at proxy layer (HTTP → gRPC)
- Centralized admin management for both registries
- Default gRPC + optional additional interfaces
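
To make the OpenAPI-style route mapping concrete, here is a small Go sketch of how a proxy might match an incoming HTTP request against path templates and extract parameters before calling the mapped gRPC method. The three routes are taken from the Confluent example above; the `Route` type and the `matchTemplate` and `Resolve` helpers are assumptions made for illustration, and the actual gRPC call is omitted.

```go
package main

import (
	"fmt"
	"strings"
)

// Route maps an HTTP method and path template to a gRPC method name.
type Route struct {
	Method   string
	Template string // e.g. "/subjects/{subject}/versions/{version}"
	RPC      string
}

// confluentRoutes lists the mappings named in the commit message.
var confluentRoutes = []Route{
	{"POST", "/subjects/{subject}/versions", "RegisterSchema"},
	{"GET", "/subjects/{subject}/versions/{version}", "GetSchema"},
	{"DELETE", "/subjects/{subject}/versions/{version}", "DeleteSchema"},
}

// matchTemplate compares path and template segment by segment and collects
// the values of {name} placeholders.
func matchTemplate(template, path string) (map[string]string, bool) {
	t := strings.Split(strings.Trim(template, "/"), "/")
	p := strings.Split(strings.Trim(path, "/"), "/")
	if len(t) != len(p) {
		return nil, false
	}
	params := make(map[string]string)
	for i, seg := range t {
		if strings.HasPrefix(seg, "{") && strings.HasSuffix(seg, "}") {
			if p[i] == "" {
				return nil, false
			}
			params[seg[1:len(seg)-1]] = p[i]
			continue
		}
		if seg != p[i] {
			return nil, false
		}
	}
	return params, true
}

// Resolve returns the RPC name and path parameters of the first matching route.
func Resolve(routes []Route, method, path string) (string, map[string]string, bool) {
	for _, r := range routes {
		if r.Method != method {
			continue
		}
		if params, ok := matchTemplate(r.Template, path); ok {
			return r.RPC, params, true
		}
	}
	return "", nil, false
}

func main() {
	rpc, params, ok := Resolve(confluentRoutes, "GET", "/subjects/orders-value/versions/3")
	fmt.Println(rpc, params["subject"], params["version"], ok)
}
```

In a fuller version the extracted parameters would feed the ParamMapping step, which copies path, query, header, and body values into protobuf fields.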

References added:
- RFC-020 (HTTP adapter pattern)
- RFC-032 (Confluent API compatibility)
- MEMO-006 (interface decomposition, slot schemas)
- Confluent Schema Registry API documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
User request: "the PR is waiting for status check but it's not available for this change, for document only changes can we run the document verification, lint, build and update the CI status?"

Problem: CI workflow has paths-ignore for docs-cms/** and *.md files, so documentation-only PRs don't trigger any status checks, leaving PRs without validation.

Solution: Created dedicated docs-pr.yml workflow that:
- Triggers on PRs with documentation changes only
- Runs uv run tooling/validate_docs.py (validates frontmatter, links, MDX)
- Runs uv run tooling/build_docs.py (builds Docusaurus site)
- Provides docs-status check for PR merge requirements
- Uses concurrency groups to cancel stale runs

Benefits:
- Documentation PRs now get status checks
- Validates MDX compilation before merge
- Catches broken links and invalid frontmatter
- Prevents GitHub Pages build failures
- Independent from main CI workflow (doesn't run code tests for doc changes)

Workflow triggers on:
- docs-cms/** (ADRs, RFCs, MEMOs)
- docusaurus/** (Docusaurus config)
- **/*.md (all markdown files)
- tooling/validate_docs.py, tooling/build_docs.py
- .github/workflows/docs-pr.yml (self-test)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
User request: "create a new feature branch to create a rfc that defines client sdks the sdks will target having full integration test coverage and a shared directory structure. there will be 3 sdks rust, python and go to start - we will use best practices for async client apis and use the grpc interfaces directly - we want to expose the pattern interfaces as directly usable apis. to start a client we should use oauth as the default auth method - clients will need to support namespace configuration within the limited set of configuration options available for each pattern - start with producer, consumer and key-value patterns"

Created comprehensive RFC-040 defining client SDK architecture for Rust, Python, and Go:

**Architecture**:
- Pattern-centric APIs: Producer, Consumer, KeyValue as first-class APIs
- Async-first design: tokio (Rust), asyncio (Python), goroutines (Go)
- Direct gRPC communication with Prism proxy for maximum performance
- OAuth2 client credentials flow as default authentication
- Namespace-aware configuration with per-pattern options

**Pattern APIs**:
- Producer: publish(), publish_batch(), flush()
- Consumer: subscribe() with streaming API, ack(), nack()
- KeyValue: get(), set(), delete(), exists()

**Testing Strategy**:
- Full integration test coverage using testcontainers
- Real Prism proxy + backends (Redis, Kafka, NATS) for tests
- Target coverage: Producer/Consumer 85%, OAuth2 90%
- Performance benchmarks: >10k msg/sec producer, <1ms KeyValue p99

**Shared Directory Structure**:
- Consistent layout across all three languages
- src/patterns/ for pattern implementations
- src/auth/ for OAuth2 client
- tests/{unit,integration,e2e}/ for test suites
- examples/ for usage examples

**Configuration**:
- Unified YAML format across all SDKs
- OAuth2 with token caching and automatic refresh
- Per-namespace pattern-specific options
- Built-in observability (Prometheus, OpenTelemetry, structured logging)

**Implementation Roadmap**:
- Phase 1 (Week 1-2): Protobuf code gen, Client factory, OAuth2
- Phase 2 (Week 3-4): Producer, Consumer, KeyValue implementations
- Phase 3 (Week 5-6): Integration tests with testcontainers
- Phase 4 (Week 7-8): Observability, documentation, benchmarks

Updated changelog with RFC-040 summary and key features.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
User request: "update the pr creation command to keep the pr description short, readable and focused on quickly answering why, how and what for reviewers"

Updated .claude/commands/submit-pr.md to enforce concise PR descriptions:

**New Structure** (replaces verbose multi-section format):
- Why: 1-2 sentences on problem/value
- How: 2-4 bullets on implementation approach
- What Changed: 2-4 bullets on measurable impact
- Testing: Simple checklist
- Target: 10-15 lines total (max 20)

**Key Changes**:
- Remove file paths (reviewers see diffs)
- Remove implementation details (code review is for that)
- Remove excessive checklists (Breaking Changes, Dependencies)
- Focus on architecture/approach, not line-by-line changes
- Quantify impact when possible

**Writing Guidelines**:
- Why: Problem/value, not process
- How: Architecture only, not filenames
- What: Impact, not file changes

**Example Good PR** (12 lines):
```
## Why
RFC-040 requires client SDKs to reduce integration friction.

## How
- Define pattern-centric APIs
- OAuth2 auth flow
- Testcontainers for integration tests

## What Changed
- Add RFC-040 with 3-language SDK spec
- Define shared directory structure
- Specify 85% coverage targets

## Testing
- [x] Documentation validation passes
```

This replaces the previous 6-section format that encouraged verbose descriptions with file paths and implementation details.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
User request: "create a new branch to cleanup the output of the prism local start to be more clear about the prismctl instructions that can be used as well as the web console addresses from the prism admin"

Fixed misleading output that showed non-existent HTTP endpoints:
- Removed fake admin "Web UI" ports (8970-8972) - these are not implemented
- Removed fake proxy "Admin API" and "Metrics" - proxy only runs gRPC data plane
- Corrected to show only actual HTTP endpoints: admin metrics (9090-9092/metrics)

Added comprehensive quick-start command reference:
- Namespace management commands (create, list, get)
- Monitoring and debugging commands (status, logs, health)
- KeyValue and Mailbox operation examples
- Clear web endpoints section with actual URLs

Output now accurately reflects what's actually running and provides
copy-paste commands for common developer workflows.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
User request: "create a feature branch to separate the web admin console out of prism-admin into a separate go process `prism-web-console`"

Created `cmd/prism-web-console/` with foundational structure:
- Gin HTTP server with health check and placeholder routes
- Handlers for namespace CRUD, session monitoring, backend health
- Middleware stubs for OIDC authentication and request logging
- Static assets directory for CSS/JS (htmx integration planned)
- Auto-discovered by Makefile (builds via `make prism-web-console`)

Architecture decision (ADR-060):
- Separation of concerns: Control plane (prism-admin) handles Raft
  consensus, state management, cluster coordination. Web console handles
  UI rendering, user interactions, browser-based administration.
- Independent scaling: Control plane scales by cluster size. Web console
  scales by number of admin users.
- Technology stack: Following RFC-036's Go-based approach with Gin, templ,
  htmx, Tailwind CSS (vs ADR-028's Python/FastAPI alternative)
- Communication: Web console → prism-admin via gRPC (:8000 HTTP → :8981 gRPC)

Benefits:
- Clean architecture with clear separation
- Language consolidation (all admin tooling in Go)
- Type safety via templ compile-time validation
- 20-40x faster startup (<50ms vs 1-2s Python)
- Smaller containers (20-30MB vs 100-150MB)

Updated CHANGELOG with comprehensive entry documenting separation rationale,
implementation phases, and references to RFC-036 and RFC-038.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
User request: "update the shared changelog folding CHANGELOG into it the shared changelog is in the ./docusaurus directory"

Added comprehensive entry for Prism Kubernetes Operator unreleased changes:
- KEDA Integration: Full support for 60+ KEDA scalers (Kafka, RabbitMQ,
  NATS, SQS, Redis, PostgreSQL) with optional installer script
- Enhanced Status Tracking: Three-phase lifecycle (Pending → Progressing →
  Running) with replica counts, Kubernetes Conditions, ObservedGeneration
- Graceful Degradation: Operator functions without KEDA, INFO-level logging
  for missing dependencies, automatic HPA/KEDA switching
- Installation Improvements: Production-ready installer with verification,
  Docker Desktop TLS patch, Makefile targets
- Documentation: QUICK_START.md, KEDA_INSTALL_GUIDE.md, TEST_REPORT.md
- Architecture: Separated HPA/KEDA reconcilers, enhanced logging

Consolidated content from prism-operator/CHANGELOG.md [Unreleased] section
into docusaurus/docs/changelog.md (2025-10-22 date) while maintaining link
to operator-specific CHANGELOG for detailed tracking.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
User request: "add prism-web-console to ./prismctl local start and print its address to the command line as the last thing as a clickable link, validate that it is connecting to the admin processes and being managed by the prism-launcher"

Integrated prism-web-console into local development stack:

cmd/prism-web-console/main.go:
- Added Cobra CLI with --admin-endpoint, --port, --admin, --process-id flags
- Added launcher integration support (--admin=launcher://host:port)
- Enhanced startup banner showing admin endpoint and listen address
- Added structured logging with slog
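
One plausible way to interpret the `--admin=launcher://host:port` value is to parse it as a URL and check the scheme. The helper below is a sketch under that assumption: `parseLauncherAddr` is a hypothetical name, port 7070 is an invented example, and the real flag handling in main.go may differ.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseLauncherAddr extracts "host:port" from a value such as
// "launcher://localhost:7070". It rejects other schemes and values that
// lack a host or a port.
func parseLauncherAddr(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", raw, err)
	}
	if u.Scheme != "launcher" {
		return "", fmt.Errorf("unsupported scheme %q, want launcher", u.Scheme)
	}
	if u.Hostname() == "" || u.Port() == "" {
		return "", fmt.Errorf("%q must include host and port", raw)
	}
	return u.Host, nil
}

func main() {
	addr, err := parseLauncherAddr("launcher://localhost:7070")
	fmt.Println(addr, err)

	_, err = parseLauncherAddr("http://localhost:7070")
	fmt.Println(err)
}
```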

pkg/launcher/local_stack.go:
- Added WebConsolePort field to LocalStackConfig (default: 8000)
- Implemented startWebConsole() method (Phase 4 of startup)
- Launches web console with launcher callback integration
- Added clickable link output as final message using OSC-8 hyperlink
  (terminal protocol for clickable URLs)
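
For readers unfamiliar with OSC-8, the escape sequence wraps the visible text between an opening sequence that carries the URL and a closing sequence with an empty URL. A minimal Go helper, with `osc8Link` as a hypothetical name rather than the function used in local_stack.go, might be:

```go
package main

import "fmt"

// osc8Link wraps text in OSC 8 escape sequences so that terminals with
// hyperlink support render it as a clickable link. The opening sequence is
// ESC ] 8 ; ; URL ST and the closing sequence is ESC ] 8 ; ; ST, where ST
// (string terminator) is ESC \.
func osc8Link(url, text string) string {
	const st = "\x1b\\"
	return "\x1b]8;;" + url + st + text + "\x1b]8;;" + st
}

func main() {
	// Terminals without OSC 8 support generally ignore the sequences and
	// show only the text, so using the URL itself as the text degrades well.
	fmt.Println(osc8Link("http://localhost:8000", "http://localhost:8000"))
}
```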

cmd/prismctl/cmd/local.go:
- Updated local start documentation (10 startup phases)
- Added prism-web-console to stop sequence (web console → runners → launcher)
- Added prism-web-console to status checks
- Added web-console to logs command with "prism-web-console.log" mapping
- Updated help text to include web console in stack description

Validation completed:
✅ Web console starts in Phase 4 after pattern runners
✅ Connected to admin endpoint (localhost:8981)
✅ Managed by prism-launcher (PID tracking, lifecycle management)
✅ Health endpoint responds: {"service":"prism-web-console","status":"healthy"}
✅ Clickable link displayed: http://localhost:8000 (OSC-8 hyperlink)
✅ Included in prismctl local stop (graceful shutdown)
✅ Included in prismctl local status (process monitoring)

Startup output now ends with prominent web console section:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🌐 WEB ADMIN CONSOLE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  http://localhost:8000  (clickable in modern terminals)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
User request: "make sure prism-web-console is in the mix for task build"

Added prism-web-console to Taskfile.yml build system:
- Added to build-cmds dependencies for `task build`
- Created individual prism-web-console build task with change detection
- Added to dev task for debug builds

Verified integration:
- Successfully builds 18MB binary to build/binaries/prism-web-console
- Launches correctly with prismctl local start (Phase 4)
- Serves health endpoint (200 OK) and API endpoints
- Stops gracefully with prismctl local stop

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Copilot AI review requested due to automatic review settings November 20, 2025 22:09
Contributor

Copilot AI left a comment

Pull Request Overview

This PR implements the separation of the web console from the admin control plane by creating a new prism-web-console service as a standalone Go process. The implementation follows RFC-036's Go-based approach using Gin, with plans for templ templates and htmx integration.

Key Changes:

  • Created prism-web-console as a separate Go service with basic HTTP server, health check, and placeholder handlers
  • Integrated web console into local stack startup sequence as Phase 4
  • Updated documentation with ADR-060 and comprehensive changelog entries

Reviewed Changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pkg/launcher/local_stack.go | Added web console configuration, startup logic, and status display |
| docusaurus/docs/changelog.md | Documented web console separation and KEDA operator enhancements |
| docs-cms/adr/adr-060-web-console-separation.md | Architecture decision record for separating web console from control plane |
| cmd/prismctl/cmd/local.go | Updated local stack commands to include web console management |
| cmd/prism-web-console/static/css/styles.css | Basic CSS styling for web console UI |
| cmd/prism-web-console/middleware/logging.go | HTTP request logging middleware |
| cmd/prism-web-console/middleware/auth.go | OIDC authentication middleware stubs |
| cmd/prism-web-console/main.go | Main entry point with Gin server setup and configuration |
| cmd/prism-web-console/handlers/namespace.go | Placeholder namespace CRUD handlers |
| cmd/prism-web-console/handlers/health.go | Health check and dashboard handlers |
| cmd/prism-web-console/go.mod | Go module dependencies for web console |
| cmd/prism-web-console/README.md | Comprehensive documentation for web console |
| Taskfile.yml | Build task for prism-web-console |

@mergify mergify bot added the documentation, infrastructure, go, and size/m labels Nov 20, 2025
@mergify
Contributor

mergify bot commented Nov 20, 2025

This PR has merge conflicts with the base branch. Please resolve them.

@mergify
Contributor

mergify bot commented Dec 4, 2025

This PR has been inactive for 14 days. Please update it or close it if it's no longer needed.

@mergify mergify bot added stale and removed stale labels Dec 4, 2025
@mergify
Contributor

mergify bot commented Dec 18, 2025

This PR has been inactive for 14 days. Please update it or close it if it's no longer needed.

@mergify mergify bot added stale and removed stale labels Dec 18, 2025
@mergify
Contributor

mergify bot commented Jan 1, 2026

This PR has been inactive for 14 days. Please update it or close it if it's no longer needed.

@mergify mergify bot added stale and removed stale labels Jan 1, 2026
@jrepp jrepp changed the title from "Feature - Separate Web Console" to "Add prism-web-console as separate Go process with launcher integration" Jan 3, 2026
@mergify mergify bot added size/xs and removed size/m labels Jan 3, 2026
@jrepp
Owner Author

jrepp commented Jan 3, 2026

Addressing reviewer comments about future dates (2025-10-22):

The comments were created on 2025-11-20 when October 2025 was considered "future". Since it is now January 2026, these dates are valid historical dates. The changelog has also been restructured during the merge with main - web console separation content is now properly integrated.

These comment threads can be resolved as outdated.

@mergify mergify bot removed the has-conflicts label Jan 3, 2026
@jrepp jrepp added the enhancement label Jan 3, 2026
jrepp and others added 2 commits January 2, 2026 21:48
User request: "merge PR 191 into 190 and close 191 to include the RFC with the client implementation"

Co-Authored-By: Claude <[email protected]>
User request: "land-pr 190"

Co-Authored-By: Claude <[email protected]>
@mergify
Contributor

mergify bot commented Jan 3, 2026

The PR Status Check has failed. Please review the CI logs and fix any issues.

Common issues:

  • Test failures
  • Linting errors
  • Documentation validation failures

You can run checks locally:

task test-parallel-fast  # Run tests
task lint-parallel       # Run linters
uv run tooling/validate_docs.py  # Validate docs

@mergify
Contributor

mergify bot commented Jan 3, 2026

🧪 CI Insights

Here's what we observed from your CI run for bc0b92f.

❌ Job Failures

| Pipeline | Job | Health on main | Retries | 🔍 CI Insights | 📄 Logs |
| --- | --- | --- | --- | --- | --- |
| PR Status Check | PR Status Check | Unknown | 0 | View | View |

@mergify
Contributor

mergify bot commented Jan 3, 2026

The PR Status Check has failed. Please review the CI logs and fix any issues.

@mergify
Contributor

mergify bot commented Jan 5, 2026

The PR Status Check has failed. Please review the CI logs and fix any issues.

@mergify
Contributor

mergify bot commented Jan 6, 2026

The PR Status Check has failed. Please review the CI logs and fix any issues.

Labels

documentation, enhancement, go, infrastructure, size/xs


2 participants