Conversation
📝 Walkthrough

This PR introduces a complete QA Engineer Agent template featuring a graph-based workflow with four orchestrated nodes (planning, execution, UI testing, reporting) that automate test execution and report generation, supported by configuration files and MCP server definitions for tool integration.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User as User/Main
    participant Agent as QaEngineerAgent
    participant Runtime as GraphRuntime
    participant Planning as Planning Node
    participant Execution as Execution Node
    participant UiTest as UI Testing Subagent
    participant Reporting as Reporting Node
    participant LLM as LLM Provider
    User->>Agent: run(context)
    Agent->>Agent: start()
    Agent->>Runtime: create_agent_runtime()
    activate Runtime
    User->>Runtime: trigger(default entry point)
    Runtime->>Planning: execute
    activate Planning
    Planning->>LLM: analyze request, create test plan
    LLM-->>Planning: test plan
    Planning-->>Runtime: success
    deactivate Planning
    Runtime->>Execution: execute (on_success)
    activate Execution
    Execution->>LLM: run planned tests
    LLM-->>Execution: test results
    Execution->>UiTest: delegate UI testing (sub_agents)
    activate UiTest
    UiTest->>LLM: exploratory UI verification
    LLM-->>UiTest: UI findings
    UiTest-->>Execution: UI results
    deactivate UiTest
    Execution-->>Runtime: success
    deactivate Execution
    Runtime->>Reporting: execute (on_success)
    activate Reporting
    Reporting->>LLM: compile results + UI findings
    LLM-->>Reporting: final QA report
    Reporting-->>Runtime: success
    deactivate Reporting
    Runtime-->>Agent: ExecutionResult
    deactivate Runtime
    Agent->>Agent: stop()
    Agent-->>User: result
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks: ✅ 3 passed | ❌ 2 failed (1 warning, 1 inconclusive)
Actionable comments posted: 1
🧹 Nitpick comments (3)
examples/templates/qa_engineer_agent/agent.py (2)
176-182: Consider adding timeout configuration.

The `run` method calls `trigger_and_wait` without an explicit timeout. If the execution hangs, this could block indefinitely. Consider making the timeout configurable.

Suggested enhancement:

```diff
-    async def run(self, context: dict, mock_mode=False, session_state=None) -> ExecutionResult:
+    async def run(self, context: dict, mock_mode=False, session_state=None, timeout: float | None = None) -> ExecutionResult:
         await self.start(mock_mode=mock_mode)
         try:
-            result = await self._agent_runtime.trigger_and_wait("default", context, session_state=session_state)
+            result = await self._agent_runtime.trigger_and_wait("default", context, session_state=session_state, timeout=timeout)
             return result or ExecutionResult(success=False, error="Execution timeout")
         finally:
             await self.stop()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/templates/qa_engineer_agent/agent.py` around lines 176 - 182, Add a configurable timeout to the run method so trigger_and_wait cannot hang indefinitely: update the async def run(self, context: dict, mock_mode=False, session_state=None) -> ExecutionResult signature to accept a timeout parameter (e.g. timeout: Optional[float] = DEFAULT_TIMEOUT), validate or default it, then pass that timeout into self._agent_runtime.trigger_and_wait("default", context, session_state=session_state, timeout=timeout). Ensure you handle a timeout result or timeout exception by returning ExecutionResult(success=False, error="Execution timeout") and still call await self.stop() in the finally block; reference run, _agent_runtime.trigger_and_wait, ExecutionResult and stop when making the changes.
75-82: Consider Python-idiomatic comparison in condition expression.

The condition expression uses `needs_more_testing == True`. While this is a string that may be evaluated by the framework, the more Pythonic pattern would be `needs_more_testing is True` or simply `needs_more_testing`.

Suggested change:

```diff
 EdgeSpec(
     id="report-to-plan",
     source="reporting",
     target="planning",
     condition=EdgeCondition.CONDITIONAL,
-    condition_expr="needs_more_testing == True",
+    condition_expr="needs_more_testing",
     priority=1,
 ),
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/templates/qa_engineer_agent/agent.py` around lines 75 - 82, The condition expression for EdgeSpec with id "report-to-plan" is using the non-idiomatic "needs_more_testing == True"; update the EdgeSpec's condition_expr to a more Pythonic expression such as "needs_more_testing" (or "needs_more_testing is True" if explicit identity is required) so the framework evaluates it in a Pythonic way; locate the EdgeSpec instance (id="report-to-plan") and replace the condition_expr string accordingly.

examples/templates/qa_engineer_agent/__init__.py (1)
3-7: Use English for code comments.

The comments are in French but the rest of the codebase appears to use English. For consistency, consider translating:
- Line 3: "We import variables from agent.py and config.py so Hive can find them"
- Line 7: "We tell Python what is publicly exposed"
Suggested fix:

```diff
-# On importe les variables depuis agent.py et config.py pour que Hive les trouve
+# Import variables from agent.py and config.py for Hive discovery
 from .agent import default_agent, edges, goal, nodes
 from .config import metadata

-# On indique à Python ce qui est exposé publiquement
+# Define public API exports
 __all__ = [
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/templates/qa_engineer_agent/__init__.py` around lines 3 - 7, Translate the French comments in __init__.py to English to match the codebase: change the comment above the imports to something like "Import variables from agent.py and config.py so Hive can find them" and change the comment below to "Declare what is publicly exposed to Python"; keep the existing imports (default_agent, edges, goal, nodes) and metadata unchanged so behavior is identical.
ℹ️ Review info

⚙️ Run configuration
- Configuration used: defaults
- Review profile: CHILL
- Plan: Pro
- Run ID: 05a11ce4-4dcd-40a6-98dd-ba2285d778e5
📒 Files selected for processing (7)

- examples/templates/qa_engineer_agent/__init__.py
- examples/templates/qa_engineer_agent/__main__.py
- examples/templates/qa_engineer_agent/agent.py
- examples/templates/qa_engineer_agent/config.py
- examples/templates/qa_engineer_agent/flowchart.json
- examples/templates/qa_engineer_agent/mcp_servers.json
- examples/templates/qa_engineer_agent/nodes/__init__.py
```json
"browser": {
  "command": "python",
  "args": ["-m", "gcu", "serve", "browser"]
}
```
🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Check gcu module structure and available entry points
fd -t f '__main__.py' --full-path '.*gcu.*' --exec cat {}
# Check if gcu has a 'serve' subcommand or CLI entry point
rg -n "serve|browser" --type py -g '*gcu*'
```

Repository: aden-hive/hive
Length of output: 10702
Fix incorrect GCU MCP server invocation in browser configuration.
The browser server uses an incorrect invocation pattern. The correct pattern, as documented in core/framework/graph/gcu.py and core/framework/agents/queen/reference/gcu_guide.md, is:
"args": ["run", "python", "-m", "gcu.server", "--stdio"]
Replace ["-m", "gcu", "serve", "browser"] with the canonical invocation above to ensure proper MCP server communication.
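Applied to the file, the corrected entry would look like the sketch below. Note that the `command` value is an assumption: the documented `["run", "python", ...]` args imply a runner such as `uv run`, but the actual command should be verified against `core/framework/agents/queen/reference/gcu_guide.md`.

```json
{
  "browser": {
    "command": "uv",
    "args": ["run", "python", "-m", "gcu.server", "--stdio"]
  }
}
```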
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/templates/qa_engineer_agent/mcp_servers.json` around lines 11 - 14,
The browser MCP server config uses the wrong invocation in the "browser" object:
replace the args array currently set to ["-m", "gcu", "serve", "browser"] with
the canonical MCP server invocation described in the docs so the "browser"
command runs the gcu.server in stdio mode; specifically update the "browser"
object's args to the documented invocation (use the "run" style invocation that
launches python -m gcu.server with --stdio) so MCP communication is correct.
Description
Added a new `qa_engineer_agent` template to the framework. This advanced agent is designed to handle full-project Quality Assurance by executing automated test suites (via CLI) and performing exploratory UI testing (via a GCU browser sub-agent). It orchestrates the testing process, analyzes stdout/stderr logs, evaluates UI states, and generates a comprehensive QA report.

Type of Change
Related Issues
Fixes #6732
Changes Made
- Created the `qa_engineer_agent` package under `examples/templates/`.
- `agent.py` defining the graph workflow and success criteria for QA testing.
- `nodes/__init__.py` with 4 distinct nodes:
  - `planning` (event_loop): to formulate the testing strategy.
  - `test_execution` (event_loop): to execute automated tests using `execute_command_tool` and delegate UI tasks.
  - `ui_testing` (gcu): a sub-agent leveraging Playwright tools for exploratory visual testing.
  - `reporting` (event_loop): to compile results and interact with the user.
- `mcp_servers.json` to load the bash executor and GCU browser tools.
- `flowchart.json` to accurately visualize the agent's graph and available tools in the Hive UI.
- `__main__.py` to provide a CLI entry point for testing the agent directly.

Testing
Describe the tests you ran to verify your changes:
- Ran `python -m examples.templates.qa_engineer_agent` and verified the graph loads in the Hive Dashboard.

Checklist
Summary by CodeRabbit
Release Notes