Improve Cross-Repository Evaluation Workflow Orchestration #1249

@simonrosenberg

Description

Improve Cross-Repository Workflow Orchestration: Replace Dispatch/Poll with Native workflow_call

🎯 Goal

Replace the current inefficient "dispatch + poll" pattern between repositories with GitHub Actions' native workflow_call mechanism to eliminate polling overhead, reduce race conditions, and create explicit dependency chains.

📋 Current Problems

1. Inefficient Polling (up to 80 minutes of overhead)

  • run-eval.yml polls the benchmarks build for up to 80 attempts × 60 s = 80 minutes
  • Each polling cycle makes 2+ calls to the GitHub API
  • A full 80-cycle poll therefore issues 160+ unnecessary API calls per evaluation run (see the sketch after this list)
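
For concreteness, here is a hedged sketch of what this pattern typically looks like. The step names, org/repo paths, and workflow file name are placeholders, not the actual run-eval.yml contents:

```yaml
# Sketch of the current dispatch + poll pattern (illustrative only).
- name: Dispatch benchmarks build
  run: gh workflow run build.yml --repo example-org/benchmarks --ref main
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

- name: Poll for completion (up to 80 × 60 s)
  run: |
    for i in $(seq 1 80); do
      # Two API calls per cycle: one to find the run, one to read its status.
      # Note the race: --limit 1 grabs the *latest* run, which may belong to
      # a concurrent evaluation.
      RUN_ID=$(gh run list --repo example-org/benchmarks --workflow build.yml \
        --limit 1 --json databaseId --jq '.[0].databaseId')
      STATUS=$(gh run view "$RUN_ID" --repo example-org/benchmarks \
        --json status --jq '.status')
      [ "$STATUS" = "completed" ] && exit 0
      sleep 60
    done
    exit 1  # timed out after 80 minutes
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```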

2. Race Conditions & Reliability Issues

  • Multiple concurrent runs can interfere with each other's polling logic
  • Polling relies on timestamp matching, which is unreliable
  • No guaranteed way to identify the correct workflow run when multiple exist

3. No Direct Data Flow

  • Built image tags cannot be passed directly from benchmarks to evaluation
  • The evaluation workflow has to infer which images to use
  • No explicit dependency management (contrast with the workflow_call outputs sketch below)
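
For contrast, workflow_call lets the called workflow declare typed inputs and outputs. A minimal sketch of a reusable workflow in the benchmarks repo that returns the built image tag; the file name, input, and output names are assumptions, not the repo's actual workflow:

```yaml
# .github/workflows/build-benchmarks.yml in the benchmarks repo (sketch).
name: Build benchmark images
on:
  workflow_call:
    inputs:
      sdk-ref:
        description: software-agent-sdk ref to build against
        required: true
        type: string
    outputs:
      image-tag:
        description: Tag of the built benchmark image
        value: ${{ jobs.build.outputs.image-tag }}

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      image-tag: ${{ steps.build.outputs.image-tag }}
    steps:
      - uses: actions/checkout@v4
      - id: build
        run: |
          TAG="benchmarks:${GITHUB_SHA::7}"
          # ... build and push the image here ...
          echo "image-tag=$TAG" >> "$GITHUB_OUTPUT"
```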

4. Complex Error Handling

  • API-based dispatch makes error detection and handling difficult
  • Timeouts are hard to debug
  • No clear failure propagation between workflows

🏗️ Proposed Solution

Convert from the asynchronous "dispatch + poll" pattern to the synchronous workflow_call pattern:

Current: software-agent-sdk →[dispatch]→ benchmarks →[poll]← software-agent-sdk →[dispatch]→ evaluation
Proposed: software-agent-sdk →[workflow_call]→ benchmarks →[outputs]→ software-agent-sdk →[workflow_call]→ evaluation
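
On the caller side, run-eval.yml in software-agent-sdk would invoke both reusable workflows as jobs. A sketch, assuming placeholder org/repo paths and the output name from the sketch above:

```yaml
# run-eval.yml caller side (fragment; org name, workflow paths, and refs
# are placeholders).
jobs:
  build-benchmarks:
    uses: example-org/benchmarks/.github/workflows/build-benchmarks.yml@main
    with:
      sdk-ref: ${{ github.sha }}
    secrets: inherit

  evaluate:
    needs: build-benchmarks
    uses: example-org/evaluation/.github/workflows/run-evaluation.yml@main
    with:
      image-tag: ${{ needs.build-benchmarks.outputs.image-tag }}
    secrets: inherit
```

This makes the dependency chain explicit via needs:, passes the built image tag directly instead of inferring it, and propagates failures automatically: if the benchmarks build fails, evaluation never runs and the caller run is marked failed. One caveat: cross-repository workflow_call requires the called repository to be accessible to the caller (public, or in the same organization with Actions access granted in the called repository's settings).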
