red team: edge cases — empty techniques list, empty metaprompt_plan, scorer failure cascade

## Three small edge-case issues, batched

### 1. \`techniques=[]\` silently disables injection

\`red_team_agent.py:875–877\`:

\`\`\`python
if (
    self._injection_probability > 0
    and self._techniques
    and random.random() < self._injection_probability
):
\`\`\`

\`and self._techniques\` is a truthiness check. User passing \`injection_probability=0.5, techniques=[]\` gets no injection at all, no warning. They will think their custom-techniques setup is working when it's not.

**Fix:** validate at \`__init__\`:

\`\`\`python
if injection_probability > 0 and not techniques and DEFAULT_TECHNIQUES:
    pass  # fine — falls back to DEFAULT_TECHNIQUES
elif injection_probability > 0 and techniques is not None and len(techniques) == 0:
    raise ValueError("techniques cannot be empty when injection_probability > 0")
\`\`\`

### 2. Empty \`metaprompt_plan\` produces a malformed system prompt

\`goat.py:199–200\`:

\`\`\`python
ATTACK PLAN:
{metaprompt_plan}
\`\`\`

If \`_generate_attack_plan\` returns \`""\` (LLM returned whitespace, was stripped), the attacker LLM sees a literal empty section labeled "ATTACK PLAN:". A labeled empty section is worse than no section — it tells the attacker its plan is "nothing".

**Fix:** in \`build_system_prompt\`, omit the section if \`metaprompt_plan.strip() == ""\`. Same for Crescendo.

### 3. Scorer failures cascade silently

\`red_team_agent.py:643–645\`:

\`\`\`python
except Exception as exc:
    logger.debug("Scorer failed (turn %d): %s", current_turn, exc)
    score, adaptation = 0, "continue current approach"
\`\`\`

If the scorer model is rate-limited or down, every turn scores 0, early-exit never fires, the marathon runs the full \`total_turns\` budget. Logged at DEBUG only — invisible by default.

**Fix:** track consecutive failures; bump to WARN level after N (e.g. 3) and add a span attribute \`red_team.scorer_failures: N\` so dashboards can alert.

## Acceptance

- [ ] Each of the three has a regression test
- [ ] No production-path \`Exception\` path stays at DEBUG-only logging

Discovered during PR #306 review.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

red team: edge cases — empty techniques list, empty metaprompt_plan, scorer failure cascade #333

Three small edge-case issues, batched

1. `techniques=[]` silently disables injection

2. Empty `metaprompt_plan` produces a malformed system prompt

3. Scorer failures cascade silently

Acceptance

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

red team: edge cases — empty techniques list, empty metaprompt_plan, scorer failure cascade #333

Description

Three small edge-case issues, batched

1. `techniques=[]` silently disables injection

2. Empty `metaprompt_plan` produces a malformed system prompt

3. Scorer failures cascade silently

Acceptance

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions