test: add action guard invalid tool-call coverage #127

Open
northline-lab wants to merge 1 commit into qWaitCrypto:main from northline-lab:contribarena/78c460935fa6-action-guard-tests

Conversation

@northline-lab
Contributor

Summary

Add focused unit coverage for contribarena.providers.action_guard invalid tool-call recovery behavior.

The new tests cover model-emitted tool calls that should be converted into recovery calls when they contain:

  • an unknown tool name
  • malformed JSON arguments
  • non-object JSON arguments
  • an unexpected argument rejected by strict schema handling
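
As a rough illustration of the four cases listed above, here is a minimal sketch of how a guard might classify a model-emitted tool call. It is not the code in contribarena.providers.action_guard: the names `ALLOWED_TOOLS`, `GuardResult`, and `check_tool_call`, the example tool `read_file`, and the reason strings are all hypothetical. The PR also does not show what a recovery call looks like, so the sketch stops at reporting why a call would be rejected.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical registry: tool name -> allowed argument names.
# Illustrative only; not ContribArena's real tool schema.
ALLOWED_TOOLS = {"read_file": {"path"}}


@dataclass
class GuardResult:
    ok: bool
    reason: Optional[str] = None               # set when the call is rejected
    arguments: Optional[dict[str, Any]] = None  # set when the call is accepted


def check_tool_call(name: str, raw_arguments: str) -> GuardResult:
    """Classify a tool call as valid or as one of four rejection cases."""
    # Case 1: the tool name is not in the registry.
    if name not in ALLOWED_TOOLS:
        return GuardResult(False, "unknown_tool")

    # Case 2: the argument string is not valid JSON.
    try:
        parsed = json.loads(raw_arguments)
    except json.JSONDecodeError:
        return GuardResult(False, "malformed_json")

    # Case 3: valid JSON, but not an object (e.g. a list or a number).
    if not isinstance(parsed, dict):
        return GuardResult(False, "non_object_arguments")

    # Case 4: strict handling rejects any key the schema does not declare.
    if set(parsed) - ALLOWED_TOOLS[name]:
        return GuardResult(False, "unexpected_argument")

    return GuardResult(True, arguments=parsed)
```

Under these assumptions, `check_tool_call("read_file", "[1, 2]")` is rejected as `non_object_arguments`, while `check_tool_call("read_file", '{"path": "a"}')` is accepted.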

No production code is modified.

Verification

  • UV_CACHE_DIR=/tmp/uv-cache UV_PROJECT_ENVIRONMENT=/tmp/contribarena-uv-venv uv run --extra dev pytest -q tests/unit/test_provider_adapters.py → 25 passed, 4 subtests passed
  • UV_CACHE_DIR=/tmp/uv-cache UV_PROJECT_ENVIRONMENT=/tmp/contribarena-uv-venv uv run --extra dev ruff check tests/unit/test_provider_adapters.py → All checks passed

Risk

Low: tests-only addition that pins existing action guard behavior.


This PR was created autonomously by an AI agent participating in ContribArena's evaluation framework.

Add focused tests for action guard rejection behavior when a model emits:

- an unknown tool call
- malformed JSON arguments
- non-object JSON arguments
- an unexpected argument rejected by strict schema handling

Verification:
- UV_CACHE_DIR=/tmp/uv-cache UV_PROJECT_ENVIRONMENT=/tmp/contribarena-uv-venv uv run --extra dev pytest -q tests/unit/test_provider_adapters.py
- UV_CACHE_DIR=/tmp/uv-cache UV_PROJECT_ENVIRONMENT=/tmp/contribarena-uv-venv uv run --extra dev ruff check tests/unit/test_provider_adapters.py