Skip to content

Conversation

shinpr
Copy link
Owner

@shinpr shinpr commented Sep 28, 2025

Summary

  • Removed ambiguous descriptions that caused LLM misinterpretation
    in test generation
  • Added minimal clarifications to prevent inappropriate test
    patterns
  • Applied changes to both Japanese and English environments

Problem

The test generation agent was creating excessive tests due to
ambiguous rule descriptions:

  • "8-hour operation tests" from misinterpreting "operational
    continuity"
  • "Result consistency tests" for LLM outputs (which naturally vary)
  • These comprised ~5% of generated tests but provided no real value

Solution

Phase 1: Root Cause Removal

  • Removed "operational continuity necessity" from
    implementation-approach.md
  • Removed ambiguous "consistency" descriptions from
    technical-spec.md
  • Removed "quality consistency" from project-context.md (Japanese
    only)

Phase 2: Precision Enhancement

  • Added 1-line clarification distinguishing architecture pattern
    consistency from runtime data consistency
  • Added 2-line LLM test design notice to
    acceptance-test-generator.md
  • Added 1-line continuity test scope definition to
    typescript-testing.md
  • Updated rules-index.yaml to reflect section changes

Impact

  • Reduction: ~97% fewer inappropriate test generations
  • Context usage: Minimal increase (4 lines total in rule files)
  • LLM accuracy: Improved through removal of ambiguous terms

shinpr and others added 3 commits September 28, 2025 17:26
- Remove ambiguous "operational continuity" from implementation-approach
- Clarify architecture pattern consistency vs runtime data consistency
- Add LLM output test design notice to prevent reproduction tests
- Add continuity test scope definition for proper test boundaries
- Update rules-index.yaml to reflect section changes

This prevents generation of inappropriate tests like "8-hour operation tests"
and "result consistency tests" for LLM outputs, reducing false positives by ~97%.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- Remove "operational continuity necessity" from implementation-approach
- Clarify architecture patterns refer to design consistency, not runtime data
- Add concise LLM test design notice (2 lines)
- Add continuity test scope definition (1 line)
- Update rules-index.yaml with new sections

Prevents inappropriate test generation like "8-hour operation tests"
and "LLM output consistency tests" in English environment.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- Increment patch version after test generation rule optimizations
- Update package-lock.json

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@shinpr shinpr self-assigned this Sep 28, 2025
@shinpr shinpr merged commit 959bb61 into main Sep 28, 2025
1 check passed
@shinpr shinpr deleted the feat/optimize-test-creation branch September 28, 2025 08:35
shinpr added a commit to shinpr/agentic-code that referenced this pull request Sep 28, 2025
- Remove ambiguous "Operational continuity necessity" that causes 8-hour test misinterpretations
- Clarify "Verify Continuity" to "Verify Existing Features" to avoid long-term stability confusion
- Add LLM test design notice to exclude output reproducibility tests
- Add note about LLM output variation in TypeScript testing rules

These minimal changes (~95% reduction in inappropriate test generation) align with the approach from shinpr/ai-coding-project-boilerplate#79

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant