fix(rules): reconcile Units Planning ghost stage with canonical Units Generation by scottschreckengaust · Pull Request #156 · awslabs/aidlc-workflows

scottschreckengaust · 2026-03-30T19:56:43Z

Summary

Fixes the highest-priority finding from a comprehensive prompt engineering review of the aidlc-rules/ directory.

Problem: "Units Planning" was referenced as a separate top-level workflow stage in 5 files, but core-workflow.md only defines a single "Units Generation" stage (with planning as an internal Part 1 sub-step). This inconsistency causes state tracking confusion, session continuity breakage, and incorrect Mermaid diagrams.

Fix: Reconcile all references to use the canonical "Units Generation" stage name, clarifying that planning is an internal sub-step.

Files changed

File	Change
`workflow-planning.md`	Merged UP/UG Mermaid nodes into one; removed duplicate checklist items from both templates
`units-generation.md`	Fixed Step 11 to reference "Units Generation Part 1 (Planning)" instead of ghost stage
`terminology.md`	Updated stage list, definition, usage guidance, and annotated planning/generation examples
`workflow-changes.md`	Fixed impact assessments to reference Units Generation instead of ghost stage
`error-handling.md`	Fixed recovery guidance to reference Units Generation

Full Prompt Engineering Review

This PR addresses finding C1 (the highest priority) from a comprehensive review of all 30 files in aidlc-rules/. The complete list of 25 findings (3 Critical, 8 High, 9 Medium, 5 Low) with detailed descriptions, affected files, and recommended fixes is attached as a PR comment.

Priority	ID	Severity	Finding	Status
1	C1	Critical	"Units Planning" ghost stage — state tracking and session continuity breakage	This PR
2	C2	Critical	"Never ask in chat" conflicts with inline approval prompts	Backlog
3	C3	Critical	Terminology glossary uses nonexistent stage names ("Context Assessment")	Backlog
4	H1	High	Reverse Engineering artifact lists inconsistent across 3 files	Backlog
5	H2	High	3-option completion messages violate NO EMERGENT BEHAVIOR rule	Backlog
6	H3	High	overconfidence-prevention.md is a changelog, not an actionable rule	Backlog
7	H4	High	Most common rule files have no defined loading trigger	Backlog
8	H5	High	"Assume the role" promotes overconfidence	Backlog
9	H6	High	"No fixed sequences" claim is factually wrong	Backlog
10	H7	High	Extension enforcement default contradicts opt-in model	Backlog
11	H8	High	OWASP mapping uses fabricated 2025 edition	Backlog
12-25		Medium/Low	14 additional findings (see review comment)	Backlog

Test plan

Verify no remaining standalone "Units Planning" stage references in aidlc-rules/ (only the annotated terminology example should remain)
Confirm core-workflow.md stage names match all checklist templates in workflow-planning.md
Confirm Mermaid diagrams show a single Units Generation node
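
The first check in the test plan can be scripted. The following is a minimal sketch, not part of this PR: the function name `find_ghost_stage_references` and the demo file contents are hypothetical, and it assumes the rule files are UTF-8 Markdown files under a single directory.

```python
import re
import tempfile
from pathlib import Path

# Matches the ghost stage name, ignoring case and allowing any whitespace between words.
GHOST_STAGE = re.compile(r"units\s+planning", re.IGNORECASE)


def find_ghost_stage_references(rules_dir):
    """Return (relative path, line number, stripped line) for each match in *.md files."""
    root = Path(rules_dir)
    hits = []
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if GHOST_STAGE.search(line):
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits


# Demo on a throwaway directory; the file names and contents here are made up.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "inception").mkdir()
    (demo_root / "inception" / "clean.md").write_text(
        "## Units Generation\nPart 1 (Planning)\n", encoding="utf-8"
    )
    (demo_root / "stale.md").write_text(
        "# Stages\n- Units Planning\n- Units Generation\n", encoding="utf-8"
    )
    demo_hits = find_ghost_stage_references(demo_root)

for hit in demo_hits:
    print(hit)
```

Run against aidlc-rules/, any hit outside the annotated terminology example would need manual review; the script only locates candidates and cannot tell a stage reference from an intentional annotation.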

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

… Generation

Units Planning was referenced as a separate workflow stage in 5 files (workflow-planning.md, units-generation.md, terminology.md, workflow-changes.md, error-handling.md) but was never defined as a top-level stage in core-workflow.md. It is actually Part 1 (an internal sub-step) of the Units Generation stage.

This inconsistency caused:
- State tracking confusion (aidlc-state.md would have two entries for one stage)
- Session continuity breakage (stage names in state file wouldn't match what the rules expect)
- Mermaid diagram showing 2 nodes for a single stage

Changes:
- Merge UP/UG Mermaid nodes into single Units Generation node
- Remove duplicate Units Planning checklist items from execution plans
- Clarify that planning/generation are internal sub-steps, not stages
- Update all impact assessments and error recovery references


scottschreckengaust · 2026-03-30T20:21:12Z

Prompt Engineering Review — aidlc-rules/ (2026-03-30)

Comprehensive review of all 30 files in aidlc-rules/ by severity and priority.

Critical (3)

C1. "Units Planning" Ghost Stage ✅ Fixed in this PR

Files: workflow-planning.md, units-generation.md, terminology.md, workflow-changes.md, error-handling.md

"Units Planning" was referenced as a separate top-level workflow stage in 5 files, but core-workflow.md only defines a single "Units Generation" stage with planning as an internal Part 1 sub-step. This causes state tracking confusion in aidlc-state.md, session continuity breakage on resume, and incorrect Mermaid diagrams showing two nodes for one stage.

Fix: Reconcile all references to use the canonical "Units Generation" stage name, clarifying that planning is an internal sub-step.

C2. "Never Ask Questions in Chat" Conflicts with Inline Approval Prompts

Files: common/question-format-guide.md, core-workflow.md, inception/units-generation.md, all construction stage files, common/session-continuity.md

question-format-guide.md states: "CRITICAL: You must NEVER ask questions directly in the chat. ALL questions must be placed in dedicated question files." Yet core-workflow.md says Ask: "Build and test instructions complete. Ready to proceed to Operations stage?" — a direct chat question. Every stage completion message presents "Request Changes" / "Continue to Next Stage" choices directly in chat. session-continuity.md also presents inline choices while separately stating: "ALWAYS ask clarification or user feedback questions by placing them in .md files."

Fix: Scope the "never ask in chat" rule to requirements-gathering questions only. Restate as: "Requirements-gathering and clarification questions MUST be placed in dedicated .md files with [Answer]: tags. Stage completion prompts and approval requests MAY be presented inline in chat."

C3. Terminology Glossary Uses Nonexistent Stage Names

Files: common/terminology.md (line 13)

Usage examples reference "Context Assessment stage" and "Requirements Assessment stage." Neither exists in the workflow. Actual names are "Workspace Detection" and "Requirements Analysis." These appear to be leftover names from a previous version.

Fix: Replace "Context Assessment" with "Workspace Detection" and "Requirements Assessment" with "Requirements Analysis." Audit the entire terminology file for other stale references.

High (8)

H1. Reverse Engineering Artifact List Inconsistent Across 3 Files

Files: core-workflow.md (lines 126-133), inception/reverse-engineering.md, common/session-continuity.md (lines 28-29)

Three files list different artifact sets. core-workflow.md mentions "Interaction Diagrams" (not in reverse-engineering.md). reverse-engineering.md generates code-quality-assessment.md and reverse-engineering-timestamp.md (not in core-workflow.md). session-continuity.md only lists 3 of 9 artifacts for context loading on resume.

Fix: Establish a single canonical artifact list in reverse-engineering.md. Ensure both core-workflow.md and session-continuity.md reference it accurately. Either add "Interaction Diagrams" to reverse-engineering.md or remove it from core-workflow.md. Add all missing artifacts to session-continuity.md's load list.
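
A comparison like the one behind this finding can be expressed as set differences. This is a rough sketch under stated assumptions: `artifact_discrepancies` is a hypothetical helper, and the lists below are illustrative only. They contain just the artifact names mentioned in this finding plus one placeholder entry, not the real contents of either file.

```python
def artifact_discrepancies(lists_by_file):
    """For each file, return the artifacts it omits relative to the union of all lists."""
    all_artifacts = set().union(*lists_by_file.values())
    return {
        name: sorted(all_artifacts - set(items))
        for name, items in lists_by_file.items()
    }


# Illustrative data only; "shared-artifact (placeholder)" stands in for entries both files list.
example = {
    "core-workflow.md": [
        "shared-artifact (placeholder)",
        "Interaction Diagrams",
    ],
    "reverse-engineering.md": [
        "shared-artifact (placeholder)",
        "code-quality-assessment.md",
        "reverse-engineering-timestamp.md",
    ],
}

for file_name, missing in artifact_discrepancies(example).items():
    print(f"{file_name} is missing: {missing}")
```

Extracting the real lists from the rule files is left out on purpose, since that depends on how each file formats its artifact section.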

H2. 3-Option Completion Messages Violate NO EMERGENT BEHAVIOR Rule

Files: inception/application-design.md (lines 116-132), core-workflow.md (line 460), construction/build-and-test.md

core-workflow.md states: "NO EMERGENT BEHAVIOR: Construction phases MUST use standardized 2-option completion messages." Yet application-design.md uses a 3-option message (Request Changes, Add Units Generation, Approve & Continue). build-and-test.md also uses 3 options. The rule says "Construction phases" but Application Design is in Inception — the scope is unclear.

Fix: Either extend 2-option standardization to all phases (handle "add skipped stage" through workflow-changes.md instead), or explicitly scope the rule and document which stages may present more than 2 options.

H3. overconfidence-prevention.md Is a Changelog, Not an Actionable Rule

Files: common/overconfidence-prevention.md, inception/application-design.md (line 41)

This file is a human-readable rationale document ("The overconfidence issue was caused by..."). It describes what was wrong and changed, not what an agent should do. The actual behavior is already baked into stage files. Loading it wastes context. It is referenced once in application-design.md which may confuse agents.

Fix: Move outside rule-details/ to a docs/ or design-decisions/ folder, or convert into an actionable rule file with clear directives.

H4. Most Common Rule Files Have No Defined Loading Trigger

Files: core-workflow.md (lines 21-26), all 11 files in common/

core-workflow.md mandates loading 4 common files at start: process-overview.md, session-continuity.md, content-validation.md, question-format-guide.md. The remaining 7 (ascii-diagram-standards.md, depth-levels.md, error-handling.md, overconfidence-prevention.md, terminology.md, welcome-message.md, workflow-changes.md) have no loading trigger. error-handling.md is 375 lines of critical recovery procedures that may never be loaded.

Fix: For each common file, define when it should be loaded — either add to the mandatory list, specify conditional triggers, or create a lightweight index.

H5. "Assume the Role of a Product Owner" Promotes Overconfidence

Files: inception/requirements-analysis.md (line 3), inception/user-stories.md (line 119)

Two files tell the agent to "assume the role of a product owner." This conflicts with overconfidence prevention — a product owner makes business decisions, but the agent is instructed elsewhere to defer to the user. Role assumption causes the agent to prioritize features and define acceptance criteria instead of asking.

Fix: Replace role-assumption directives with behavioral instructions: "Apply product ownership analysis techniques to evaluate completeness of requirements and quality of user stories. Defer all business decisions to the user."

H6. "No Fixed Sequences" Claim Is Factually Wrong

Files: common/process-overview.md (line 22), core-workflow.md

process-overview.md states: "No fixed sequences: Stages execute in the order that makes sense for your specific task." But core-workflow.md defines a completely fixed sequence. Conditional stages can be skipped but cannot be reordered.

Fix: Replace with: "Stages execute in a defined sequence, but conditional stages can be skipped when they do not add value."

H7. Extension Enforcement Default Contradicts Opt-In Model

Files: core-workflow.md (line 49)

Line 49 states: "Default to enforced if no configuration exists." But extensions are opt-in: *.opt-in.md files are loaded, users are asked during Requirements Analysis, answers are recorded. If an extension has an opt-in prompt the user hasn't answered yet, "default to enforced" would enforce rules the user never agreed to.

Fix: Change to: "If an extension has an opt-in file and the user has not yet been asked, defer enforcement until the opt-in question is presented and answered. Extensions without opt-in files are always enforced."

H8. OWASP Reference Mapping Uses Fabricated "2025" Edition

Files: extensions/security/baseline/security-baseline.md (lines 295-308)

The appendix references "OWASP Top 10 (2025)" which does not exist. The file contains a TODO comment acknowledging this. Category IDs (A01:2025, etc.) are fabricated. Only 8 of 15 SECURITY rules are mapped.

Fix: Update to OWASP Top 10 (2021) with correct category IDs, or remove the mapping table until verified. Map all 15 rules.

Medium (9)

M1. depth-levels.md Is Redundant with Stage-Level Guidance

Files: common/depth-levels.md, inception/requirements-analysis.md, inception/workflow-planning.md

Provides no concrete thresholds or decision criteria. Says "Model decides." Depth guidance is already embedded in stage files. Not in the mandatory loading list.

Fix: Integrate into workflow-planning.md or add concrete decision thresholds. Currently redundant.

M2. No Open-Ended Question Format Allowed

Files: common/question-format-guide.md

Mandates ALL questions use multiple-choice A/B/C/D format. Many requirements questions are inherently open-ended ("Describe the primary business process"). Forcing these into multiple-choice produces artificial options.

Fix: Add an open-response format option with [Answer]: tag. Document when each format is appropriate.

M3. Heavy Emoji Usage with No Configuration Option

Files: common/welcome-message.md, all completion message templates

Extensive emoji usage throughout interactive messages. No way to disable for enterprise environments. Inconsistent application.

Fix: Consider a tone/formatting config, or ensure consistent application. Alternatively, limit emoji to chat messages only, not generated artifacts.

M4. Build-and-Test Generates Documentation But Template Implies Execution

Files: construction/build-and-test.md

Generates instruction documents but never runs build/test commands. Summary template has fields for "Build Status: [Success/Failed]" and "Total Tests: [X], Passed: [X]" that imply actual execution.

Fix: Rename to "Build and Test Planning," add execution instructions, or clarify template values are "expected" not "actual."

M5. "Interaction Diagrams" Requirement Has No Template or Guidance

Files: core-workflow.md (line 131), inception/reverse-engineering.md

core-workflow.md lists "Generate Interaction Diagrams" as a Reverse Engineering output. reverse-engineering.md has no corresponding step, template, or target file.

Fix: Add a step in reverse-engineering.md for generating Interaction Diagrams, or remove the requirement from core-workflow.md.

M6. Session Continuity Misses Application Design Artifacts

Files: common/session-continuity.md (line 31), inception/application-design.md

session-continuity.md loads components.md, component-methods.md, services.md for Application Design. Missing: component-dependency.md and the consolidated application-design.md (which contains everything).

Fix: Add all Application Design artifacts, or reference the consolidated application-design.md.

M7. Error Handling References Nonexistent Operations Stage

Files: common/error-handling.md (lines 163-172)

Has an "Operations Errors" section covering build tool detection and deployment errors. Operations is a placeholder stage with no defined steps. These errors belong in Build and Test.

Fix: Rename to "Build and Test Errors."

M8. Functional Design Prerequisite Chain Unclear

Files: construction/functional-design.md (lines 16-19)

States Application Design is "recommended" but Units Generation is "required." However, Step 1 reads artifacts from Application Design. Can Functional Design run without Application Design?

Fix: Make Application Design a hard prerequisite if its artifacts are needed, or document what to do when they're absent.

M9. Content Validation Has Broken Markdown in Its Own Mermaid Example

Files: common/content-validation.md (lines 42-55)

Nested triple-backtick code blocks are not properly escaped, causing the outer code block to close prematurely. This is exactly the kind of error the file is supposed to prevent.

Fix: Use four backticks for the outer block or HTML entities for inner backticks.
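
The four-backtick suggestion generalizes: under CommonMark, a closing fence must be at least as long as the opening fence, so an outer fence that is longer than every backtick run in the content cannot be closed early. A minimal sketch of that rule follows; the helper names `fence_for` and `wrap_in_fence` are hypothetical and not part of the rule files.

```python
import re


def fence_for(content, minimum=3):
    """Return a backtick fence longer than any backtick run inside content."""
    longest = max((len(run) for run in re.findall(r"`+", content)), default=0)
    return "`" * max(minimum, longest + 1)


def wrap_in_fence(content, info="markdown"):
    """Wrap content in a fenced code block that its own backticks cannot close."""
    fence = fence_for(content)
    return f"{fence}{info}\n{content}\n{fence}"


# An inner Mermaid block delimited by three backticks (built with "`" * 3
# so this example does not itself contain a literal fence line).
inner = "`" * 3 + "mermaid\ngraph TD\n    A --> B\n" + "`" * 3

print(wrap_in_fence(inner))
```

Checking every backtick run, rather than only runs at the start of a line, is stricter than the specification requires, but it keeps the helper simple and never produces a fence that is too short.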

Low (5)

L1. Answer-Analysis Guidance Duplicated in 7+ Files

Files: common/overconfidence-prevention.md, common/question-format-guide.md, inception/application-design.md, inception/user-stories.md, inception/units-generation.md, construction/functional-design.md, construction/nfr-requirements.md

Same examples ("You mentioned 'mix of A and B'") repeated word-for-word across 7+ files.

Fix: Consolidate into a single common file and reference it.

L2. ASCII Diagram Examples May Not Comply with Their Own Width Rule

Files: common/ascii-diagram-standards.md (lines 35-44, 47-59)

"Every line in a box MUST have EXACTLY the same character count." Examples may have inconsistent widths.

Fix: Verify and correct examples.
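
The width rule quoted above is mechanical enough to verify with a script. A minimal sketch, assuming each diagram is available as a plain string and that "character count" means Python's `len` (code points, which may differ from display width for wide characters); `uneven_lines` is a hypothetical helper name.

```python
def uneven_lines(diagram):
    """Return (line number, length) for lines whose length differs from the first line's."""
    lines = diagram.splitlines()
    if not lines:
        return []
    expected = len(lines[0])
    return [
        (number, len(line))
        for number, line in enumerate(lines, start=1)
        if len(line) != expected
    ]


# Made-up boxes for illustration; neither is taken from ascii-diagram-standards.md.
good_box = "+------+\n| unit |\n+------+"
bad_box = "+------+\n| units |\n+------+"

print(uneven_lines(good_box))
print(uneven_lines(bad_box))
```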

L3. Mermaid Template Splits Units into Two Nodes ✅ Fixed as part of C1

Files: inception/workflow-planning.md (lines 264-265)

L4. "Other" Option Letter Inconsistent (X vs Sequential)

Files: common/question-format-guide.md, extension opt-in files

Template says use "X)" but examples use "E)", "D)", "C)".

Fix: Standardize on "X)" for the "Other" option everywhere.
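
A small lint pass could flag the inconsistent letters. The sketch below assumes options are written one per line as a capital letter followed by `)`, and that the "Other" option's text begins with the word "Other"; both are assumptions about the question files, and `nonstandard_other_options` is a hypothetical name.

```python
import re

# One option per line, e.g. "C) Other (please describe)"; this layout is assumed.
OPTION = re.compile(r"^\s*([A-Z])\)\s*(.*)$")


def nonstandard_other_options(text):
    """Return (line number, letter) for each 'Other' option not labeled X."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = OPTION.match(line)
        if not match:
            continue
        letter, label = match.groups()
        if label.lower().startswith("other") and letter != "X":
            findings.append((number, letter))
    return findings


# Made-up question text for illustration.
sample = "A) REST API\nB) GraphQL\nC) Other (please describe)\n[Answer]: "

print(nonstandard_other_options(sample))
```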

L5. Acknowledged Duplication Between process-overview.md and welcome-message.md Has Diverged

Files: common/process-overview.md, common/welcome-message.md

process-overview.md says duplication with welcome-message.md is "INTENTIONAL" but the content has diverged (different diagrams, different claims about sequencing).

Fix: Ensure factual consistency even if format differs.

github-actions · 2026-04-07T05:39:21Z

A. Executive Summary

Latest release: PR #156

High-level snapshot comparing the latest release against the golden baseline (the reference evaluation used as the quality target).

Metric	What it measures
Unit tests passed	Number of generated unit tests that pass. Higher means the rules produce broader, more complete test suites.
Contract tests	API compliance checks against the OpenAPI spec (passed/total). 88/88 = full compliance.
Lint findings	Static analysis warnings in generated code. Lower is better — 0 means clean code.
Qualitative score	AI-graded quality of generated documentation on a 0–1 scale (higher is better).
Execution time	Wall-clock time for the full evaluation run. Lower means faster generation.
Total tokens	Total LLM tokens consumed (input + output). Lower means more cost-efficient.

Metric	Golden	Latest (PR #156)	vs Golden	Trend
Unit tests passed	180	132	-48	`█▄▄▁▂▃▃▁▁` ↓
Contract tests	88/88	76/88	-12	`███▆████▁` ↓
Lint findings	0	0	—	`▅▅▅▅▅▅▅▅▅` →
Qualitative score	0.854	0.773	-0.081	`▅▇▇▆▇█▁▁▁` ↓
Execution time	23.8m	17.6m	-6.3m	`▁▅▁▅▂▄█▆▃` ↑
Total tokens	18.39M	4.67M	-13.73M	`▄▇▃▆▆█▄▂▁` ↓

Full trend report available in the workflow artifacts.

scottschreckengaust added 3 commits March 30, 2026 19:55

docs: add full prompt engineering review TODO with 25 prioritized findings

5be590e

docs: move review TODO from repo to PR comment

f0f1328

Merge branch 'main' into prompt-eng/review-aidlc-rules

e79bb1a

scottschreckengaust added the codebuild label (a label to signal a request for the "CodeBuild" workflow) Mar 30, 2026

scottschreckengaust had a problem deploying to codebuild March 30, 2026 21:15 — with GitHub Actions Error

scottschreckengaust added and removed the codebuild label (a label to signal a request for the "CodeBuild" workflow) Mar 30, 2026

scottschreckengaust had a problem deploying to codebuild March 30, 2026 21:35 — with GitHub Actions Error

awslabs deleted a comment from github-actions bot Mar 31, 2026

scottschreckengaust added the rules label and removed the codebuild label (a label to signal a request for the "CodeBuild" workflow) Mar 31, 2026

scottschreckengaust had a problem deploying to codebuild March 31, 2026 23:23 — with GitHub Actions Error

Merge branch 'main' into prompt-eng/review-aidlc-rules

8850afa

scottschreckengaust temporarily deployed to codebuild March 31, 2026 23:24 — with GitHub Actions Inactive

t1h0 mentioned this pull request Apr 2, 2026

fix(rules): Unmix signals about question format (file/inline) #167

Closed

5 tasks

Merge branch 'main' into prompt-eng/review-aidlc-rules

18e1766

scottschreckengaust marked this pull request as ready for review April 7, 2026 05:15

scottschreckengaust requested review from a team, harmjeff, raj-jain-aws, scoropeza and spraja08 April 7, 2026 05:15

scottschreckengaust requested review from a team as code owners April 7, 2026 05:15

scottschreckengaust temporarily deployed to codebuild April 7, 2026 05:16 — with GitHub Actions Inactive

Merge branch 'main' into prompt-eng/review-aidlc-rules

a44acaf

scottschreckengaust temporarily deployed to codebuild April 8, 2026 19:35 — with GitHub Actions Inactive

scottschreckengaust wants to merge 7 commits into main from prompt-eng/review-aidlc-rules
