78 changes: 51 additions & 27 deletions commands/learn-eval.md
@@ -1,10 +1,10 @@
---
description: Extract reusable patterns from the session, self-evaluate quality before saving, and determine the right save location (Global vs Project).
description: "Extract reusable patterns from the session, self-evaluate quality before saving, and determine the right save location (Global vs Project)."
---

# /learn-eval - Extract, Evaluate, then Save

Extends `/learn` with a quality gate and save-location decision before writing any skill file.
Extends `/learn` with a quality gate, save-location decision, and knowledge-placement awareness before writing any skill file.

## What to Extract

@@ -51,41 +51,65 @@ origin: auto-extracted
[Trigger conditions]
```

5. **Self-evaluate before saving** using this rubric:
5. **Quality gate — Checklist + Holistic verdict**

| Dimension | 1 | 3 | 5 |
|-----------|---|---|---|
| Specificity | Abstract principles only, no code examples | Representative code example present | Rich examples covering all usage patterns |
| Actionability | Unclear what to do | Main steps are understandable | Immediately actionable, edge cases covered |
| Scope Fit | Too broad or too narrow | Mostly appropriate, some boundary ambiguity | Name, trigger, and content perfectly aligned |
| Non-redundancy | Nearly identical to another skill | Some overlap but unique perspective exists | Completely unique value |
| Coverage | Covers only a fraction of the target task | Main cases covered, common variants missing | Main cases, edge cases, and pitfalls covered |
### 5a. Required checklist (verify by actually reading files)

- Score each dimension 1–5
- If any dimension scores 1–2, improve the draft and re-score until all dimensions are ≥ 3
- Show the user the scores table and the final draft
Execute **all** of the following before evaluating the draft:

6. Ask user to confirm:
- Show: proposed save path + scores table + final draft
- Wait for explicit confirmation before writing
- [ ] Grep `~/.claude/skills/` and `.claude/skills/` by keyword to check for content overlap
- [ ] Check MEMORY.md (both project and global) for overlap
- [ ] Consider whether appending to an existing skill would suffice
- [ ] Confirm this is a reusable pattern, not a one-off fix
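The overlap checks above can be sketched as a small shell helper. This is a minimal sketch, not part of the command spec: the keyword, directory layout, and example paths are illustrative assumptions.

```shell
# check_overlap: recursively grep each given location for a keyword,
# listing files that contain it; locations that do not exist are skipped.
check_overlap() {
  keyword="$1"
  shift
  for loc in "$@"; do
    # Only search locations that actually exist (project dirs may be absent).
    [ -e "$loc" ] && grep -ril "$keyword" "$loc"
  done
  return 0
}

# Example with a hypothetical keyword: check global skills, project skills,
# and both MEMORY.md files in one pass.
check_overlap "exponential backoff" \
  ~/.claude/skills .claude/skills \
  ~/.claude/MEMORY.md .claude/MEMORY.md
```

Any files the helper lists should be read before choosing a verdict, since overlap may point to an Absorb rather than a Drop.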

7. Save to the determined location
### 5b. Holistic verdict

## Output Format for Step 5 (scores table)
Synthesize the checklist results and draft quality, then choose **one** of the following:

| Dimension | Score | Rationale |
|-----------|-------|-----------|
| Specificity | N/5 | ... |
| Actionability | N/5 | ... |
| Scope Fit | N/5 | ... |
| Non-redundancy | N/5 | ... |
| Coverage | N/5 | ... |
| **Total** | **N/25** | |
| Verdict | Meaning | Next Action |
|---------|---------|-------------|
| **Save** | Unique, specific, well-scoped | Proceed to Step 6 |
| **Improve then Save** | Valuable but needs refinement | List improvements → revise → re-evaluate (once) |
| **Absorb into [X]** | Should be appended to an existing skill | Show target skill and additions → Step 6 |
| **Drop** | Trivial, redundant, or too abstract | Explain reasoning and stop |

**Guideline dimensions** (informing the verdict, not scored):

- **Specificity & Actionability**: Contains code examples or commands that are immediately usable
- **Scope Fit**: Name, trigger conditions, and content are aligned and focused on a single pattern
- **Uniqueness**: Provides value not covered by existing skills (informed by checklist results)
- **Reusability**: Realistic trigger scenarios exist in future sessions

6. **Verdict-specific confirmation flow**

- **Save**: Present save path + checklist results + 1-line verdict rationale + full draft → save after user confirmation
- **Improve then Save**: List the required improvements, revise, and re-evaluate once; present the revised draft with updated checklist results and verdict, then follow that verdict (saving only after user confirmation)
- **Absorb into [X]**: Present target path + additions (diff format) + checklist results + verdict rationale → append after user confirmation
- **Drop**: Show checklist results + reasoning only (no confirmation needed)

7. Save / Absorb to the determined location

## Output Format for Step 5

```
### Checklist
- [x] skills/ grep: no overlap (or: overlap found → details)
- [x] MEMORY.md: no overlap (or: overlap found → details)
- [x] Existing skill append: new file appropriate (or: should append to [X])
- [x] Reusability: confirmed (or: one-off → Drop)

### Verdict: Save / Improve then Save / Absorb into [X] / Drop

**Rationale:** (1-2 sentences explaining the verdict)
```

## Design Rationale

This version replaces the previous 5-dimension numeric scoring rubric (Specificity, Actionability, Scope Fit, Non-redundancy, Coverage scored 1-5) with a checklist-based holistic verdict system. Modern frontier models (Opus 4.6+) have strong contextual judgment — forcing rich qualitative signals into numeric scores loses nuance and can produce misleading totals. The holistic approach lets the model weigh all factors naturally, producing more accurate save/drop decisions while the explicit checklist ensures no critical check is skipped.

## Notes

- Don't extract trivial fixes (typos, simple syntax errors)
- Don't extract one-time issues (specific API outages, etc.)
- Focus on patterns that will save time in future sessions
- Keep skills focused — one pattern per skill
- If coverage is incomplete (main cases present but common variants or pitfalls missing), add them before saving
- When the verdict is Absorb, append to the existing skill rather than creating a new file