fix: scope OpenSpec sentinel per-change to prevent stale task queue by vishnujayvel · Pull Request #151 · asklokesh/loki-mode

vishnujayvel · 2026-04-06T21:15:45Z

Problem

When using OpenSpec with Loki Mode across multiple change proposals in the same repository, the OpenSpec task queue from a previous change persists into subsequent Loki runs -- even when the new run targets a completely different change or uses no --openspec flag at all.

This causes agents to silently work on wrong tasks, wasting tokens and creating incorrect PRs.

Root cause (3 parts)

Boolean sentinel: .loki/queue/.openspec-populated is a touch file -- it knows whether OpenSpec tasks were loaded, but not which change they came from. populate_openspec_queue() in run.sh (line 8744) checks only [[ -f ".loki/queue/.openspec-populated" ]].
No cleanup between runs: The CLI (autonomy/loki) overwrites openspec-tasks.json when --openspec is provided, but does NOT clear the sentinel or remove stale tasks from pending.json. When --openspec is absent entirely, nothing cleans up leftover OpenSpec state.
Task ID collisions: All OpenSpec changes produce IDs in openspec-N.M format (e.g., openspec-1.1, openspec-2.3). The deduplication check in populate_openspec_queue() uses these IDs, so when switching changes, new tasks with colliding IDs are silently blocked from loading.
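The collision described in the third root cause can be modelled in a few lines. This is a minimal sketch, not the code in run.sh: `load_tasks` is a hypothetical stand-in for the ID-based deduplication, and the task titles are invented for illustration.

```python
def load_tasks(pending, new_tasks):
    """Append tasks whose IDs are not already queued (hypothetical ID-based dedup)."""
    seen = {t["id"] for t in pending}
    for task in new_tasks:
        if task["id"] not in seen:
            pending.append(task)
            seen.add(task["id"])
    return pending


# A task left over from change A (title is illustrative only).
pending = [{"id": "openspec-1.1", "title": "Resize window", "source": "openspec"}]
# Change B produces the same ID for an unrelated task.
change_b = [{"id": "openspec-1.1", "title": "Add settings menu", "source": "openspec"}]

load_tasks(pending, change_b)
titles = [t["title"] for t in pending]
print(titles)  # the change B task never makes it into the queue
```

Because the dedup key is the ID alone, the stale task wins and the new one is dropped without any warning.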

Reproduction

# Run 1: Load 55 tasks from change A
loki start --openspec ./openspec/changes/window-management-redesign

# Run 2: Different task entirely, no --openspec
loki start /tmp/settings-menu-prd.md
# BUG: 55 stale window-management tasks still in pending.json, served to agents

Impact

Severity: High -- agents silently work on wrong tasks
Frequency: Every time a user runs Loki more than once in a repo that has used --openspec
Workaround: Manually delete .loki/queue/, .loki/openspec-tasks.json, and .loki/openspec/ between runs

Solution

Three-pronged fix addressing each root cause:

Fix 1: Scoped sentinel (`autonomy/run.sh`)

The sentinel file now stores the full change path instead of being an empty marker. populate_openspec_queue() reads the stored path and compares it against the current $OPENSPEC_CHANGE_PATH:

Match: Skip (same change, already populated)
Mismatch: Purge all source: "openspec" tasks from pending.json, then repopulate
Missing: Fresh population (first run)

# Before (boolean check)
if [[ -f ".loki/queue/.openspec-populated" ]]; then
    return 0  # always skips, regardless of which change
fi
touch ".loki/queue/.openspec-populated"

# After (scoped check)
stored_change="$(cat ".loki/queue/.openspec-populated" 2>/dev/null)"
if [[ "$stored_change" == "$OPENSPEC_CHANGE_PATH" ]]; then
    return 0  # only skips if same change
fi
# ... purge stale tasks, then repopulate ...
echo "$OPENSPEC_CHANGE_PATH" > ".loki/queue/.openspec-populated"
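The three outcomes (match, mismatch, missing) can also be expressed as a small decision function. The following Python is only a model of the shell logic above, under the assumption that the sentinel holds the change path on a single line; `sentinel_action` and its return values are hypothetical names.

```python
import tempfile
from pathlib import Path


def sentinel_action(sentinel: Path, current_change: str) -> str:
    """Return 'populate', 'skip', or 'purge_and_reload' for the current change."""
    if not sentinel.exists():
        return "populate"  # first run: nothing recorded yet
    stored = sentinel.read_text().strip()
    if stored == current_change:
        return "skip"  # same change, already populated
    return "purge_and_reload"  # change switched


with tempfile.TemporaryDirectory() as tmp:
    sentinel = Path(tmp) / ".openspec-populated"
    first = sentinel_action(sentinel, "changes/A")   # no sentinel yet
    sentinel.write_text("changes/A\n")
    same = sentinel_action(sentinel, "changes/A")
    switched = sentinel_action(sentinel, "changes/B")

print(first, same, switched)
```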

Fix 2: Stale state cleanup (`autonomy/loki`)

Two new cleanup paths in cmd_start():

When --openspec IS provided: Clear the sentinel before running the adapter, so run.sh will repopulate for the current change
When --openspec is NOT provided: Proactively remove all leftover OpenSpec artifacts (sentinel, tasks JSON, normalized PRD, delta context directory) and purge any stale OpenSpec tasks from pending.json

Fix 3: Change-scoped task IDs (`autonomy/openspec-adapter.py`)

parse_tasks() now accepts a change_name parameter and produces IDs in the format openspec-{change_name}-N.M instead of openspec-N.M. This prevents cross-change collisions at the deduplication layer.

# Before
task_id = f"openspec-{task_id_num}"  # openspec-1.1

# After
task_id = f"openspec-{change_name}-{task_id_num}"  # openspec-add-dark-mode-1.1
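As a self-contained illustration of the new ID scheme, including the empty-name fallback described under Backward compatibility, a sketch might look like this. `make_task_id` is a hypothetical helper; the real `parse_tasks()` may build the ID differently.

```python
def make_task_id(task_id_num: str, change_name: str = "") -> str:
    """Build a task ID, scoping it to the change when a name is available."""
    if change_name:
        return f"openspec-{change_name}-{task_id_num}"
    # No change context: keep the old unscoped format as a fallback.
    return f"openspec-{task_id_num}"


scoped = make_task_id("1.1", "add-dark-mode")
unscoped = make_task_id("1.1")
print(scoped, unscoped)
```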

Files changed (5)

| File | Lines | Change |
| --- | --- | --- |
| `autonomy/run.sh` | +33/-5 | Sentinel stores change path; compares on read; purges stale tasks on mismatch |
| `autonomy/loki` | +30/-0 | Clears sentinel before adapter; full OpenSpec cleanup when `--openspec` absent |
| `autonomy/openspec-adapter.py` | +10/-3 | `parse_tasks()` accepts `change_name`, scopes task IDs per-change |
| `tests/test_openspec_adapter.py` | +13/-5 | Updated `test_task_ids_hierarchical` for new ID format |
| `skills/openspec-integration.md` | +8/-4 | Updated documentation examples with new ID format |

Backward compatibility

The purge strategy filters by source == "openspec" (a metadata field), not by ID prefix. This means:

Old-format tasks (openspec-N.M) are correctly purged
New-format tasks (openspec-{name}-N.M) are correctly purged
No ID format parsing is needed during purge

When change_name is empty (e.g., direct parse_tasks() call without context), the old openspec-N.M format is preserved as a fallback.
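Filtering on the metadata field rather than the ID is what makes the purge format-agnostic. A minimal sketch, assuming each task is a JSON object with a `source` key (the `prd` source value and the task IDs below are invented for the example):

```python
def purge_openspec(tasks):
    """Drop every task whose source is 'openspec', whatever its ID looks like."""
    return [t for t in tasks if t.get("source") != "openspec"]


queue = [
    {"id": "openspec-1.1", "source": "openspec"},                # old ID format
    {"id": "openspec-add-dark-mode-1.1", "source": "openspec"},  # new ID format
    {"id": "prd-7", "source": "prd"},                            # hypothetical non-OpenSpec task
]

kept = purge_openspec(queue)
print([t["id"] for t in kept])
```

Both ID formats are removed by the same predicate, and tasks from other sources are left alone.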

Test plan

bash -n autonomy/run.sh -- shell syntax valid
bash -n autonomy/loki -- shell syntax valid
python3 -c "import ast; ast.parse(...)" -- Python syntax valid
pytest tests/test_openspec_adapter.py -- 28/28 tests pass
test_task_ids_hierarchical updated and passes with new ID format
Manual: loki start --openspec ./changes/A then loki start --openspec ./changes/B -- verify queue purged and repopulated
Manual: loki start --openspec ./changes/A then loki start prd.md (no --openspec) -- verify all OpenSpec artifacts cleaned up

Related

Discussion: OpenSpec adapter loads stale task queue across runs — .openspec-populated sentinel not scoped per-change #150
Same sentinel pattern exists for BMAD (.bmad-populated) and MiroFish (.mirofish-populated) -- may need similar fix in future
BUG-RUN-005 deduplication by task ID exacerbated this bug due to ID collisions

Generated with Claude Code

github-actions · 2026-04-06T21:15:57Z

All contributors have signed the CLA. Thank you.
Posted by the CLA Assistant Lite bot.

vishnujayvel · 2026-04-06T21:22:30Z

I have read the CLA Document and I hereby sign the CLA

vishnujayvel · 2026-04-06T21:36:20Z

Decision Tree: Sentinel Logic Across All Scenarios

This diagram shows how the fix handles every combination of --openspec presence, sentinel state, and restart conditions. Each leaf node shows the outcome and why it's correct.

flowchart TD
    START["loki start invoked"] --> HAS_FLAG{"--openspec\nprovided?"}

    %% === No --openspec branch ===
    HAS_FLAG -->|No| CLEANUP["CLI cleanup block\n(autonomy/loki:941-965)"]
    CLEANUP --> DEL_SENTINEL["Delete sentinel\nDelete openspec-tasks.json\nDelete openspec/ dir"]
    DEL_SENTINEL --> PURGE_PENDING["Purge source:openspec\nfrom pending.json"]
    PURGE_PENDING --> RUN_SH_NO["run.sh starts\npopulate_openspec_queue()"]
    RUN_SH_NO --> CHECK_TASKS_FILE{"openspec-tasks.json\nexists?"}
    CHECK_TASKS_FILE -->|"No (deleted)"| SKIP_CLEAN["Return early\n--- No stale tasks served ---"]

    %% === --openspec provided branch ===
    HAS_FLAG -->|Yes| EXPORT["Export OPENSPEC_CHANGE_PATH\nRun adapter (regenerates artifacts)"]
    EXPORT --> RUN_SH_YES["run.sh starts\npopulate_openspec_queue()"]
    RUN_SH_YES --> CHECK_TASKS_YES{"openspec-tasks.json\nexists?"}
    CHECK_TASKS_YES -->|No| SKIP_NO_TASKS["Return early\n(adapter error?)"]
    CHECK_TASKS_YES -->|Yes| CHECK_SENTINEL{"Sentinel file\nexists?"}

    %% --- No sentinel (first ever run) ---
    CHECK_SENTINEL -->|No| POPULATE["Populate pending.json\nfrom openspec-tasks.json"]
    POPULATE --> WRITE_SENTINEL["Write change path\nto sentinel"]
    WRITE_SENTINEL --> DONE_FRESH["--- Fresh population done ---"]

    %% --- Sentinel exists ---
    CHECK_SENTINEL -->|Yes| READ_SENTINEL["Read stored path\nfrom sentinel"]
    READ_SENTINEL --> COMPARE{"Stored path ==\ncurrent path?"}

    %% Same change (crash-restart)
    COMPARE -->|"Match\n(same change)"| SKIP_RESUME["Skip repopulation\n--- Progress preserved ---\nCompleted tasks stay gone\nfrom pending.json"]

    %% Different change
    COMPARE -->|"Mismatch\n(change switched)"| PURGE_OLD["Purge all source:openspec\ntasks from pending.json"]
    PURGE_OLD --> REPOPULATE["Repopulate from new\nopenspec-tasks.json"]
    REPOPULATE --> UPDATE_SENTINEL["Overwrite sentinel\nwith new path"]
    UPDATE_SENTINEL --> DONE_SWITCH["--- Clean switch done ---\nOld tasks gone, new loaded"]

    %% Styling
    style SKIP_CLEAN fill:#2d6a2d,color:#fff
    style DONE_FRESH fill:#2d6a2d,color:#fff
    style SKIP_RESUME fill:#2d6a2d,color:#fff
    style DONE_SWITCH fill:#2d6a2d,color:#fff
    style SKIP_NO_TASKS fill:#8b6914,color:#fff

Scenario Coverage Matrix

| # | Scenario | Sentinel before | Sentinel after | pending.json effect | Correct? |
| --- | --- | --- | --- | --- | --- |
| 1 | First run with `--openspec A` | Missing | `path/to/A` | 55 tasks loaded | Yes |
| 2 | Crash-restart, same `--openspec A` | `path/to/A` | `path/to/A` (unchanged) | Untouched -- 35 remaining tasks preserved, 20 completed stay gone | Yes |
| 3 | Switch to `--openspec B` | `path/to/A` | `path/to/B` | Old A tasks purged, new B tasks loaded | Yes |
| 4 | No `--openspec` after previous run | `path/to/A` | Deleted | All openspec tasks purged from queue | Yes |
| 5 | No `--openspec`, never used openspec | Missing | Missing | No-op (openspec-tasks.json doesn't exist) | Yes |
| 6 | Direct `run.sh` invocation (bypass CLI) | Any | Handled by run.sh | Scoped comparison still works standalone | Yes |

Why the sentinel is NOT deleted when `--openspec` is provided

The sentinel doubles as a progress checkpoint. On crash-restart (scenario 2):

Run 1: 55 tasks loaded → 20 completed (removed from pending.json) → crash
Run 2: Same --openspec flag
  - Sentinel matches → skip repopulation
  - pending.json still has 35 remaining tasks
  - The 20 completed tasks are NOT re-added

If we deleted the sentinel in the CLI (our earlier approach), run.sh would repopulate from openspec-tasks.json (which still lists all 55 as pending). The 20 completed tasks would be re-queued because their IDs are no longer in pending.json's dedup set. Agents would redo completed work.
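The arithmetic of the crash-restart case can be checked with a short simulation. This is a sketch under stated assumptions: the task IDs are hypothetical, and `repopulate` models dedup against pending.json only, as described above.

```python
# 55 tasks listed in openspec-tasks.json (IDs are illustrative).
all_ids = [f"openspec-A-{n}" for n in range(1, 56)]
# 20 tasks completed before the crash, so 35 remain in pending.json.
pending = all_ids[20:]


def repopulate(pending_ids, tasks_file_ids):
    """Re-add every task from the tasks file that is not currently pending."""
    seen = set(pending_ids)
    return pending_ids + [i for i in tasks_file_ids if i not in seen]


# Sentinel deleted by the CLI: run.sh repopulates, completed work comes back.
after_delete = repopulate(list(pending), all_ids)
# Sentinel kept and matching: repopulation is skipped.
after_match = list(pending)

print(len(after_delete), len(after_match))
```

With the sentinel deleted the queue grows back to all 55 tasks, while the matching sentinel leaves the 35 remaining tasks untouched.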

Task ID collision prevention

Old format: openspec-1.1 (collides across changes)
New format: openspec-{change-name}-1.1 (scoped per change)

Even if the sentinel logic were bypassed somehow, ID-level scoping prevents cross-change task confusion at the dedup layer.

asklokesh · 2026-04-08T13:31:25Z

Hey @vishnujayvel -- great contribution! The root cause analysis is thorough, the three-pronged fix is well-structured, and the decision tree comment you posted is genuinely excellent documentation. The crash-restart fix in the second commit shows good attention to edge cases. Thank you for taking this on.

That said, I found a few issues during review that I'd like addressed before merging. Tagging you to take a look.

Requested Changes

1. Purge only targets pending.json, misses completed/in-progress (bug)

The Python purge blocks only filter pending.json. If OpenSpec tasks were already moved to completed.json or in-progress.json during a previous run, those ghost entries survive the change switch. Status reports and progress tracking would show phantom completed tasks from a completely different change.

The purge needs to clean all three queues (pending, completed, in-progress), not just pending.

2. No way to reload tasks after editing tasks.md for the same change (design gap)

The CLI always re-runs the adapter (regenerating openspec-tasks.json), then run.sh checks the sentinel, sees the same path, and skips loading. If a user edits tasks.md within the same change directory and re-runs loki start --openspec ./changes/A, the adapter produces updated output but it never gets loaded into pending.json. Edited tasks are silently ignored.

Your scenario matrix (scenario 2) accounts for crash-restart, but not intentional re-runs after editing. Consider either:

Adding a --force-reload flag that deletes the sentinel before run.sh starts
Storing a hash of openspec-tasks.json alongside the path in the sentinel, so content changes are detected

3. `2>/dev/null` silently swallows purge failures (serious)

Both purge blocks redirect stderr to /dev/null. If python3 isn't on PATH, or pending.json contains malformed JSON, the purge silently fails. Stale tasks remain in the queue and agents work on wrong tasks -- the exact bug this PR is fixing. The safety net has a hole in it.

At minimum, let errors through to stderr so failures are visible:

# Instead of:
..." 2>/dev/null

# Use:
..." || log_warn "OpenSpec queue cleanup encountered an error"
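The same principle, written out in Python for illustration: a failed purge should produce a visible warning rather than disappear. `purge_or_warn` is a hypothetical helper and not part of the PR; it assumes the queue is a JSON array of task objects.

```python
import json
import logging

log = logging.getLogger("openspec")


def purge_or_warn(raw_json: str):
    """Return the purged task list, or None (with a warning) if parsing fails."""
    try:
        tasks = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        # Surface the failure instead of discarding it.
        log.warning("OpenSpec queue cleanup encountered an error: %s", exc)
        return None
    return [t for t in tasks if t.get("source") != "openspec"]


ok = purge_or_warn('[{"id": "openspec-1.1", "source": "openspec"}]')
failed = purge_or_warn("{not valid json")
print(ok, failed)
```

A caller can then treat `None` as "purge did not happen" and refuse to continue with a possibly stale queue.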

4. Duplicated purge logic will diverge

The Python purge snippet appears in both autonomy/loki (lines 949-963) and autonomy/run.sh (lines 8758-8770). They're nearly identical but already differ in path construction (one uses os.path.join with an env var, the other uses a hardcoded relative path). Consider extracting to a shared script like autonomy/openspec-purge-queue.py that both locations call.

Suggestions (non-blocking)

rm -rf "$LOKI_DIR/openspec/" when --openspec is absent nukes all adapter output (normalized PRD, design docs, scenarios). If a user runs a quick non-openspec task then comes back to the same change, everything must be regenerated. The sentinel deletion already prevents stale task loading; consider keeping the adapter output as a cache.
The file=__import__('sys').stderr pattern in inline Python is clever but hard to read. A plain import sys; print(..., file=sys.stderr) is clearer.
The PR mentions BMAD and MiroFish have the same sentinel pattern. Worth filing follow-up issues so they don't get forgotten.

Overall this is solid work on a high-severity bug. Looking forward to the next revision. Thanks again for the contribution!

…e-wide purge

Problem: The OpenSpec sentinel (.openspec-populated) was a boolean touch file that tracked whether tasks were loaded, but not which change or content. Switching between OpenSpec changes or editing tasks.md left stale tasks in the queue. Agents would silently execute the wrong plan.

Fix (three layers of defense):

1. Scoped sentinel with content hash -- sentinel stores change path (line 1) and md5 hash of openspec-tasks.json (line 2). Detects both change switches and tasks.md edits. Backward compatible with old single-line format (triggers safe reload on upgrade).
2. Queue-wide jq purge -- new purge_openspec_from_queue() function cleans all three queue files (pending, completed, in-progress) using jq. Replaces inline Python, removes python3 dependency for purge path. Errors surface via log_warn instead of being swallowed.
3. Scoped task IDs -- task IDs now include the change name (openspec-{change}-N.M) to prevent dedup collisions across changes.

Design decision: when --openspec is not passed, existing OpenSpec state is left untouched. Purge only triggers when --openspec IS passed with a different change path or different content hash.

Tests: 31 shell integration tests (test-openspec-sentinel.sh) covering all state transitions + 28 existing adapter unit tests passing.

Closes: asklokesh#150
Follow-ups filed: asklokesh#154 (BMAD sentinel), asklokesh#155 (MiroFish sentinel)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

vishnujayvel · 2026-04-08T23:18:52Z

Review Response (v2)

Thanks for the thorough review @asklokesh! All 4 requested changes addressed, rebased on v6.75.3, squashed to a single commit.

1. Purge now cleans all 3 queue files

New purge_openspec_from_queue() function (jq-based, no Python dependency) is called on pending.json, completed.json, and in-progress.json. No more ghost tasks in status reports or progress tracking.
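The PR implements this with jq in shell. Purely as a model of the behaviour, and assuming each queue file holds a JSON array of task objects with a `source` field (an assumption, since the file layout is not shown here), the queue-wide purge could be sketched in Python as follows. `purge_openspec_from_queues` is a hypothetical name.

```python
import json
import tempfile
from pathlib import Path

QUEUE_FILES = ("pending.json", "completed.json", "in-progress.json")


def purge_openspec_from_queues(queue_dir: Path) -> dict:
    """Remove source == 'openspec' tasks from every queue file that exists."""
    removed = {}
    for name in QUEUE_FILES:
        path = queue_dir / name
        if not path.exists():
            continue  # a missing queue file is not an error
        tasks = json.loads(path.read_text())  # malformed JSON raises, not swallowed
        kept = [t for t in tasks if t.get("source") != "openspec"]
        path.write_text(json.dumps(kept, indent=2))
        removed[name] = len(tasks) - len(kept)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    queue = Path(tmp)
    (queue / "pending.json").write_text(json.dumps([
        {"id": "openspec-A-1.1", "source": "openspec"},
        {"id": "prd-1", "source": "prd"},  # hypothetical non-OpenSpec task
    ]))
    (queue / "completed.json").write_text(json.dumps([
        {"id": "openspec-A-1.2", "source": "openspec"},
    ]))
    removed = purge_openspec_from_queues(queue)
    remaining = json.loads((queue / "pending.json").read_text())

print(removed, remaining)
```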

2. Content hash detects tasks.md edits

Sentinel now stores two lines: change path (line 1) and md5 hash of openspec-tasks.json (line 2). Editing tasks.md and re-running with the same --openspec path triggers automatic purge + reload.

Cross-platform: md5sum (Linux) / md5 -q (macOS) / "none" fallback
Backward compatible: old single-line sentinels trigger a safe reload (empty hash != any real hash)
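A rough model of the two-line sentinel check, written in Python rather than the shell used in run.sh. The helper names are hypothetical, and the `"none"` fallback for systems without an md5 tool is not modelled here.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5(path: Path) -> str:
    return hashlib.md5(path.read_bytes()).hexdigest()


def write_sentinel(sentinel: Path, change_path: str, tasks_file: Path) -> None:
    """Line 1: change path. Line 2: md5 of openspec-tasks.json."""
    sentinel.write_text(f"{change_path}\n{file_md5(tasks_file)}\n")


def needs_reload(sentinel: Path, change_path: str, tasks_file: Path) -> bool:
    if not sentinel.exists():
        return True
    lines = sentinel.read_text().splitlines()
    stored_path = lines[0] if lines else ""
    stored_hash = lines[1] if len(lines) > 1 else ""  # legacy single-line sentinel
    return stored_path != change_path or stored_hash != file_md5(tasks_file)


with tempfile.TemporaryDirectory() as tmp:
    tasks = Path(tmp) / "openspec-tasks.json"
    sentinel = Path(tmp) / ".openspec-populated"
    tasks.write_text('[{"id": "openspec-A-1.1"}]')
    write_sentinel(sentinel, "changes/A", tasks)
    unchanged = needs_reload(sentinel, "changes/A", tasks)

    tasks.write_text('[{"id": "openspec-A-1.1"}, {"id": "openspec-A-1.2"}]')
    edited = needs_reload(sentinel, "changes/A", tasks)

    sentinel.write_text("changes/A\n")  # old single-line format
    legacy = needs_reload(sentinel, "changes/A", tasks)

print(unchanged, edited, legacy)
```

An empty stored hash can never equal a real digest, which is why the legacy format falls through to a reload.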

3. No more silent error swallowing

Replaced all inline Python purge blocks with the jq-based shell function. jq stderr goes to a separate .err file (not mixed with JSON output). Errors surface via log_warn. No 2>/dev/null on purge paths.

4. Single purge function, no duplication

purge_openspec_from_queue() defined once in run.sh. The CLI cleanup block in autonomy/loki was removed entirely.

Design decision: When --openspec is not passed, existing state is left untouched. populate_openspec_queue() has an early-return guard for empty OPENSPEC_CHANGE_PATH. Not passing a flag does not imply "undo previous work."

Non-blocking suggestions addressed

| Suggestion | Response |
| --- | --- |
| Keep adapter output as cache | Addressed -- no-openspec runs leave all state untouched |
| `__import__('sys')` readability | Moot -- inline Python removed, replaced with jq |
| File BMAD/MiroFish issues | Done: #154 (BMAD), #155 (MiroFish) |

Additional improvements (beyond requested)

Documentation: Updated skills/openspec-integration.md with a Queue State Management section (sentinel format, state transitions table, purge behavior) and corrected the complexity thresholds (now documented as 21+/11+, matching the > 20 and > 10 checks in code)
Shell integration tests: Added tests/test-openspec-sentinel.sh (36 tests, 13 scenarios) following the existing test-task-queue.sh pattern

Test coverage

28 unit tests (tests/test_openspec_adapter.py) + 36 integration tests (tests/test-openspec-sentinel.sh) = 64 total, all passing

Scenarios covered:

| # | Scenario | Tests |
| --- | --- | --- |
| 1 | Fresh run (no sentinel) | 3 |
| 2 | Crash-restart (path + hash match) | 2 |
| 3 | Change switch (purge all 3 queues) | 9 |
| 4 | Content edit (hash mismatch) | 1 |
| 5 | No --openspec (don't touch) | 3 |
| 6 | Direct run.sh (bypass CLI) | 1 |
| 7 | Legacy sentinel (backward compat) | 2 |
| 8 | Malformed JSON error handling | 2 |
| 9 | Empty queue files | 3 |
| 10 | Nonexistent queue file | 3 |
| 11 | Task ID scoping | 1 |
| 12 | Mixed-source preservation | 3 |
| 13 | Empty OPENSPEC_CHANGE_PATH | 5 |

vishnujayvel requested a review from asklokesh as a code owner April 6, 2026 21:15

vishnujayvel force-pushed the fix/openspec-stale-queue branch from b28c63f to 53ff1d7 Compare April 6, 2026 21:22

vishnujayvel force-pushed the fix/openspec-stale-queue branch from 16669f1 to 5029fa6 Compare April 8, 2026 23:18

vishnujayvel wants to merge 1 commit into asklokesh:main from vishnujayvel:fix/openspec-stale-queue
