Conversation
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Autofix Details
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: Unguarded list access crashes on empty regex match
- Added a guard to return early when the regex match list is empty before indexing.
- ✅ Fixed: Indexer receives None for q_c in else branch
- When q_lora_rank is None, pass hidden_states as a fallback q_c to the indexer.
Or push these changes by commenting:
@cursor push 108af11ad1
Preview (108af11ad1)
diff --git a/src/prime_rl/inference/patches.py b/src/prime_rl/inference/patches.py
--- a/src/prime_rl/inference/patches.py
+++ b/src/prime_rl/inference/patches.py
@@ -28,7 +28,10 @@
     if num_hidden_layers is None:
         return False
-    layer_idx = int(_LAYER_INDEX_RE.findall(prefix)[-1])
+    matches = _LAYER_INDEX_RE.findall(prefix)
+    if not matches:
+        return False
+    layer_idx = int(matches[-1])
     if layer_idx >= num_hidden_layers:
         return False
@@ -97,7 +100,7 @@
             raise ValueError("IndexCache shared layers require cached top-k indices.")
         topk_indices = prev_topk_indices
     else:
-        topk_indices = self.indexer(hidden_states, q_c, positions, self.indexer_rope_emb)
+        topk_indices = self.indexer(hidden_states, q_c if q_c is not None else hidden_states, positions, self.indexer_rope_emb)
     if llama_4_scaling is not None:
         q *= llama_4_scaling

This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
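The guarded lookup in the first hunk can be exercised standalone. The sketch below is a hypothetical reconstruction — the real `_LAYER_INDEX_RE` pattern is not shown in the diff, so the regex here is an assumption — but it shows why the guard matters for module prefixes that contain no layer index.

```python
import re

# Assumed pattern: the actual _LAYER_INDEX_RE in patches.py is not shown
# in the diff; this one extracts the numeric index from prefixes like
# "model.layers.12.self_attn".
_LAYER_INDEX_RE = re.compile(r"layers\.(\d+)")

def layer_index(prefix: str):
    """Return the last layer index found in a module prefix, or None."""
    matches = _LAYER_INDEX_RE.findall(prefix)
    if not matches:
        # Guard from the fix: prefixes without a layer index (e.g. "lm_head")
        # previously crashed on matches[-1] with an IndexError.
        return None
    return int(matches[-1])

print(layer_index("model.layers.12.self_attn"))  # -> 12
print(layer_index("lm_head"))                    # -> None
```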
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Field(
    description="Override the loaded Hugging Face config's `index_topk_freq` for trainer and inference without editing the model directory.",
),
] = None
Missing CHANGELOG entry for new config field
Low Severity
This PR adds a new `index_topk_freq` config field to both `SharedModelConfig` in `configs/rl.py` and `BaseModelConfig` in `configs/shared.py`, but `CHANGELOG.md` has no corresponding entry. Per project rules, any PR that modifies configuration structures (fields added, removed, renamed, or moved, or default values changed) must update the changelog.
Additional Locations (1)
Triggered by project rule: BugBot Instructions
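For illustration, the override semantics of the new field can be sketched with a plain-dataclass stand-in. The real field lives in pydantic models in `configs/rl.py` and `configs/shared.py`, and the `resolve_index_topk_freq` helper below is hypothetical — this only shows the None-means-fallback behavior implied by the field's description.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BaseModelConfig:
    # Stand-in for the pydantic field: None means "keep the value from the
    # loaded Hugging Face config"; an int overrides it without editing the
    # model directory.
    index_topk_freq: Optional[int] = None

def resolve_index_topk_freq(cfg: BaseModelConfig, hf_value: Optional[int]) -> Optional[int]:
    """Hypothetical resolution rule: the config override wins when set."""
    return cfg.index_topk_freq if cfg.index_topk_freq is not None else hf_value

print(resolve_index_topk_freq(BaseModelConfig(), 64))                   # -> 64
print(resolve_index_topk_freq(BaseModelConfig(index_topk_freq=4), 64))  # -> 4
```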
Add support for Index Cache inference and training
Can use with
Note
Medium Risk
Touches model config propagation and monkey-patches vLLM/DeepSeek attention internals to reuse sparse top-k indices across layers, which could impact correctness/perf if assumptions change upstream. Also changes GLM MoE DSA attention forward signatures/returns to thread cached indices through the model.
Overview
Adds configurable Index Cache support via a new `index_topk_freq` model setting, allowing runtime overrides from config without editing HF model directories.

Plumbs `index_topk_freq` through the shared RL config, trainer model loading (overriding `AutoConfig`), and inference startup (exporting `PRIME_RL_INDEX_TOPK_FREQ`). In inference, introduces a vLLM monkeypatch that conditionally skips top-k recomputation on non-frequency layers by caching and reusing indices across DeepSeek/MLA layers.

Extends the custom `glm_moe_dsa` model to support the same cross-layer reuse: adds `index_topk_freq` to its config, marks layers to skip top-k, and threads `cached_topk_indices` through attention/layer/model forwards while returning the computed indices for reuse.

Written by Cursor Bugbot for commit 3191c5b. This will update automatically on new commits.
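The cross-layer reuse described above can be sketched in miniature. This is not the vLLM patch itself: the skip predicate, the string stand-in for indexer output, and the function names are illustrative assumptions. The control flow mirrors the described behavior — only every `index_topk_freq`-th layer recomputes top-k indices, and intervening layers reuse the cached result, raising if nothing has been cached yet.

```python
def skips_topk(layer_idx: int, index_topk_freq: int) -> bool:
    # Assumed marking rule: layers whose index is not a multiple of
    # index_topk_freq reuse cached indices instead of recomputing.
    return layer_idx % index_topk_freq != 0

def run_layers(num_layers: int, index_topk_freq: int) -> list:
    cached_topk_indices = None
    trace = []
    for idx in range(num_layers):
        if skips_topk(idx, index_topk_freq):
            # Reuse path: corresponds to the prev_topk_indices branch in the
            # diff, which errors when no indices have been cached yet.
            if cached_topk_indices is None:
                raise ValueError("IndexCache shared layers require cached top-k indices.")
            trace.append(f"layer {idx}: reuse {cached_topk_indices}")
        else:
            cached_topk_indices = f"indices@{idx}"  # stand-in for self.indexer(...)
            trace.append(f"layer {idx}: compute {cached_topk_indices}")
    return trace

for line in run_layers(6, 3):
    print(line)
```

With six layers and `index_topk_freq=3`, layers 0 and 3 compute indices while layers 1, 2, 4, and 5 reuse the most recently cached ones.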