rocm/atom-dev:latest crashes on MiniMax-M3 MXFP4: AIter API mismatch in fused_qknorm_idxrqknorm

## Summary

`rocm/atom-dev:latest` (built 2026-06-24 15:36 UTC) fails to start when serving `amd/MiniMax-M3-MXFP4`. The AIter library in the image now splits the `kv_cache` argument of `fused_qknorm_idxrqknorm` into separate `kv_cache_k`/`kv_cache_v` arguments, but `atom/models/minimax_m3.py` (at ATOM HEAD `ab9eb781`) still calls it with the old single-tensor API. The mismatch raises a `RuntimeError` (an argument type error) during model warmup.

## Repro

```bash
docker pull rocm/atom-dev:latest

docker run --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video --ipc=host \
  --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
  --network=host \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env HF_TOKEN="$HF_TOKEN" \
  --env AITER_QUICK_REDUCE_QUANTIZATION=INT4 \
  --env ATOM_FORCE_ATTN_TRITON=1 \
  --env TORCHDYNAMO_DISABLE=1 \
  rocm/atom-dev:latest \
  python -m atom.entrypoints.openai_server \
    --model amd/MiniMax-M3-MXFP4 \
    --tensor-parallel-size 4 \
    --server-port 8000 \
    --trust-remote-code \
    --gpu-memory-utilization 0.8 \
    --block-size 128 \
    --max-model-len 32768 \
    --max-num-seqs 128 \
    --max-num-batched-tokens 32768 \
    --no-enable_prefix_caching \
    --enforce-eager
```

## Error

```
RuntimeError: aiter::_fused_qknorm_idxrqknorm_hip() Expected a value of type 'Optional[Tensor]' for argument 'index_cache' but instead found type 'int'.
Cast error details: Unable to cast 0 to Tensor
```

Traceback (abridged):
```
File "/app/ATOM/atom/model_engine/model_runner.py", line 1158, in warmup_model
    output = self.compiled_callable(*args, **kwargs)
File "/app/ATOM/atom/models/minimax_m3.py", line 418, in forward
...
RuntimeError: aiter::_fused_qknorm_idxrqknorm_hip() Expected a value of type 'Optional[Tensor]' for argument 'index_cache' but instead found type 'int'.
```

## Root Cause

**`atom/models/minimax_m3.py`** (around line 717) calls `aiter.fused_qknorm_idxrqknorm` with the **old** positional API:

```python
aiter.fused_qknorm_idxrqknorm(
    ...
    sparse_metadata.slot_mapping,  # slot_mapping
    self.kv_cache,                 # OLD arg 14: kv_cache (full [blocks,2,...] tensor)
    self.index_cache,              # OLD arg 15: index_cache
    self.kv_cache.shape[2],        # OLD arg 16: block_size (int)
    q,                             # OLD arg 17: q_out
    ...
)
```

But **`aiter/ops/fused_qknorm_idxrqknorm.py`** in the same image now has the **new** signature with a split KV cache:

```python
def fused_qknorm_idxrqknorm(
    ...
    kv_cache_k: Optional[Tensor],  # NEW arg 14
    kv_cache_v: Optional[Tensor],  # NEW arg 15 — inserted here
    index_cache: Optional[Tensor], # NEW arg 16
    block_size: int,               # NEW arg 17
    q_out: Optional[Tensor],       # NEW arg 18
    ...
)
```

Inserting `kv_cache_v` at position 15 shifts every later parameter by one relative to the old call. The old call's `self.index_cache` now binds to `kv_cache_v`, and `self.kv_cache.shape[2]` (an `int`) binds to `index_cache: Optional[Tensor]`, which raises the type error.
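The misalignment can be reproduced without AIter or a GPU. The sketch below is only an illustration: `new_signature_tail` is a hypothetical stand-in that copies the parameter names listed above, and the strings and the value `128` are placeholders for what the old call site passes. It binds the old positional arguments to the new parameter order and reports where each one lands.

```python
import inspect
from typing import Any, Optional


# Hypothetical stand-in for the tail of the new signature.
# This is NOT the real AIter function; only the parameter names
# are taken from the signature quoted in this issue.
def new_signature_tail(
    slot_mapping: Any,
    kv_cache_k: Optional[Any],
    kv_cache_v: Optional[Any],
    index_cache: Optional[Any],
    block_size: int,
    q_out: Optional[Any] = None,
) -> None:
    ...


# Placeholders for the old call's arguments, in the old order.
old_call_args = [
    "slot_mapping",  # sparse_metadata.slot_mapping
    "kv_cache",      # self.kv_cache (full tensor)
    "index_cache",   # self.index_cache
    128,             # self.kv_cache.shape[2], an int (placeholder value)
    "q",             # q
]

# Bind the old arguments positionally against the new parameter order.
bound = inspect.signature(new_signature_tail).bind(*old_call_args)
for name, value in bound.arguments.items():
    print(f"{name:12s} <- {value!r}")

# Parameters other than block_size that received the int.
misplaced = [
    name
    for name, value in bound.arguments.items()
    if isinstance(value, int) and name != "block_size"
]
print("int bound to:", misplaced)
```

In this sketch the `int` binds to `index_cache`, matching the parameter named in the error message. Passing these arguments by keyword at the call site, if the AIter wrapper accepts keywords, would likely turn a future signature change into an explicit missing-argument error instead of a silent shift.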

## Environment

- Image: `rocm/atom-dev:latest` (built 2026-06-24 15:36 UTC)
- ATOM HEAD in image: `ab9eb781`
- GPU: MI355 (4x)
- ROCm: 7.2.4
- Model: `amd/MiniMax-M3-MXFP4`

## Expected

Server starts and serves requests as documented in `recipes/MiniMax-M3.md` (added in PR #1305).
