Commit 9d0b328
Enable MiniMax-M3 MXFP4 (AttnFP8) on top of the BF16 M3 support
The in-tree MiniMax-M3 model already covers the BF16 checkpoint. This
adds the small pieces the quantized amd/MiniMax-M3-MXFP4-AttnFP8 build
needs, without disturbing the BF16 path.
- config.py: register the minimax_m3_vl multimodal wrapper and parse its
text sub-config (which declares no model_type) with the base
PretrainedConfig so every field is retained and no deepseek/MLA
defaults leak in; stamp model_type=minimax_m3 from the top-level type.
The quark quantization_config (already propagated from the root) and
the original architectures are preserved, so loading resolves to the
existing MiniMaxM3Sparse model. The BF16 checkpoint keeps its direct
minimax_m3 model_type and is unaffected.
- linear.py: pad the MXFP4 Linear contraction dim to 256. The a4w4 asm
GEMM reads K in 256-wide tiles, so an unaligned K (M3's shared-expert
down_proj at TP=8, K=384) faults on GPU. LinearBase._pad_mxfp4_input_dim()
zero-pads the fp4x2 weight, its e8m0 scale, and the activation up to
256-alignment; no-op when already aligned.
1 parent fc4d766 commit 9d0b328
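The linear.py padding described in the commit message can be illustrated with a minimal sketch. It is not the code from LinearBase._pad_mxfp4_input_dim(): it works on plain unpacked Python floats and ignores the fp4x2 packing and e8m0 scales. The helper names (padded_k, zero_pad_last_dim, matmul_nt) are hypothetical. It only shows the rounding arithmetic and why zero-padding K leaves the GEMM result unchanged.

```python
ALIGN = 256  # K tile width stated in the commit message


def padded_k(k: int, align: int = ALIGN) -> int:
    """Round k up to the next multiple of align."""
    return (k + align - 1) // align * align


def zero_pad_last_dim(rows, align: int = ALIGN):
    """Return a copy of a 2-D list with each row zero-padded to an aligned length."""
    if not rows:
        return []
    k = len(rows[0])
    pad = padded_k(k, align) - k
    if pad == 0:
        return [list(r) for r in rows]  # already aligned: nothing to add
    return [list(r) + [0.0] * pad for r in rows]


def matmul_nt(x, w):
    """Compute x @ w.T for x of shape (M, K) and w of shape (N, K)."""
    return [[sum(a * b for a, b in zip(xr, wr)) for wr in w] for xr in x]


# K=384 is the unaligned shared-expert down_proj case named in the message.
K = 384
x = [[float(i % 7) for i in range(K)]]                      # activation, (1, K)
w = [[float((i * 3) % 5) for i in range(K)] for _ in range(2)]  # weight, (2, K)

x_pad = zero_pad_last_dim(x)
w_pad = zero_pad_last_dim(w)

# The padded columns contribute 0 * 0 to every dot product,
# so the output matches the unpadded GEMM.
same_result = matmul_nt(x_pad, w_pad) == matmul_nt(x, w)
```

Here K=384 rounds up to 512, so both operands gain 128 zero columns, while K=256 or K=512 would pass through unchanged.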
2 files changed
Lines changed: 61 additions & 4 deletions
| Original file lines | New file lines | Lines added | Lines removed |
|---|---|---|---|
| 586-591 | 586-592 | 1 | 0 |
| 630-639 | 631-650 | 14 | 4 |
| Original file lines | New file lines | Lines added | Lines removed |
|---|---|---|---|
| 542-547 | 542-587 | 40 | 0 |
| 580-585 | 620-626 | 1 | 0 |
| 591-596 | 632-639 | 2 | 0 |
| 688-693 | 731-739 | 3 | 0 |