Open
Description
## Background

The llama-cpp-power8 project achieving 147 t/s on POWER8 is impressive! The AltiVec/VSX optimization work is clearly paying off.

## Proposed Comparison

Would be valuable to see a POWER8 vs POWER9 vs POWER10 performance comparison:

| System | Tokens/sec | Multiplier | Notes |
|--------|------------|------------|-------|
| POWER8 S824 | 147 t/s | 1.0x | Baseline |
| POWER9 AC922 | ??? | ?x | With VSX3 |
| POWER10 E1080 | ??? | ?x | With VSX4 |

## Why This Matters

1. Quantifies generational improvements - How much does each POWER gen gain?
2. Validates optimization strategy - Are VSX improvements linear?
3. Guides hardware recommendations - Best price/performance for LLM inference

## Test Setup Suggestion

- Same model (TinyLlama 1.1B Q4_K)
- Same batch size
- Same memory configuration
- Compare: stock llama.cpp vs POWER-optimized

This data would be valuable for the retro-computing + AI community!
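
To keep the Multiplier column consistent as numbers come in, here is a minimal sketch of how the table could be generated from measured throughput. The function name `speedup_table` and the `results` dictionary are hypothetical and not part of llama-cpp-power8. Only the 147 t/s POWER8 figure comes from this issue; the other entries are left as `None` until someone measures them.

```python
from typing import Optional

# Baseline system from the proposed comparison table.
BASELINE = "POWER8 S824"


def speedup_table(results: dict[str, Optional[float]], baseline: str = BASELINE) -> str:
    """Render a markdown table of tokens/sec and speedup relative to the baseline.

    Systems with no measurement yet (None) keep the ??? / ?x placeholders.
    """
    base = results.get(baseline)
    if not base:
        raise ValueError(f"baseline {baseline!r} needs a measured tokens/sec value")

    lines = [
        "| System | Tokens/sec | Multiplier |",
        "|--------|------------|------------|",
    ]
    for system, tps in results.items():
        if tps is None:
            lines.append(f"| {system} | ??? | ?x |")
        else:
            lines.append(f"| {system} | {tps:g} t/s | {tps / base:.2f}x |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Only the POWER8 number is known; the rest are unmeasured placeholders.
    results = {
        "POWER8 S824": 147.0,
        "POWER9 AC922": None,
        "POWER10 E1080": None,
    }
    print(speedup_table(results))
```

Each run in the test setup (stock llama.cpp vs POWER-optimized) could get its own `results` dictionary, so the two builds are compared against the same baseline.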