xe: jit: gemm: TLB warmup #2631

petercad · 2025-02-07T22:21:03Z

New solution to the problem in #2607 / MFDNN-12523. This PR improves cold-TLB performance of specific TN GEMV compressed-weights kernels on MTL/ARL by adding an extra warmup workgroup whose only job is to probe every 64 KB page in the A/B matrices (along with the scales and zero points, if present). On a TLB miss, these probes initiate page walks that fill the STLB, so the other workgroups (doing the real work) do not incur page-walk latency later.

Shows 15-20% speedup on many cases of interest.
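
For readers unfamiliar with the technique, the page-probing idea can be illustrated with a small host-side C++ sketch. This is not the JIT kernel code from this PR: the names `probe_offsets` and `warm_up` and the `kPageBytes` constant are hypothetical, and the sketch only shows how one probe per 64 KB page could be laid out over a buffer that need not start on a page boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed page granularity, taken from the "64k page" wording in the description.
constexpr std::size_t kPageBytes = 64 * 1024;

// Hypothetical helper: byte offsets (relative to `base`) that hit each page
// overlapped by [base, base + bytes) exactly once.
std::vector<std::size_t> probe_offsets(std::uintptr_t base, std::size_t bytes,
                                       std::size_t page = kPageBytes) {
    assert(page != 0);
    std::vector<std::size_t> offsets;
    if (bytes == 0) return offsets;
    offsets.push_back(0);  // first (possibly partial) page
    // Offset of the first page boundary strictly after `base`.
    std::size_t next = page - (base % page);
    for (std::size_t off = next; off < bytes; off += page)
        offsets.push_back(off);
    return offsets;
}

// Hypothetical helper: read one byte per page so that each page is touched.
// The volatile qualifier keeps the compiler from dropping the loads.
unsigned warm_up(const volatile std::uint8_t *buf, std::size_t bytes) {
    unsigned sink = 0;
    auto base = reinterpret_cast<std::uintptr_t>(buf);
    for (std::size_t off : probe_offsets(base, bytes))
        sink += buf[off];
    return sink;
}
```

In the PR the probes are issued by a dedicated workgroup running alongside the compute workgroups; the sketch above models only the address pattern, not that concurrency.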

petercad · 2025-02-07T22:21:30Z

make test
disable test_device_cpu
disable build_cpu_runtime_omp
disable build_cpu_runtime_sycl
disable build_cpu_runtime_tbb
disable arch_gpu_xe-hpc
disable arch_gpu_xe-lp
disable arch_gpu_xe2-hpg-bmg
disable benchdnn_all
enable benchdnn_matmul

petercad · 2025-02-07T22:21:38Z

make test perf-gpu
set primitive=matmul
disable arch_gpu_xe-hpc
disable arch_gpu_xe-lp
disable arch_gpu_xe2-hpg-bmg
disable arch_gpu_xe2-lpg
disable arch_gpu_xe3-lpg

petercad · 2025-02-08T00:05:47Z

make test
disable test_device_cpu
disable build_cpu_runtime_omp
disable build_cpu_runtime_sycl
disable build_cpu_runtime_tbb
disable arch_gpu_xe-hpc
disable arch_gpu_xe-lp
disable arch_gpu_xe2-hpg-bmg
disable benchdnn_all
enable benchdnn_matmul

petercad · 2025-02-08T00:05:59Z

make test perf-gpu
set primitive=matmul
disable arch_gpu_xe-hpc
disable arch_gpu_xe-lp
disable arch_gpu_xe2-hpg-bmg
disable arch_gpu_xe2-lpg
disable arch_gpu_xe3-lpg

petercad requested a review from a team as a code owner February 7, 2025 22:21

github-actions bot added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Feb 7, 2025

petercad mentioned this pull request Feb 7, 2025

xe: jit: gemm: tile scrambling for improved cold-TLB performance #2607

Closed

petercad mentioned this pull request Feb 7, 2025

xe: jit: gemm: TLB warmup #2632

Open

petercad force-pushed the petercad/tlb_warmup branch from 61699e9 to 090c941 Compare February 7, 2025 22:38

petercad added 2 commits February 7, 2025 16:01

xe: jit: gemm: TLB warmup support

65fd254

xehpg: jit: gemm: use TLB warmup for dot kernels

f700f63

petercad force-pushed the petercad/tlb_warmup branch from 090c941 to f700f63 Compare February 8, 2025 00:01
