h200
Here are 5 public repositories matching this topic...
Thermal-aware batch controller for vLLM/TensorRT-LLM. Prevents HBM thermal throttling from killing p99 latency on H100/H200. Monitors nvidia-smi, auto-cuts batch size at 85°C, migrates cold KV to DRAM. Prometheus + Grafana included. 4.2s -> 2.1s p99 at 128K context.
Updated Apr 13, 2026 - Python
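The control loop that description implies (read temperatures, shrink the batch at 85°C) can be sketched roughly as below. This is a minimal illustration, not the repository's code: the function names, the halving policy, the one-step recovery, and the `min_batch`/`max_batch` bounds are all assumptions. The sketch reads the GPU core temperature through the `nvidia-smi` query interface; a controller aimed at HBM throttling would more likely need a memory-temperature reading. KV-cache migration is out of scope here.

```python
import subprocess

THROTTLE_C = 85  # threshold quoted in the repository description


def parse_temps(text):
    """Parse one integer temperature per line; skip non-numeric rows such as 'N/A'."""
    temps = []
    for line in text.splitlines():
        line = line.strip()
        if line.isdigit():
            temps.append(int(line))
    return temps


def read_gpu_temps():
    """Return one temperature (deg C) per GPU as reported by nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_temps(out)


def next_batch_size(current, temps, min_batch=1, max_batch=64):
    """Hypothetical policy: halve the batch when the hottest GPU is at or
    above the threshold, otherwise grow it by one request."""
    if temps and max(temps) >= THROTTLE_C:
        return max(min_batch, current // 2)
    return min(max_batch, current + 1)
```

A serving loop would call `read_gpu_temps()` every few seconds and pass the result to `next_batch_size` before admitting the next batch. Halving on overheat and growing by one afterwards is just one conservative choice.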
Monitor low-utilization time, idle-state episodes, and workload starvation signals on NVIDIA datacenter GPUs.
Updated Apr 3, 2026 - Python
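One plausible way to detect the idle-state episodes mentioned in that description is to scan a series of utilization samples for sustained low-utilization runs. The sketch below is an assumption-laden illustration, not the repository's method: the `idle_episodes` name, the 5% threshold, and the three-sample minimum are all hypothetical.

```python
def idle_episodes(samples, threshold=5, min_len=3):
    """Find idle episodes in (timestamp_s, util_percent) samples.

    An episode is a run of at least `min_len` consecutive samples whose
    utilization is at or below `threshold`. Returns (start_ts, end_ts) pairs.
    """
    episodes = []
    run = []  # timestamps of the current low-utilization run
    for ts, util in samples:
        if util <= threshold:
            run.append(ts)
        else:
            if len(run) >= min_len:
                episodes.append((run[0], run[-1]))
            run = []
    # Close out a run that extends to the end of the series.
    if len(run) >= min_len:
        episodes.append((run[0], run[-1]))
    return episodes
```

The samples could come from any periodic utilization reading. Requiring a minimum run length keeps brief gaps between kernels from being counted as idle time.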
dd-ready, fully packed FreeDOS disk image for crossflashing SAS2008-based controllers such as the Dell H200 to HBA mode
Updated May 9, 2023
⚡ TIMTEH Model Forge — Uncensored, abliterated & reasoning-distilled GGUFs. Forged on 8×H200 SXM5 | 1.1TB VRAM
Updated Mar 27, 2026 - Shell