| 1 | +# hws - Hardware Sampling for CPUs and GPUs |
| 2 | + |
| 3 | +The Hardware Sampling (hws) library can be used to track hardware performance metrics such as clock frequency, memory usage, temperature, and power draw. |
| 4 | +It currently supports CPUs as well as GPUs from NVIDIA, AMD, and Intel. |
| 5 | + |
| 6 | +## Getting Started |
| 7 | + |
| 8 | +### Dependencies |
| 9 | + |
| 10 | +General dependencies: |
| 11 | + |
| 12 | +- a C++20 capable compiler supporting `std::format` (tested with GCC 14.1.0) |
| 13 | +- [Pybind11 > v2.13.1](https://github.com/pybind/pybind11) if Python bindings are enabled (automatically built during the CMake configuration if it cannot be found using the respective `find_package` call) |
| 14 | + |
| 15 | +Dependencies based on the hardware to sample: |
| 16 | + |
| 17 | +- if a CPU should be targeted: at least one of [`turbostat`](https://www.linux.org/docs/man8/turbostat.html) (may require root privileges), [`lscpu`](https://man7.org/linux/man-pages/man1/lscpu.1.html), or [`free`](https://man7.org/linux/man-pages/man1/free.1.html), as well as the [`subprocess.h`](https://github.com/sheredom/subprocess.h) library (automatically built during the CMake configuration if it cannot be found using the respective `find_package` call) |
| 18 | +- if an NVIDIA GPU should be targeted: NVIDIA's Management Library [`NVML`](https://docs.nvidia.com/deploy/nvml-api/) |
| 19 | +- if an AMD GPU should be targeted: AMD's ROCm SMI library [`rocm_smi_lib`](https://rocm.docs.amd.com/projects/rocm_smi_lib/en/latest/doxygen/html/modules.html) |
| 20 | +- if an Intel GPU should be targeted: Intel's [`Level Zero library`](https://spec.oneapi.io/level-zero/latest/core/INTRO.html) |
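Before building with CPU support, it can help to check which of the listed command-line tools are present. The following is a minimal sketch and not part of hws: the helper names `find_cpu_tools` and `has_any_cpu_tool` are hypothetical, and the check only looks for the executables on `PATH`. It does not verify that `turbostat` has the privileges it may need.

```python
import shutil

# Command-line tools listed above for CPU sampling.
CPU_TOOLS = ("turbostat", "lscpu", "free")


def find_cpu_tools(tools=CPU_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def has_any_cpu_tool(found):
    """CPU sampling needs at least one of the listed tools."""
    return any(path is not None for path in found.values())


if __name__ == "__main__":
    found = find_cpu_tools()
    for tool, path in found.items():
        print(f"{tool}: {path or 'not found'}")
    print("CPU sampling prerequisites met:", has_any_cpu_tool(found))
```

The GPU dependencies are libraries rather than executables, so this check does not cover them.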
| 21 | + |
| 22 | +### Building hws |
| 23 | + |
| 24 | +To download the hardware sampling library, use: |
| 25 | + |
| 26 | +```bash |
| 27 | +git clone git@github.com:SC-SGS/hardware_sampling.git |
| 28 | +cd hardware_sampling |
| 29 | +``` |
| 30 | + |
| 31 | +Building the library can be done using the normal CMake approach: |
| 32 | + |
| 33 | +```bash |
| 34 | +mkdir build && cd build |
| 35 | +cmake -DCMAKE_BUILD_TYPE=Release [optional_options] .. |
| 36 | +cmake --build . -j |
| 37 | +``` |
| 38 | + |
| 39 | +#### Optional CMake Options |
| 40 | + |
| 41 | +The `[optional_options]` can be one or more of: |
| 42 | + |
| 43 | +- `HWS_ENABLE_ERROR_CHECKS=ON|OFF` (default: `OFF`): enable sanity checks during hardware sampling, may be problematic with smaller sample intervals |
| 44 | +- `HWS_SAMPLING_INTERVAL=100ms` (default: `100ms`): set the sampling interval in milliseconds |
| 45 | +- `HWS_ENABLE_PYTHON_BINDINGS=ON|OFF` (default: `ON`): enable Python bindings |
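For scripted builds, the configure step can be assembled programmatically. The sketch below is illustrative only: `cmake_configure_command` is a hypothetical helper, and the value `50ms` in the usage example assumes that intervals other than the default are written in the same `<number>ms` form.

```python
import shlex


def cmake_configure_command(error_checks=False, sampling_interval="100ms",
                            python_bindings=True, source_dir=".."):
    """Assemble the configure command with explicit values for each option."""
    def on_off(flag):
        return "ON" if flag else "OFF"

    return [
        "cmake",
        "-DCMAKE_BUILD_TYPE=Release",
        f"-DHWS_ENABLE_ERROR_CHECKS={on_off(error_checks)}",
        f"-DHWS_SAMPLING_INTERVAL={sampling_interval}",
        f"-DHWS_ENABLE_PYTHON_BINDINGS={on_off(python_bindings)}",
        source_dir,
    ]


if __name__ == "__main__":
    # Example: enable error checks and request a 50 ms interval (assumed format).
    print(shlex.join(cmake_configure_command(error_checks=True, sampling_interval="50ms")))
```

The returned list can be passed directly to `subprocess.run` from inside the build directory.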
| 46 | + |
| 47 | +### Installing |
| 48 | + |
| 49 | +The library supports the `install` target: |
| 50 | + |
| 51 | +```bash |
| 52 | +cmake --install . --prefix "/home/myuser/installdir" |
| 53 | +``` |
| 54 | + |
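| 55 | +Afterward, export the following environment variables, replacing `${CMAKE_INSTALL_PREFIX}` with the prefix passed to `cmake --install` (e.g., `/home/myuser/installdir`): |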
| 56 | + |
| 57 | +```bash |
| 58 | +export CMAKE_PREFIX_PATH=${CMAKE_INSTALL_PREFIX}/share/hardware_sampling/cmake:${CMAKE_PREFIX_PATH} |
| 59 | +export LD_LIBRARY_PATH=${CMAKE_INSTALL_PREFIX}/lib:${LD_LIBRARY_PATH} |
| 60 | +export CPLUS_INCLUDE_PATH=${CMAKE_INSTALL_PREFIX}/include:${CPLUS_INCLUDE_PATH} |
| 61 | +export PYTHONPATH=${CMAKE_INSTALL_PREFIX}/lib:${PYTHONPATH} |
| 62 | +``` |
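To confirm that the `PYTHONPATH` export took effect, one option is to ask Python whether it can locate the module without actually importing it. This is a small sketch: `module_is_importable` is a hypothetical helper, and the module name `HardwareSampling` is taken from the Python example in this README.

```python
import importlib.util


def module_is_importable(name):
    """Return True if Python can locate a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_is_importable("HardwareSampling"):
        print("HardwareSampling found")
    else:
        print("HardwareSampling not found; check that PYTHONPATH contains <install prefix>/lib")
```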
| 63 | + |
| 64 | +## Example Python usage |
| 65 | + |
| 66 | +```python |
| 67 | +import HardwareSampling |
| 68 | +import numpy as np |
| 69 | +import matplotlib.pyplot as plt |
| 70 | +import datetime |
| 71 | + |
| 72 | +sampler = HardwareSampling.CpuHardwareSampler() |
| 73 | +# could also be, e.g., |
| 74 | +# sampler = HardwareSampling.GpuNvidiaHardwareSampler() |
| 75 | +sampler.start() |
| 76 | + |
| 77 | +sampler.add_event("init") |
| 78 | +A = np.random.rand(2**14, 2**14) |
| 79 | +B = np.random.rand(2**14, 2**14) |
| 80 | + |
| 81 | +sampler.add_event("matmul") |
| 82 | +C = A @ B |
| 83 | + |
| 84 | +sampler.stop() |
| 85 | +sampler.dump_yaml("track.yaml") |
| 86 | + |
| 87 | +# plot the results |
| 88 | +time_points = sampler.time_points() |
| 89 | +relative_time_points = [(t - time_points[0]) / datetime.timedelta(milliseconds=1) for t in time_points] |
| 90 | + |
| 91 | +plt.plot(relative_time_points, sampler.clock_samples().get_average_frequency(), label="average") |
| 92 | +plt.plot(relative_time_points, sampler.clock_samples().get_average_non_idle_frequency(), label="average non-idle") |
| 93 | + |
| 94 | +axes = plt.gcf().axes[0] |
| 95 | +x_bounds = axes.get_xlim() |
| 96 | +for event in sampler.get_events()[1:-1]: |
| 97 | +    tp = (event.time_point - time_points[0]) / datetime.timedelta(milliseconds=1) |
| 98 | + |
| 99 | +    axes.axvline(x=tp, color='r') |
| 100 | +    axes.annotate(text=event.name, xy=(((tp - x_bounds[0]) / (x_bounds[1] - x_bounds[0])), 1.025), xycoords='axes fraction', rotation=270) |
| 101 | + |
| 102 | +plt.xlabel("runtime [ms]") |
| 103 | +plt.ylabel("clock frequency [MHz]") |
| 104 | +plt.legend() |
| 105 | +plt.show() |
| 106 | +``` |
| 107 | + |
| 108 | +<p align="center"> |
| 109 | + <img alt="example frequency plot" src=".figures/clock_frequency.png" width="50%"> |
| 110 | +</p> |
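If the measured code may raise an exception, it can be convenient to wrap the `start()`/`stop()` pair in a context manager so sampling always stops. The following is a sketch under stated assumptions, not an hws feature: `sampled` is a hypothetical helper that relies only on the `start()`, `stop()`, and `add_event(name)` methods used in the example above, and `RecordingSampler` is a stand-in so the snippet runs without hws installed.

```python
from contextlib import contextmanager


@contextmanager
def sampled(sampler, event_name=None):
    """Run the enclosed block while `sampler` is collecting samples.

    Assumes only that `sampler` offers start(), stop(), and add_event(name).
    """
    sampler.start()
    try:
        if event_name is not None:
            sampler.add_event(event_name)
        yield sampler
    finally:
        # Stop sampling even if the measured code raises.
        sampler.stop()


class RecordingSampler:
    """Stand-in that records calls; replace with a real hws sampler."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def add_event(self, name):
        self.calls.append(f"event:{name}")

    def stop(self):
        self.calls.append("stop")


if __name__ == "__main__":
    fake = RecordingSampler()
    with sampled(fake, "matmul"):
        total = sum(range(1000))  # placeholder workload
    print(fake.calls)
```

With a real sampler, the results would still be retrieved after the `with` block, for instance through `dump_yaml` as in the example above.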
| 111 | + |
| 112 | +## License |
| 113 | + |
| 114 | +The hws library is distributed under the [MIT license](https://github.com/SC-SGS/hardware_sampling/blob/main/LICENSE.md). |