Notes and scripts for AMD profiling of dycore #1047
Draft
iomaganaris wants to merge 51 commits into main
havogt
reviewed
Feb 6, 2026
havogt
reviewed
Feb 6, 2026
model/atmosphere/dycore/tests/dycore/integration_tests/test_benchmark_solve_nonhydro.py
Outdated
havogt
reviewed
Feb 6, 2026
model/atmosphere/dycore/tests/dycore/integration_tests/test_benchmark_solve_nonhydro.py
Outdated
havogt
reviewed
Feb 6, 2026
install_icon4py_uenv.sh
Outdated
    fi

    # Install icon4py, gt4py, DaCe and other basic dependencies using uv
    uv sync --extra all --python $(which python3.12)
Contributor
I would not install all the extras; instead, maybe we can properly add cupy-rocm7 as an extra to avoid line 29. I can work on that.
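A rough sketch of what selecting individual extras could look like, instead of `--extra all`. The extra names `dace` and `rocm7` are hypothetical placeholders (no such extra is confirmed in this PR), and the command is only printed, not executed. `uv sync` accepts `--extra` multiple times, which is what this relies on.

```shell
set -euo pipefail

# Hypothetical extra names for illustration only; the real names would be
# whatever icon4py's pyproject.toml defines once a ROCm extra exists.
EXTRAS=("dace" "rocm7")

# Assemble the uv command as an array so every argument stays one word.
# $1 is the Python interpreter, the remaining arguments are extras.
build_sync_cmd() {
    local python="$1"
    shift
    local cmd=(uv sync)
    local extra
    for extra in "$@"; do
        cmd+=(--extra "$extra")
    done
    cmd+=(--python "$python")
    echo "${cmd[*]}"
}

# Dry run: print the command instead of executing it.
build_sync_cmd python3.12 "${EXTRAS[@]}"
```

Keeping the list of extras in one array would make it easy to drop the separate CuPy install step once a dedicated extra exists.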
Co-authored-by: Till Ehrengruber <till.ehrengruber@cscs.ch>
…osure_vars to fix the caching of the dycore programs
havogt
reviewed
Feb 9, 2026
amd_scripts/benchmark_dycore.sh
Outdated
    --benchmark-warmup=on \
    --benchmark-warmup-iterations=30 \
    --backend=dace_gpu \
    --grid=icon_benchmark_regional \
Contributor
Suggested change
    - --grid=icon_benchmark_regional \
    + --grid=icon_benchmark_global \
Since global is our main target for now, maybe we can switch to that.
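One way to make the grid switchable without editing the script each time is to read it from an environment variable and default to the global grid. This is only a sketch: the `GRID` variable and the `benchmark_args` helper are made up for illustration, the four flags are the ones shown in the diff above, and the command is printed rather than run because the rest of the sbatch script is not visible here.

```shell
set -euo pipefail

# Default to the global grid; override with e.g.
#   GRID=icon_benchmark_regional sbatch amd_scripts/benchmark_dycore.sh
GRID="${GRID:-icon_benchmark_global}"

# Print the benchmark flags for the grid given as $1.
benchmark_args() {
    local grid="$1"
    local args=(
        --benchmark-warmup=on
        --benchmark-warmup-iterations=30
        --backend=dace_gpu
        "--grid=${grid}"
    )
    echo "${args[*]}"
}

# Dry run: show the flags that would be passed to the benchmark command.
echo "benchmark flags: $(benchmark_args "$GRID")"
```

With this layout the regional grid stays one environment variable away for comparison runs.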
Mandatory Tests
Please make sure you run these tests via comment before you merge!
Optional Tests
To run benchmarks you can use:
To run tests and benchmarks with the DaCe backend you can use:
To run test levels ignored by the default test suite (mostly simple datatest for static fields computations) you can use:
For more detailed information please look at CI in the EXCLAIM universe.
This Pull Request includes scripts to benchmark and profile the dycore granule as well as one of its most time-consuming GT4Py Programs, the vertically_implicit_solver_at_predictor_step. We'll keep this PR open for interaction and keep it up to date with improvements.

The PR includes the following important files:

- AMD_INTRODUCTION.md: Includes (hopefully) all the information necessary to run the benchmark scripts for the dycore granule and the vertically_implicit_solver_at_predictor_step, as well as an introduction to icon4py, GT4Py and DaCe. There are also some suggestions on how to view and understand the generated code.
- amd_scripts/install_icon4py_venv.sh: Script to install icon4py along with all the dependencies necessary to run the profilers.
- amd_scripts/benchmark_dycore.sh: Sbatch script for Beverin to run and time the GT4Py Programs of the dycore.
- amd_scripts/benchmark_solver.sh: Sbatch script for Beverin to benchmark and profile the vertically_implicit_solver_at_predictor_step. Profiling the kernels generated by this GT4Py Program is the most interesting topic, as improvements there should carry over to most of the other dycore GT4Py Programs as well.

Currently based on #1018, which points to GT4Py/main (which will become GT4Py v1.1.4 in the next week).