Update gt4py: support for literal precision #192

romanc · 2025-08-08T06:54:05Z

Description

This PR updates gt4py to bring support for specifying int/float literal precision. This is important in the context of "mixed precision work" and will unblock PRs like NOAA-GFDL/PyFV3#32.

The default is to map unspecified integer and floating point variables to 64-bit precision. This can be changed by specifying NDSL_LITERAL_PRECISION=32, which will map integers and floats to 32-bit precision.
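As a minimal sketch of the behavior described above, the lookup could be pictured as follows. The helper name `resolve_literal_precision` is hypothetical and this is not NDSL's actual implementation; rejecting values other than 32 and 64 is also an assumption made for illustration.

```python
import os

# Assumption: only the two precisions mentioned in this PR are accepted.
SUPPORTED_PRECISIONS = (32, 64)


def resolve_literal_precision(env=None):
    """Return the literal precision in bits (hypothetical helper, not NDSL code)."""
    env = os.environ if env is None else env
    # 64-bit is the default described in the PR text.
    raw = env.get("NDSL_LITERAL_PRECISION", "64")
    try:
        bits = int(raw)
    except ValueError:
        raise ValueError(
            f"NDSL_LITERAL_PRECISION must be an integer, got {raw!r}"
        ) from None
    if bits not in SUPPORTED_PRECISIONS:
        raise ValueError(f"NDSL_LITERAL_PRECISION must be 32 or 64, got {bits}")
    return bits


if __name__ == "__main__":
    print(resolve_literal_precision())
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to try out without touching the real environment.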

For "hard-coding" certain calculations to a fixed precision, one can use int32(), int64(), float32(), float64() cast operators and/or int32, int64, float32, float64 type definitions (for temporaries).
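To illustrate why pinning a value to a fixed precision matters in mixed precision work, here is a pure-Python sketch that emulates 32-bit rounding with the standard `struct` module. The helper `as_float32` is hypothetical and only mimics the numerical effect of a cast; it is not the `float32()` cast operator exposed by this PR.

```python
import struct


def as_float32(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    # Packing with the "f" format stores the value as an IEEE 754 single.
    return struct.unpack("f", struct.pack("f", value))[0]


literal = 0.1  # Python floats are 64-bit
rounded = as_float32(literal)

# 0.1 is not exactly representable, so the two precisions disagree slightly.
print(literal == rounded)
print(abs(literal - rounded))

# 0.5 is a power of two and survives the round trip unchanged.
print(as_float32(0.5) == 0.5)
```

The small difference for `0.1` is the kind of discrepancy that can accumulate when the same literal is interpreted at different precisions in different parts of a calculation.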

How Has This Been Tested?

All good as long as CI is still green.

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas: N/A
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published in downstream modules
New check tests, if applicable, are included: N/A

twicki

Looks good

romanc · 2025-08-11T15:32:56Z

@bensonr if you want to have a look at the newly added documentation here.

romanc · 2025-08-11T16:16:26Z

Also, please double check that the default makes sense. E.g. when running in 64-bit mode, is it okay/good to (also) use 64-bit integers, or would you expect integers to be still 32-bit even if floats are 64-bit?

Please disregard the tmq file in the docs. We used it for testing the merge queue and it didn't work well, so the file ended up in develop 🙈 The file will be removed with PR #198.

bensonr · 2025-08-11T16:17:03Z

docs/quickstart.md

+
+!!! note "Supported compilers"
+
+    NDSL currently only works with the GNU compiler. Using `clang` will result in errors related to undefined OpenMP flags.


@fmalatino - there was work to allow the use of Intel, did that not make it into the release streams?

It appears not. Was this from Xingqiu's work?

Yes, that's it.

I will track it down and make a subsequent PR or suggest the changes here.

@romanc here is a build script that Xingqiu wrote for our post-processing/analysis machine, which sets the flags for using the 2021.3.0 Intel compilers. I am not sure if it makes sense to amend this portion of the docs to indicate that a build and installation is possible with these compilers as well, or if we should hold off, test with what is more current, and make a subsequent PR.

Let's make a follow-up PR for (Intel) compiler support. Let's think a bit about where this might end up in the docs. Here we are in the quickstart section. That's not the place to be super technical. Imo, if we have anything other than "we know this works", then it should be a discussion on a dedicated page that we can just link to from here.

bensonr · 2025-08-11T16:20:55Z

docs/user/index.md

+
+`Python`
+
+:   The default. Disables full program optimization and only accelerates stencil code.


Is the acceleration under this option via numpy or gt4py only (reading this as a complete newbie to NDSL)?

Okay, so this is a bit out of scope for what we are doing here. But since we are here, let me give you the high-level overview. Basically, NDSL has two major modes of how it can run.

In the default mode, every with computation(): block is analyzed and optimized. Depending on the backend, this could mean that the with computation(): block is running in C++ (e.g. in the original GridTools backends or in the dace:cpu backend). For dace:gpu, for example, we run that part of the code on the GPU. If users opt for the numpy backend, we "rewrite" that block of code with building blocks from numpy.

Now, numerical weather prediction (NWP) codes are kind of "fragmented" with many with computation() blocks. That's why NDSL can go a step further than plain GT4Py. In NDSL, we can leverage full program optimization or what we internally call "orchestration". Orchestration will not only analyze and optimize with computation(): blocks but also everything in between. More importantly, we can - in that mode - analyze two (subsequent) with computation(): blocks and decide to merge them into one if that makes sense and is allowed from a semantic point of view.

Our working hypothesis is that this second mode is much more potent for getting to portable performance because it allows us to do large-scale changes to code. However, full program optimization doesn't just magically work out of the box. It only works with the dace:* backends and it might need changes to the science code too.
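The merging of two subsequent blocks mentioned above can be pictured as classic loop fusion. The following is a conceptual sketch in plain Python with hypothetical function names; it does not use GT4Py or DaCe and says nothing about how orchestration is actually implemented.

```python
def two_passes(a, b):
    """Two separate passes, standing in for two independent computation blocks."""
    # First pass writes a full temporary list.
    tmp = [x + y for x, y in zip(a, b)]
    # Second pass reads the temporary back in.
    return [2.0 * t for t in tmp]


def fused(a, b):
    """One merged pass: same result, no intermediate list."""
    return [2.0 * (x + y) for x, y in zip(a, b)]


a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
# The merge is only valid because both versions produce identical results.
assert two_passes(a, b) == fused(a, b)
```

The fused version avoids writing and re-reading the temporary, which is the kind of saving that becomes visible only when the optimizer sees more than one block at a time.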

Long story short: This part obviously needs rewording/rephrasing. I'm just reformatting existing docs here. I'd suggest doing this in a separate PR. Even the above write-up is probably too complicated for completely new users. Docs are currently very much work in progress and I expect sections to move around a lot until we settle on a first version that we think we can (automatically) deploy. Most likely, the whole section on changing defaults with environment variables doesn't belong on the index page of the user documentation 😉.

bensonr · 2025-08-11T16:26:07Z

> Also, please double check that the default makes sense. E.g. when running in 64-bit mode, is it okay/good to (also) use 64-bit integers, or would you expect integers to be still 32-bit even if floats are 64-bit?

I could make the argument that we generally don't need 64-bit integers and setting it as the default will increase memory, but by how much would need to be quantified.
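A back-of-the-envelope way to start quantifying this: for any integer data that ends up stored at the literal precision, the element width doubles. The sketch below uses a hypothetical helper `field_bytes` and a made-up grid shape; how much integer data a real model actually holds is not something this thread states.

```python
from math import prod


def field_bytes(shape, bits):
    """Bytes needed for one dense array of the given shape and element width."""
    return prod(shape) * bits // 8


# Hypothetical grid shape, chosen only for illustration.
shape = (192, 192, 79)

bytes_32 = field_bytes(shape, 32)
bytes_64 = field_bytes(shape, 64)

print(f"32-bit integers: {bytes_32 / 1e6:.1f} MB per array")
print(f"64-bit integers: {bytes_64 / 1e6:.1f} MB per array")
print(f"overhead factor: {bytes_64 / bytes_32:.1f}")
```

The overall impact depends on how many integer arrays and temporaries exist relative to floating point ones, which would need to be measured on real workloads.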

romanc · 2025-08-12T07:05:54Z

> Also, please double check that the default makes sense. E.g. when running in 64-bit mode, is it okay/good to (also) use 64-bit integers, or would you expect integers to be still 32-bit even if floats are 64-bit?

> I could make the argument that we generally don't need 64-bit integers and setting it as the default will increase memory, but by how much would need to be quantified.

Okay, can we agree it doesn't do any harm? We can always split NDSL_LITERAL_PRECISION into NDSL_LITERAL_INT_PRECISION and NDSL_LITERAL_FLOAT_PRECISION later if we find the memory overhead to be too high. I'd like to keep the number of environment variables as low as possible until we have an actual need for them.
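If such a split were ever needed, one possible shape is sketched below. Note that NDSL_LITERAL_INT_PRECISION and NDSL_LITERAL_FLOAT_PRECISION do not exist as of this PR; the helper `resolve_precisions` and the fallback order are assumptions made purely for illustration.

```python
import os


def resolve_precisions(env=None):
    """Return (int_bits, float_bits); hypothetical helper, not NDSL code."""
    env = os.environ if env is None else env
    # The existing variable stays the common default for both kinds of literal.
    general = env.get("NDSL_LITERAL_PRECISION", "64")
    # Hypothetical, more specific variables override the general one if set.
    int_bits = int(env.get("NDSL_LITERAL_INT_PRECISION", general))
    float_bits = int(env.get("NDSL_LITERAL_FLOAT_PRECISION", general))
    return int_bits, float_bits


if __name__ == "__main__":
    print(resolve_precisions())
```

Keeping the general variable as the fallback would mean existing setups continue to work unchanged if the specific variables were introduced later.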

twicki

Looks still good!

romanc requested a review from twicki August 8, 2025 06:55

romanc marked this pull request as ready for review August 8, 2025 07:11

This was referenced Aug 8, 2025

Deprecate PACE_* environment variables in favor of NDSL_* #193

Merged

ci: configure tests to run on merge-queue #194

Merged

twicki previously approved these changes Aug 8, 2025

romanc dismissed twicki’s stale review via bdeef7c August 11, 2025 07:29

romanc requested a review from twicki August 11, 2025 08:46

fmalatino requested a review from bensonr August 11, 2025 15:32

romanc force-pushed the romanc/update-gt4py-literal-precision branch from a9651f2 to 0d9ec28 Compare August 11, 2025 16:08

romanc requested a review from FlorianDeconinck August 11, 2025 16:16

bensonr reviewed Aug 11, 2025

romanc added 4 commits August 12, 2025 09:06

Update gt4py: support for literal precision

ec845a6

Actually forwarding literal precision to gt4py

4e5946c

Exposing type casts and new math functions

ed51877

Documentation update

6b57b59

romanc force-pushed the romanc/update-gt4py-literal-precision branch from 0d9ec28 to 6b57b59 Compare August 12, 2025 07:08

twicki approved these changes Aug 13, 2025

romanc added this pull request to the merge queue Aug 13, 2025

Merged via the queue into NOAA-GFDL:develop with commit 14b9e78 Aug 13, 2025
5 checks passed

romanc mentioned this pull request Aug 18, 2025

No support for mixed precision externals #195

Closed

romanc deleted the romanc/update-gt4py-literal-precision branch August 20, 2025 08:00

