Update gt4py: support for literal precision #192
twicki
left a comment
Looks good
@bensonr if you want to have a look at the newly added documentation here.
a9651f2 to 0d9ec28
Also, please double check that the default makes sense. E.g. when running in 64-bit mode, is it okay/good to (also) use 64-bit integers, or would you expect integers to be still 32-bit even if floats are 64-bit? Please disregard the |
!!! note "Supported compilers"
    NDSL currently only works with the GNU compiler. Using `clang` will result in errors related to undefined OpenMP flags.
@fmalatino - there was work to allow the use of Intel, did that not make it into the release streams?
It appears not. Was this from Xingqiu's work?
Yes, that's it.
I will track it down and make a subsequent PR or suggest the changes here.
@romanc here is a build script that Xingqiu wrote for our post-processing/analysis machine, which sets the flags for using the 2021.3.0 Intel compilers. I am not sure if it makes sense to amend this portion of the docs to indicate that a build and installation is possible with these compilers as well, or if we should hold off, test with what is more current, and make a subsequent PR.
Let's make a follow-up PR for (Intel) compiler support. Think a bit about where this might end up in the docs. Here we are in the quickstart section. That's not the place to be super technical. Imo, if we have anything other than "we know this works", then it should be a discussion on a dedicated page that we can just link to from here.
`Python`
: The default. Disables full program optimization and only accelerates stencil code.
Is the acceleration under this option via numpy or gt4py only (reading this as a complete newbie to NDSL)?
Okay, so this is a bit out of scope for what we are doing here. But since we are here, let me give you the high-level overview. Basically, NDSL has two major modes of how it can run.
In the default mode, every `with computation():` block is analyzed and optimized. Depending on the backend, this could mean that the `with computation():` block is running in C++ (e.g. in the original GridTools backends or in the `dace:cpu` backend). For `dace:gpu`, for example, we run that part of the code on the GPU. If users opt for the `numpy` backend, we "rewrite" that block of code with building blocks from numpy.
Now, numerical weather prediction (NWP) codes are kind of "fragmented" with many `with computation():` blocks. That's why NDSL can go a step further than plain GT4Py. In NDSL, we can leverage full program optimization or what we internally call "orchestration". Orchestration will not only analyze and optimize `with computation():` blocks but also everything in between. More importantly, in that mode we can analyze two (subsequent) `with computation():` blocks and decide to merge them into one if that makes sense and is allowed from a semantic point of view.
Our working hypothesis is that this second mode is much more potent for getting to portable performance because it allows us to do large-scale changes to code. However, full program optimization doesn't just magically work out of the box. It only works with the dace:* backends and it might need changes to the science code too.
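A rough way to picture the difference between the two modes, in plain Python rather than NDSL or GT4Py: the function names and the two toy "blocks" below are made up for illustration. In the default mode each block is handled on its own, so an intermediate field is written and read back. With orchestration, two subsequent blocks could be merged into one pass when that does not change the result.

```python
# Toy illustration only: plain Python lists stand in for fields, and each
# function stands in for one `with computation():` block. None of this is
# NDSL or GT4Py API.


def block_scale(field, factor):
    """First block: writes a full intermediate field."""
    return [factor * value for value in field]


def block_shift(field, offset):
    """Second block: reads the intermediate field again."""
    return [value + offset for value in field]


def run_separately(field, factor, offset):
    """Default mode, conceptually: each block runs on its own."""
    tmp = block_scale(field, factor)
    return block_shift(tmp, offset)


def run_merged(field, factor, offset):
    """Orchestration, conceptually: both blocks merged into one pass,
    so the intermediate field is never materialized."""
    return [factor * value + offset for value in field]


field = [0.0, 1.0, 2.0, 3.0]
separate = run_separately(field, 2.0, 1.0)
merged = run_merged(field, 2.0, 1.0)

# Merging is only acceptable because it leaves the result unchanged.
assert separate == merged
```

The real analysis is of course much more involved (dependencies between blocks, halo regions, backend constraints); the sketch only shows why merging can save a pass over memory.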
Long story short: This part obviously needs rewording/rephrasing. I'm just reformatting existing docs here. I'd suggest doing this in a separate PR. Even the above write-up is probably too complicated for completely new users. Docs are currently very much work in progress and I expect sections to move around a lot until we settle on a first version that we think we can (automatically) deploy. Most likely, the whole section on changing defaults with environment variables doesn't belong on the index page of the user documentation 😉.
I could make the argument that we generally don't need 64-bit integers and setting it as the default will increase memory, but by how much would need to be quantified.
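As a back-of-the-envelope starting point for that quantification, here is a small sketch using only the standard library. The grid dimensions are made-up numbers, not an NDSL configuration; the only firm point is that a 64-bit integer field takes twice the memory of a 32-bit one.

```python
import ctypes

# Hypothetical grid size, chosen only for illustration.
nx, ny, nz = 192, 192, 79
points = nx * ny * nz

# Size of one field of integers at each precision, in bytes.
bytes_int32 = points * ctypes.sizeof(ctypes.c_int32)
bytes_int64 = points * ctypes.sizeof(ctypes.c_int64)

print(f"int32 field: {bytes_int32 / 2**20:.1f} MiB")
print(f"int64 field: {bytes_int64 / 2**20:.1f} MiB")
```

How many integer fields a model actually holds, compared to floating point fields, is the part that would still need to be measured.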
Okay, can we agree it doesn't do any harm? We can always split |
0d9ec28 to 6b57b59
twicki
left a comment
Looks still good!
Description
This PR updates gt4py to bring support for specifying int/float literal precision. This is important in the context of "mixed precision work" and will unblock PRs like NOAA-GFDL/PyFV3#32.
The default is to map unspecified integer and floating point variables to 64-bit precision. This can be changed by specifying `NDSL_LITERAL_PRECISION=32`, which will map integers and floats to 32-bit precision.
For "hard-coding" certain calculations to a fixed precision, one can use `int32()`, `int64()`, `float32()`, `float64()` cast operators and/or `int32`, `int64`, `float32`, `float64` type definitions (for temporaries).
How Has This Been Tested?
All good as long as CI is still green.
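To make the effect of the precision setting from the description concrete, here is a minimal sketch using only the standard library. It is not how NDSL or gt4py implement the mapping: the helper names `literal_bits` and `round_to_precision` are hypothetical, and rejecting values other than 32 and 64 is an assumption. Only the variable name `NDSL_LITERAL_PRECISION` and its default of 64 come from this PR.

```python
import os
import struct


def literal_bits(env=os.environ):
    """Hypothetical helper: read NDSL_LITERAL_PRECISION, defaulting to 64."""
    bits = int(env.get("NDSL_LITERAL_PRECISION", "64"))
    if bits not in (32, 64):
        # Assumption: only 32 and 64 are meaningful values.
        raise ValueError(f"unsupported literal precision: {bits}")
    return bits


def round_to_precision(value, bits):
    """Round a Python float to the nearest 32- or 64-bit IEEE 754 value."""
    fmt = "<f" if bits == 32 else "<d"
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# The literal 0.1 is stored differently depending on the chosen precision.
as_64 = round_to_precision(0.1, literal_bits({}))
as_32 = round_to_precision(0.1, literal_bits({"NDSL_LITERAL_PRECISION": "32"}))

print(as_64 == 0.1)  # 64-bit: the Python float is unchanged
print(as_32 == 0.1)  # 32-bit: rounded to a nearby, different value
```

This is the kind of difference that matters for mixed precision work: a literal that is silently promoted to 64-bit can change the precision of the expression it appears in.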
Checklist: