Add 500 hPa geopotential height variable to GEFS datasets #490

Merged: aldenks merged 6 commits into main from claude/add-500hpa-geopotential-gefs-NawDq on Mar 6, 2026

aldenks commented Mar 6, 2026

Summary

This PR adds support for the 500 hPa geopotential height variable to both the GEFS forecast (35-day) and analysis datasets.

Key Changes

  • Added geopotential_height_500hpa variable configuration in common_gefs_template_config.py with:

    • GRIB element mapping: "HGT" at "500 mb" isobaric surface
    • Float32 encoding with 11 mantissa bits precision
    • Proper metadata attributes (long_name, units, standard_name, etc.)
    • Support for both surface analysis ("s+a") file types
  • Updated Zarr metadata templates for both datasets:

    • Forecast 35-day: Added array definition with 5D shape (init_time, ensemble_member, lead_time, latitude, longitude)
    • Analysis: Added array definition with 3D shape (time, latitude, longitude)
    • Both include sharding_indexed codec with zstd compression (clevel 3)
  • Extended GRIB level mapping in gefs_config_models.py to recognize "500 mb" as a pressure level identifier
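
The "11 mantissa bits" encoding listed above can be pictured as rounding each float32 value so that only the top 11 of its 23 mantissa bits carry information, which makes the stored bytes much more compressible. The following is a minimal sketch assuming round-to-nearest bit rounding; the helper name `round_mantissa_bits` is hypothetical, and this is not necessarily how the project implements it.

```python
import numpy as np

FLOAT32_MANTISSA_BITS = 23


def round_mantissa_bits(values, keep_bits):
    """Round float32 values so only the top `keep_bits` mantissa bits are kept.

    Hypothetical helper for illustration; not the project's own function.
    """
    if not 1 <= keep_bits <= FLOAT32_MANTISSA_BITS:
        raise ValueError("keep_bits must be between 1 and 23")
    arr = np.ascontiguousarray(values, dtype=np.float32)
    drop = FLOAT32_MANTISSA_BITS - keep_bits
    if drop == 0:
        return arr.copy()
    bits = arr.view(np.uint32)
    # Adding half of the dropped range before masking rounds to the nearest
    # representable value instead of truncating toward zero.
    half = np.uint32(1 << (drop - 1))
    mask = np.uint32((0xFFFFFFFF << drop) & 0xFFFFFFFF)
    rounded = ((bits + half) & mask).view(np.float32)
    # Leave NaN and infinity untouched.
    return np.where(np.isfinite(arr), rounded, arr)


heights_m = np.array([5500.0, 5523.4567, 5876.5432], dtype=np.float32)
rounded_heights_m = round_mantissa_bits(heights_m, keep_bits=11)
```

For values between 4096 and 8192, which covers typical 500 hPa heights in metres, 11 mantissa bits give a step of 2, so the relative rounding error stays at or below 2^-12.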

Implementation Details

  • The variable uses consistent encoding across both datasets (float32 with appropriate chunk sizes)
  • Compression strategy employs blosc with zstd codec for efficient storage
  • Metadata includes CF-compliant standard names and coordinate references
  • Index position 34 in the GEFS file structure for proper data extraction
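
To make the template change more concrete, here is a hedged sketch of what a Zarr v3 array definition for the 3D analysis variable might look like, written as a Python dict. The key names follow the Zarr v3 metadata layout; the shapes, chunk sizes, blosc shuffle setting, and attribute values are placeholders chosen for illustration and are not taken from the PR. Reading "sharding_indexed ... zstd (clevel 3)" together with "blosc with zstd codec" as a blosc inner codec is also an assumption.

```python
import json

# Placeholder shapes: not the dataset's real values.
shard_shape = [2880, 384, 384]
inner_chunk_shape = [1440, 32, 32]

analysis_array_metadata = {
    "zarr_format": 3,
    "node_type": "array",
    "shape": [8760, 721, 1440],  # assumed (time, latitude, longitude) sizes
    "data_type": "float32",
    "dimension_names": ["time", "latitude", "longitude"],
    # With sharding, the outer chunk grid describes whole shards.
    "chunk_grid": {"name": "regular", "configuration": {"chunk_shape": shard_shape}},
    "chunk_key_encoding": {"name": "default", "configuration": {"separator": "/"}},
    "fill_value": "NaN",
    "codecs": [
        {
            "name": "sharding_indexed",
            "configuration": {
                "chunk_shape": inner_chunk_shape,
                "codecs": [
                    {"name": "bytes", "configuration": {"endian": "little"}},
                    {
                        "name": "blosc",
                        "configuration": {
                            "cname": "zstd",
                            "clevel": 3,
                            "shuffle": "shuffle",  # assumed setting
                            "typesize": 4,
                            "blocksize": 0,
                        },
                    },
                ],
                "index_codecs": [
                    {"name": "bytes", "configuration": {"endian": "little"}},
                    {"name": "crc32c"},
                ],
                "index_location": "end",
            },
        }
    ],
    # The PR names these attribute keys; the values here are assumptions.
    "attributes": {
        "long_name": "Geopotential height",
        "standard_name": "geopotential_height",
        "units": "m",
    },
}

serialized = json.dumps(analysis_array_metadata, indent=2)
```

The forecast array would differ mainly in having five dimension names (init_time, ensemble_member, lead_time, latitude, longitude) and correspondingly five-element shapes.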

https://claude.ai/code/session_01LBd3sDwYwxRihWA2hLLnPj

aldenks linked an issue Mar 6, 2026 that may be closed by this pull request
6 tasks
claude added 3 commits March 6, 2026 04:11
- Add geopotential_height_500hpa data variable to shared GEFS template config
  with gefs_file_type="a" (a-files only; 500 hPa not in s-files), correct
  grib_description using Pascal units ('50000[Pa] ISBL="Isobaric surface"'),
  and index_position=31
- Add "500 mb": "pres_abv700mb" to GEFS_REFORECAST_LEVELS_SHORT so reforecast
  URLs resolve to the hgt_pres_abv700mb_* files (500 hPa is not in hgt_pres_*)
- Regenerate zarr templates for both datasets
- Replace single-variable slow tests with 4 comprehensive all-vars integration
  tests per dataset: reforecast, pre-v12, current early lead, current later
  lead (forecast only); uses source_groups() with full groups per test,
  skipping vars absent from the reforecast archive via FileNotFoundError

https://claude.ai/code/session_01LBd3sDwYwxRihWA2hLLnPj
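
The values in this commit message can be sketched as a small config entry plus the level lookup. This is only an illustration: `GefsVarSketch`, `isobaric_grib_description`, and `reforecast_file_prefix` are hypothetical names, the real config model has more fields, and only the one mapping entry named in the commit is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GefsVarSketch:
    """Hypothetical stand-in for the project's variable config model."""

    name: str
    grib_element: str
    grib_description: str
    index_position: int
    gefs_file_type: str


def isobaric_grib_description(level_hpa: int) -> str:
    """Build the Pascal-based level string: 1 hPa = 100 Pa, so 500 hPa -> 50000[Pa]."""
    return f'{level_hpa * 100}[Pa] ISBL="Isobaric surface"'


# Only the entry named in the commit; the real mapping contains other levels.
GEFS_REFORECAST_LEVELS_SHORT = {"500 mb": "pres_abv700mb"}


def reforecast_file_prefix(grib_element: str, grib_index_level: str) -> str:
    """Hypothetical helper: file name prefix such as 'hgt_pres_abv700mb_'."""
    level_short = GEFS_REFORECAST_LEVELS_SHORT[grib_index_level]
    return f"{grib_element.lower()}_{level_short}_"


geopotential_height_500hpa = GefsVarSketch(
    name="geopotential_height_500hpa",
    grib_element="HGT",
    grib_description=isobaric_grib_description(500),
    index_position=31,
    gefs_file_type="a",  # a-files only; 500 hPa is not in s-files
)

prefix = reforecast_file_prefix(geopotential_height_500hpa.grib_element, "500 mb")
```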
Some variables aren't present in pre-v12 or reforecast GRIB files (e.g.
cloud ceiling HGT, PRMSL). Catch FileNotFoundError, ValueError, and
AssertionError to skip missing vars rather than failing the test.

https://claude.ai/code/session_01LBd3sDwYwxRihWA2hLLnPj
read_rasterio: reforecast pressure-level files (e.g. hgt_pres_abv700mb)
are 0.5deg while surface files are 0.25deg. When the reforecast file
resolution doesn't match the output grid, reproject it the same way as
a/b-files instead of asserting equal shape. This fixes
geopotential_height_500hpa in the GEFSv12 reforecast period (2000-2019).

Tests: replace broad exception catches with explicit frozenset allow-lists
(_REFORECAST_MISSING_VARS, _PRE_V12_MISSING_VARS) so unexpected failures
are caught rather than silently skipped.

https://claude.ai/code/session_01LBd3sDwYwxRihWA2hLLnPj
- DRY read_data.py: merge "a"|"b"|"reforecast" into one case with a
  single reproject block; separate "s" case with assert reader.shape
  == out_spatial_shape restored
- Remove reforecast and pre-v12 forecast download/read tests (dataset
  starts 2020-10-01 with GEFSv12)
- Consolidate current early/later lead forecast tests into one
  parametrized test with ThreadPoolExecutor over source groups
- Add ThreadPoolExecutor parallelism to all analysis download/read tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aldenks merged commit 715b122 into main Mar 6, 2026
5 checks passed
aldenks deleted the claude/add-500hpa-geopotential-gefs-NawDq branch March 6, 2026 16:13
Linked issue: Add 500hPa geopotential to GEFS datasets