YAML and Python script for the SEV ASR BUFR#55

Open
PraveenKumar-NOAA wants to merge 30 commits into develop from
feature/obs_builder_sevasr

Conversation

@PraveenKumar-NOAA
Contributor

@PraveenKumar-NOAA PraveenKumar-NOAA commented Jun 28, 2025

This PR introduces the following files for the SEV ASR BUFR converter:

~/dump/mapping/bufr_sevasr.yaml
~/dump/mapping/bufr_sevasr.py
~/ush/test/config/bufr_bufr4backend_sevasr.yaml
~/ush/test/config/bufr_script4backend_sevasr.yaml

Testing and Validation

All four workflows (bufr2netcdf, bufr4backend, script2netcdf, and script4backend) were executed and tested using serial execution.

Validation tests were completed for the following variables:

- satelliteZenithAngle
- cloudAmount

The validation was performed by comparing the input BUFR files and the corresponding output IODA files. Additional details can be found in #51.

Python Coding Standards

The spoc_python_coding_norms test was run to check compliance with Pycodestyle standards.
Result: All checks passed successfully (100% pass, 0 failures).

Variable Naming Validation

rmclaren and others added 30 commits February 25, 2025 09:40
…ow pycodestyle (#30)

**This PR updates two things:**
- Update mapping file and Python script for IASI, CrIS, and ATMS

   **Update the following for IASI:**
   1. Rename mtiasi to iasi
   2. Change logging to self.log.warning
   3. Add map_path function

   **Add new data type: cris-fsr**

   **Update the following for ATMS:**
   1. Change logging to self.log.warning
   2. Add map_path function

- Add Python coding norm check (pycodestyle) --- this is an initial
implementation so we have a coding norm check for Python scripts. It will
be updated to use pytest in another PR.
   To use it:
   - Load obsforge modules (~eliu/modules/env_obsforge.sh)
   - go to ./sorc/spoc/build
   - ctest -VV -R spoc_python_coding_norms

- Modify all Python scripts to follow pycodestyle (a ctest will be added in
another PR soon)
   (No change in science)
This PR adds two files for ASCAT:
- mapping YAML
- Python script (passes the Python coding norm check)

Notes:
- This PR should be tested with [bufr-query PR
#73](NOAA-EMC/bufr-query#73)
- A new PR will be opened to modify the standalone AMV converters
accordingly after this PR is merged.

ASCAT data processing has the following features:
- Convert wind speed and direction to u and v wind components
- Add obs type (290)
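The speed/direction to u/v conversion follows the standard meteorological convention (direction is the bearing the wind blows from). A minimal sketch; the function name is an assumption for illustration, not the converter's actual API:

```python
import numpy as np

def wind_to_uv(speed, direction_deg):
    # Meteorological convention: direction is where the wind blows FROM,
    # measured clockwise from north.
    rad = np.deg2rad(direction_deg)
    u = -speed * np.sin(rad)  # eastward component
    v = -speed * np.cos(rad)  # northward component
    return u, v
```

For example, a 10 m/s wind from 270 degrees (from the west) gives a purely eastward u component.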

Documentation is in Sphinx style 

Test: script2netcdf validated
Validation plots

IODA windEastward 


![gdas_ObsValue_windEastward_satwnd_ascat_metop-b](https://github.com/user-attachments/assets/3d71ed27-1d94-4df6-a3e1-a230dca6a9d1)

GSI windEastward


![gdas_ObsValue_windEastward_ascatw_ascat_metop-b](https://github.com/user-attachments/assets/35473373-4236-4031-985b-0c118529315c)


IODA windNorthward


![gdas_ObsValue_windNorthward_satwnd_ascat_metop-b](https://github.com/user-attachments/assets/a753cdfe-2bac-461f-a418-078ac0ceaa41)


GSI windNorthward

![gdas_ObsValue_windNorthward_ascatw_ascat_metop-b](https://github.com/user-attachments/assets/5e49b097-007a-46d7-a9c2-0c13274fd819)

---------

Co-authored-by: Emily Liu <eliu@hercules-login-4.hpc.msstate.edu>
Co-authored-by: Emily Liu <eliu@hercules-login-3.hpc.msstate.edu>
This PR adds two files for SSMIS
- mapping YAML
- Python script (passes the Python coding norm test)

Notes:
- This PR should be tested with [bufr-query PR
#74](NOAA-EMC/bufr-query#74)

SSMIS data processing has the following features:
- Spatial averaging (implemented in the bufr-query variable transform);
it can be activated from the YAML
- Add new variables via the obs builder Python API:
    - satellite ascending/descending orbit flag (1 for ascending and -1 for
descending)
    - solar zenith and azimuth angles (using pysolar)

Notes:
- Documentation is in Sphinx style (following discussion in Issue #35)
- SSMIS data conversion includes spatial averaging, so the processing can
only run with one MPI task.
- The solar angle calculation is done using pysolar. The functions used do
not support vectors, so the calculation of solar angles is implemented with
multiprocessing (e.g., one MPI task spawned with 12 CPUs):

   ```
             srun -n 1 --cpus-per-task=12 python bufr_ssmis.py
   ```
   or
   ```
             export CPUS_PER_TASKS=12
             python bufr_ssmis.py
   ```
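The multiprocessing pattern described above can be sketched as follows. This is a hedged illustration: `_solar_angles` is a hypothetical stand-in for the per-observation pysolar calls (`get_altitude`, `get_azimuth`), and the `CPUS_PER_TASKS` environment variable follows the usage example above.

```python
import os
from multiprocessing import Pool

def _solar_angles(point):
    # Hypothetical stand-in: the real worker would call pysolar's
    # get_altitude()/get_azimuth() for one (lat, lon, datetime) observation.
    lat, lon = point
    return (lat + lon) * 0.5  # dummy value for illustration

def compute_angles(points):
    # One MPI task spawns a pool of workers, sized from the environment
    ncpus = int(os.environ.get("CPUS_PER_TASKS", "1"))
    with Pool(processes=ncpus) as pool:
        return pool.map(_solar_angles, points)
```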
   
Test: script2netcdf validated
Plots for output netCDF from SSMIS converter   
 

![gdas_solarZenithAngle_ssmis_f17_channel_4](https://github.com/user-attachments/assets/8a90a7c6-9f69-435a-8d15-46b3aa6e698d)


![gdas_satelliteAscendingFlag_ssmis_f17_channel_4](https://github.com/user-attachments/assets/c583697f-7ce7-435c-969b-1ef953fcd090)


![gdas_ObsValue_ssmis_f17_channel_4](https://github.com/user-attachments/assets/74e80396-4865-4ffa-a18f-3bf822ff98d9)

---------

Co-authored-by: Emily Liu <eliu@hercules-login-3.hpc.msstate.edu>
Co-authored-by: Emily Liu <eliu@hercules-login-4.hpc.msstate.edu>
### This PR updates the YAML and Python scripts for SSMIS, ATMS, IASI, and
CrIS

For all four sensors:
1. Simplify the handling of missing categories using the
`add_dummy_variable` function
2. Use `map_path` from import.obs_builder for ATMS, CrIS, and IASI
3. Remove "type: netcdf" from the encoder section in the YAML
4. Update `globals` in the YAML
5. The output files were validated against the IODA validator with
ObsSpace.yaml; new variables were added to obsSpace.yaml. This validator
checks if the variables in the file follow IODA convention in terms of
name, unit, and dimension.

Here is an example of the output from running the ioda validator for
SSMIS output
```
Reading YAML from ../obsForge/sorc/ioda/share/ioda/yaml/validation/ObsSpace.yaml
Processing data file: ./testoutput/2024021900/script2netcdf/gdas.t00z.ssmis_f17.tm00.nc
 Verifying that all required groups exist
 Verifying group MetaData
 Verifying group ObsValue
 Verifying group /
 Verifying dimension names
 Verifying variable information
  Variable MetaData/earthSurfaceType
  Variable MetaData/sensorZenithAngle
  Variable MetaData/satelliteIdentifier
  Variable MetaData/sensorScanPosition
  Variable MetaData/longitude
  Variable MetaData/latitude
  Variable MetaData/sensorViewAngle
  Variable MetaData/scanLineNumber
  Variable MetaData/solarZenithAngle
  Variable MetaData/sensorChannelNumber
  Variable MetaData/satelliteAscendingFlag
  Variable MetaData/sensorAzimuthAngle
  Variable MetaData/rainFlag
  Variable MetaData/dateTime
  Variable MetaData/solarAzimuthAngle
  Variable ObsValue/brightnessTemperature
Final results:
  # errors:      0
  # warnings:    0

```

Notes:
- SSMIS specific: the SSMIS conversion uses the updated `compute_solar_angles`,
which is under review in [bufr-query PR
#96](NOAA-EMC/bufr-query#96)
- The update of ObsSpace.yaml from IODA is ongoing. We will add new
variables in the [ObsSpace.yaml](https://github.com/JCSDA-internal/ioda/blob/feature/ObsSpace_Validator/share/ioda/yaml/validation/ObsSpace.yaml)
in the IODA branch feature/ObsSpace_Validator

Specific to IASI and CrIS
Modifications made to the cloud-related variables `cloudHeight`,
`cloudCoverTotal`, and `fractionOfClearInFOV`:
- change group from `MetaData` to `ObsValue`
- change variable type from `int` to `float`
- scale the variables so they range between 0 and 1
- change `ObsValue/radiance` to `ObsValue/spectralRadiance` and the unit
accordingly

The above changes are made to follow the IODA convention.
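The int-to-float conversion and 0-to-1 scaling can be sketched as follows (a minimal illustration with made-up values, not the converter's actual code):

```python
import numpy as np

# BUFR-decoded cloud cover, integer percent (0-100)
cloud_cover_pct = np.array([0, 37, 100], dtype=np.int32)

# IODA convention: float type, scaled to the range 0-1
cloud_cover = cloud_cover_pct.astype(np.float32) / 100.0
```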



These changes do not alter the scientific results.

---------

Co-authored-by: Emily Liu <eliu@hercules-login-3.hpc.msstate.edu>
Co-authored-by: Emily Liu <eliu@hercules-login-4.hpc.msstate.edu>
### This PR updates the following:

**Python script for ASCAT:**
- Use add_dummy_variable and map_path
- Consolidate the addition of new variables in one function
(_add_wind_obs)
- Use a dictionary to add dummy variables

**YAML for ASCAT:**
- Make sure the data type of wind direction is float
- Update global attributes (to follow IODA convention)

**Python script for SSMIS:**
- Use a dictionary to add dummy variables

**Python script for adpupa_prepbufr**
- Fix coding norms

```
            dummy_mappings = [
               ('solarZenithAngle', 'latitude'),
               ('solarAzimuthAngle', 'latitude'),
               ('sensorZenithAngle', 'latitude'),
               ('sensorAzimuthAngle', 'latitude')
            ]
            for target_var, source_var in dummy_mappings:
                add_dummy_variable(container, target_var, category, source_var)

```


These changes do not change the results.

---------

Co-authored-by: Emily Liu <eliu@hercules-login-4.hpc.msstate.edu>
Co-authored-by: Emily Liu <eliu@hercules-login-3.hpc.msstate.edu>
### This PR modifies the satellite AMV converters to use the following:
- Use `compute_wind_components` from the `bufr.transforms` import
- Use `map_path` from the `bufr.obs_builder` import
- Use `add_dummy_variable` from `bufr.obs_builder` (see [bufr-query PR
#76](NOAA-EMC/bufr-query#76 (comment)))
- Add MetaData variables height and stationElevation to all satellite
wind types (abi, ahi, etc.), so this is implemented in
`bufr_satwnd_amv_obs_builder.py`. These two MetaData variables do not exist
in the satwnd BUFR, but they are needed for the satwind observation operator
in UFO.

In the future, we do not need to include these two MetaData (height and
station elevation) in the pre-processing. They can be created in UFO via
a variable assignment filter. These two variables are needed for all
satellite wind types, so they are added in the `SatWndAmvObsBuilder`
class.

### A few fixes
**1. Data Type** - The following variables need data type conversion
from integer to float to meet [IODA convention
requirement](https://docs.google.com/spreadsheets/d/e/2PACX-1vRvM7vLTsUZs-xJZHha1vZDQZyFSrybKfNvN9lIFJY6yB2MQDmBH_MofWha2R7uhPl6frKKbGDQmd0g/pubhtml)
- pressure
- sensorCentralFrequency
- qiWithoutForecast
- expectedError
- windDirection

**2. Unit** - The unit of `qiWithoutForecast` (quality information)
should be `percent`
(bufr_satwnd_amv_obs_builder.py, bufr_satwnd_amv_abi.yaml,
bufr_satwnd_amv_leogeo.yaml, bufr_satwnd_amv_seviri.yaml)

**3. Quality Information** - Incorrect quality information was passed due to
a memory (view) issue. The fix copies the sliced data to a new variable,
removes squeeze(), and does not cast gnap and qifn to int32 (the data
type will be determined by path later when encoding).

 In _get_quality_info_and_gen_app(self, findQi, gnap2D, pccf2D),
 change the following
         
```
        for i in range(gDim2):
            if np.unique(gnap2D[:, i].squeeze()) == findQi:
                if i <= qDim2:
                    self.log.info(f'GNAP/PCCF found for column {i}')
                    gnap = gnap2D[:, i].squeeze()
                    qifn = pccf2D[:, i].squeeze()
                else:
                    self.log.info(f'ERROR: GNAP column {i} outside of PCCF dimension {qDim2}')
        if (gnap is None) & (qifn is None):
            raise ValueError(f'GNAP == {findQi} NOT FOUND OR OUT OF PCCF DIMENSION-RANGE, WILL FAIL!')
        # If EE is needed, key search on np.unique(gnap2D[:,i].squeeze()) == 7 instead
        # NOTE: Make sure to return np.float32 or np.int32 types as appropriate!!!
        return gnap.astype(np.int32), qifn.astype(np.int32)
 ```
 to
 ```
        for i in range(gDim2):
            if np.all(np.unique(gnap2D[:, i]) == findQi):
                if i < qDim2:
                    self.log.info(f'GNAP/PCCF found for column {i}')
                    gnap = gnap2D[:, i].copy()
                    qifn = pccf2D[:, i].copy()
                else:
                    self.log.info(f'ERROR: GNAP column {i} outside of PCCF dimension {qDim2}')

        if (gnap is None) & (qifn is None):
            raise ValueError(f'GNAP == {findQi} NOT FOUND OR OUT OF PCCF DIMENSION-RANGE, WILL FAIL!')
        # If EE is needed, key search on np.unique(gnap2D[:,i].squeeze()) == 7 instead
        return gnap, qifn
 ```
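The underlying problem is NumPy's view semantics: a basic slice like `gnap2D[:, i]` shares memory with the parent array, so later changes to the parent silently change the "extracted" column; `.copy()` breaks that link. A small illustration:

```python
import numpy as np

a = np.zeros((3, 4))
view = a[:, 1]          # a view: shares memory with a
copy = a[:, 1].copy()   # an independent copy
a[:, 1] = 7             # the view sees this change; the copy does not
```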

**4. Query string issue in LEOGEO YAML** - the query string is wrong for `qualityInformationWithoutForecast`

Here is the bufr table for LEOGEO (NC005072)
```
| NC005072 | GEOSTWND "LGRSQ3"103 GCLONG GNAP "LGRSQ4"10 GCLONG GNAP |
| NC005072 | "LGRSQ4"10 GCLONG GNAP "LGRSQ4"10 |
```
LGRSQ4 replicates 10 times, and we want to grab the first replication.
The following query returns a 2D array [Location, 10], which is not what we want.
```
    # Quality Information Without Forecast
    qualityInformationWithoutForecast:
      query: '*/LGRSQ4[1]/PCCF'
```

2D array [Location, 10]
```
   qiWithoutForecast =
  76, 76, 0, 0, 0, 0, 0, 0, 0, 0,
  67, 67, 0, 0, 0, 0, 0, 0, 0, 0,
  63, 63, 0, 0, 0, 0, 0, 0, 0, 0,
  63, 63, 0, 0, 0, 0, 0, 0, 0, 0,
  86, 86, 0, 0, 0, 0, 0, 0, 0, 0,
  64, 64, 0, 0, 0, 0, 0, 0, 0, 0,
                 : 
                 :

```


We want to extract the first one [Location, 1].  So, the query string should be:
```
    # Quality Information Without Forecast
    qualityInformationWithoutForecast:
      query: '*/LGRSQ4[1]{1}/PCCF'
      type: float
```
1-D array [Location]
``` 
Without Forecast = 76, 67, 63, 63, 86, 64, 60, 74, 89, 71, 63, 70, 87,
88, 75, 66, 80, 83, 64, 71, 84, 88, 71, 82, 54, 69, 72, 86, 70, 66, 88,
66, 64, 66, 90, 81, 65, 84, 63, 59, 67, 63, 73, 70, 72, 69, 72, 93, 81,
86, 66, 54, 61, 63, 94, 64, 63, 56, 74, 66, 57, 75, 67, 69, 99, 98, 97,
100, 81, 89, 90, 58, 73, 97, 64, 94, 62, 66, 57, 70, 89, 85, 87, 83,
85, 78, 78, 78, 85, 74, 90, 74, 70, 79, 79, 57, 100, 63, 66, 78, 77,
83, 95, 82, 88, 66, 87, 64, 65, 74, 65, 59, 67, 86, 63, 63, 69, 65, 81,
86, 76, 64, 99, 81, 76, 56, 83, 88, 79, 62, 70, 83, 86, 64, 60, 74, 70,
87, 80, 74, 63, 95, 91, 96, 89, 61, 67, 60, 71, 77, 67, 69, 57, 74, 67,
                   :
                   :
```

Notes:
- Sensors that use quality information directly from the mapping YAML: abi, leogeo, seviri
- Sensors that need additional processing to get quality information: ahi, avhrr, viirs, modis

---------

Co-authored-by: Emily Liu <eliu@hercules-login-2.hpc.msstate.edu>
Co-authored-by: Emily Liu <eliu@hercules-login-3.hpc.msstate.edu>
Co-authored-by: Emily Liu <eliu@hercules-login-4.hpc.msstate.edu>
**Description:** 

This PR removes outdated ADPUPA files from the spoc/ and
spoc/dump/mapping/ directories as part of ongoing refactoring.

The following files have been removed:

- spoc/bufr_adpupa_prepbufr.py
- spoc/bufr_adpupa_prepbufr_mapping.yaml
- spoc/dump/mapping/bufr_adpupa_prepbufr.py
- spoc/dump/mapping/bufr_adpupa_prepbufr_mapping.yaml

A new implementation of ADPUPA prepBUFR handling is in progress, and a
separate pull request (#40) for those files is currently under review.

---------

Co-authored-by: Praveen.Kumar <Praveen.Kumar@noaa.gov>
**This PR adds the mapping file and Python script for the following
ozone data:**
- omi aura (total ozone)
- ompstc (total ozone)
- ompsnp (profile ozone)

**1.** BufrOzoneObsBuilder - base class for all ozone obs builders
 - BufrOzone**Omi**ObsBuilder
 - BufrOzone**Ompstc**ObsBuilder
 - BufrOzone**Ompsnp**ObsBuilder

**2.** IODA validator checked!

[ObsSpace.yaml](https://github.com/JCSDA-internal/ioda/blob/feature/ObsSpace_Validator/share/ioda/yaml/validation/ObsSpace.yaml)
(IODA feature/ObsSpace_Validator) is updated to include elements for
ozone-related variables

**3.** Coding Norms checked and passed!

**4.** Sphinx style comments added 
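The builder hierarchy in item 1 can be sketched as follows; the class names are from the PR, but the method and the `ozone_kind` attribute are illustrative assumptions, not the repo's actual API:

```python
class BufrOzoneObsBuilder:
    """Shared logic for all ozone obs builders (queries, common MetaData)."""
    ozone_kind = "total"

    def describe(self):
        return f"{type(self).__name__}: {self.ozone_kind} ozone"

class BufrOzoneOmiObsBuilder(BufrOzoneObsBuilder):
    pass  # OMI AURA, total ozone

class BufrOzoneOmpstcObsBuilder(BufrOzoneObsBuilder):
    pass  # OMPS, total column ozone

class BufrOzoneOmpsnpObsBuilder(BufrOzoneObsBuilder):
    ozone_kind = "profile"  # OMPS nadir profile ozone
```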

**Notes: Issue to be fixed in the future** (see [bufr-query Issue
#100](NOAA-EMC/bufr-query#100)):
the new dimension name is not replaced with the specified name "Vertice"
(it stays as the default name, dim_2)

```
netcdf gdas.t00z.ozone.ompsnp_n20.tm00 {
dimensions:
        Location = 32172 ;
        dim_2 = 2 ;
variables:
        int Location(Location) ;
                Location:_FillValue = 2147483647 ;
        int dim_2(dim_2) ;
                dim_2:_FillValue = 2147483647 ;
```
```
source: variables/dataProviderOrigin
longName: "Data Provider Origin"

- name: "MetaData/meterologicalFeature"
```
Contributor


I don't see this in ObsSpace.yaml. I think this needs to be added too

@PraveenKumar-NOAA
Contributor Author

@emilyhcliu I have created brightnessTemperature plots for the m9 and m10 channels and added to the following issue -- #51.

Base automatically changed from feature/obs_builder to develop September 18, 2025 21:29
HyundeokChoi-NOAA added a commit that referenced this pull request Apr 7, 2026
…m Praveen.Kumar (#101)

This PR has 2 updates.
1. Enable extraction of redistribution-restriction metadata by adding
restrictionFlag (RSRD) and restrictionExpiration (EXPRSRD) to the
BUFR-to-IODA conversion. This ensures downstream restriction-filter logic
has the required fields available in MetaData.
2. Add modified IODA converters originally from Praveen's PRs
(#45, #55, #61, #62, #69).
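The RSRD/EXPRSRD addition might look roughly like the following mapping-file fragment. This is a hedged sketch: the query strings and group placement are assumptions based on the mnemonics named above, not copied from the PR.

```
# hypothetical mapping-file fragment
variables:
  restrictionFlag:
    query: '*/RSRD'
  restrictionExpiration:
    query: '*/EXPRSRD'
```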

---------

Co-authored-by: hyundeok choi <hyundeok.choi@clogin09.cactus.wcoss2.ncep.noaa.gov>
Co-authored-by: hyundeok choi <hyundeok.choi@clogin04.cactus.wcoss2.ncep.noaa.gov>
Co-authored-by: Cory Martin <cory.r.martin@noaa.gov>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: hyundeok choi <hyundeok.choi@clogin02.cactus.wcoss2.ncep.noaa.gov>
Co-authored-by: hyundeok choi <hyundeok.choi@clogin06.cactus.wcoss2.ncep.noaa.gov>
Co-authored-by: hyundeok choi <hyundeok.choi@clogin08.cactus.wcoss2.ncep.noaa.gov>