34 changes: 34 additions & 0 deletions dev/ci/README-CI-Pipeline.md
@@ -2,6 +2,11 @@

This document describes the GitLab CI/CD pipeline architecture for the global-workflow project, including the different testing modalities, configuration methods, and operational details.

## Related Documentation

- **[Case Configuration Guide](README-case-configuration.md)** - How to add and manage CI test cases
- This document - Pipeline architecture and operational procedures

## Overview

The CI/CD system supports multiple testing approaches across different computing platforms, with flexible triggering mechanisms from both GitHub Actions and GitLab scheduled pipelines. The architecture is designed to be easily extensible to new computing hosts and testing scenarios.
@@ -179,6 +184,35 @@ PR_NUMBER=1234 # Set via GitHub trigger
- `RUN_ON_MACHINES` (machine selection)
- `CI_PIPELINE_SOURCE` (trigger vs schedule)

## Test Case Configuration

Test cases are configured using a **single source of truth** approach:

1. **Case YAML files** (`dev/ci/cases/pr/*.yaml`) define which hosts should skip each test
2. **Host matrices** in `gitlab-ci-hosts.yml` are automatically generated from case configurations
3. **Validation tests** ensure matrices stay synchronized with case files

### Adding or Modifying Test Cases

See the **[Case Configuration Guide](README-case-configuration.md)** for detailed instructions on:
- Adding new test cases
- Modifying host support for existing cases
- Using the matrix generation script
- Troubleshooting configuration issues

### Quick Reference

```bash
# Generate host matrices from case configurations
python dev/ci/scripts/utils/generate_host_case_matrix.py --update

# Validate matrices are in sync
pytest dev/ci/scripts/unittests/test_ci_matrix_validation.py -v

# Test locally (respects skip_ci_on_hosts automatically)
./dev/workflow/generate_workflows.sh -G /path/to/RUNTESTS
```

## Security and Access Control

### GitHub Actions
211 changes: 211 additions & 0 deletions dev/ci/README-case-configuration.md
@@ -0,0 +1,211 @@
# CI Case Configuration Guide

## Overview

The CI test case configuration for the Global Workflow uses a **single source of truth** approach where individual case YAML files define which hosts should skip each test case. The host-specific test matrices in `gitlab-ci-hosts.yml` are automatically generated from these case configurations.

## Single Source of Truth

**Case YAML files** (`dev/ci/cases/pr/*.yaml`) are the authoritative source for test case configurations. Each case file can optionally include a `skip_ci_on_hosts` section listing which computing platforms should NOT run that test case.

### Example Case Configuration

```yaml
experiment:
  net: gfs
  mode: cycled
  pslot: {{ 'pslot' | getenv }}
  app: ATM
  # ... other experiment settings ...

skip_ci_on_hosts:
  - orion
  - hercules
  - awsepicglobalworkflow

workflow:
  engine: rocoto
  # ... workflow settings ...
```

In this example, the test case will run on all configured hosts EXCEPT `orion`, `hercules`, and `awsepicglobalworkflow`.
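
The skip rule can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: `runs_on_host` is a hypothetical helper, and the dictionary stands in for a case file that has already been parsed.

```python
def runs_on_host(case_config: dict, host: str) -> bool:
    """Return True unless `host` is listed under skip_ci_on_hosts."""
    # A missing or empty section means the case runs on every host.
    skipped = case_config.get("skip_ci_on_hosts") or []
    return host not in skipped


# Stand-in for the parsed example case above (assumed structure).
example_case = {
    "experiment": {"net": "gfs", "mode": "cycled", "app": "ATM"},
    "skip_ci_on_hosts": ["orion", "hercules", "awsepicglobalworkflow"],
    "workflow": {"engine": "rocoto"},
}

for host in ("hera", "orion", "hercules"):
    print(host, runs_on_host(example_case, host))
```

The comparison is an exact string match, so a host name that differs only in letter case would not be skipped.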

## Adding a New Test Case

To add a new CI test case:

1. **Create the case YAML file** in `dev/ci/cases/pr/`
   - Follow the naming convention: `<descriptive_name>.yaml`
   - Include all required experiment configuration

2. **Add skip_ci_on_hosts section** (if needed)
   - List any hosts that should NOT run this case
   - Omit this section if the case should run on all hosts

3. **Regenerate the GitLab host matrices**
   ```bash
   python dev/ci/scripts/utils/generate_host_case_matrix.py --update
   ```

4. **Verify the changes**
   ```bash
   git diff dev/ci/gitlab-ci-hosts.yml
   ```

5. **Commit both files**
   ```bash
   git add dev/ci/cases/pr/<your_case>.yaml dev/ci/gitlab-ci-hosts.yml
   git commit -m "Add new test case: <your_case>"
   ```

## Modifying Host Support for Existing Cases

To change which hosts run an existing test case:

1. **Edit the case YAML file** (`dev/ci/cases/pr/<case_name>.yaml`)
   - Add or remove hosts from the `skip_ci_on_hosts` list
   - Add the section if it doesn't exist
   - Remove the section if the case should run on all hosts

2. **Regenerate the GitLab host matrices**
   ```bash
   python dev/ci/scripts/utils/generate_host_case_matrix.py --update
   ```

3. **Verify and commit changes**
   ```bash
   git diff dev/ci/cases/pr/<case_name>.yaml dev/ci/gitlab-ci-hosts.yml
   git add dev/ci/cases/pr/<case_name>.yaml dev/ci/gitlab-ci-hosts.yml
   git commit -m "Update host support for <case_name>"
   ```

## generate_host_case_matrix.py Script

This utility script generates the host case matrices in `gitlab-ci-hosts.yml` from the individual case YAML files.

### Usage

```bash
# View generated matrices (stdout)
python dev/ci/scripts/utils/generate_host_case_matrix.py

# Save to a file
python dev/ci/scripts/utils/generate_host_case_matrix.py --output matrices.yml

# Update gitlab-ci-hosts.yml directly
python dev/ci/scripts/utils/generate_host_case_matrix.py --update

# Dry run to see what would change
python dev/ci/scripts/utils/generate_host_case_matrix.py --update --dry-run

# Generate for specific hosts only
python dev/ci/scripts/utils/generate_host_case_matrix.py --hosts hera orion
```

### How It Works

1. Discovers all test case YAML files in `dev/ci/cases/pr/`
2. Parses the `skip_ci_on_hosts` section from each case
3. For each configured host, builds a list of cases that don't skip that host
4. Generates YAML anchor definitions (`.hostname_cases_matrix: &hostname_cases`)
5. Updates the matrices section in `gitlab-ci-hosts.yml`
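
Steps 3 and 4 can be sketched as follows. This is an illustration under stated assumptions rather than the script's actual code: `build_host_matrices` and `render_matrix` are hypothetical names, the skip lists are made up for the example, and sorting the case names is an assumption about the output order.

```python
import json


def build_host_matrices(case_skips: dict, hosts: list) -> dict:
    """Map each host to the sorted list of cases that do not skip it."""
    return {
        host: sorted(name for name, skipped in case_skips.items()
                     if host not in skipped)
        for host in hosts
    }


def render_matrix(host: str, cases: list) -> str:
    """Render one anchor definition in the style used in gitlab-ci-hosts.yml."""
    # json.dumps produces the double-quoted strings used in the case list.
    case_list = ", ".join(json.dumps(case) for case in cases)
    return f".{host}_cases_matrix: &{host}_cases\n  - caseName: [{case_list}]"


# Illustrative skip lists (not taken from the repository's case files).
case_skips = {
    "C48_ATM": [],
    "C48mx500_3DVarAOWCDA": ["orion"],
    "C96C48_ufs_hybatmDA": ["gaeac6", "orion", "hercules"],
}

matrices = build_host_matrices(case_skips, ["hera", "orion"])
for host, cases in matrices.items():
    print(render_matrix(host, cases))
    print()
```

Keeping the per-host filtering separate from the rendering makes it straightforward to reuse the same filtered lists for validation.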

### Auto-Detection of Hosts

By default, the script auto-detects which hosts to generate matrices for by reading the existing host definitions in `gitlab-ci-hosts.yml`. This ensures consistency with the GitLab CI pipeline configuration.
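
One way such detection could work is to scan for the matrix anchor definitions. The sketch below assumes hosts are identified by the `.hostname_cases_matrix: &hostname_cases` lines; the actual script may read other host definitions, and `detect_hosts` is a hypothetical name.

```python
import re

# Matches lines such as ".hera_cases_matrix: &hera_cases" and captures "hera".
ANCHOR_RE = re.compile(
    r"^\.(\w+)_cases_matrix:[ \t]*&\w+_cases[ \t]*$", re.MULTILINE
)


def detect_hosts(hosts_yaml_text: str) -> list:
    """Return host names in the order their matrix anchors appear."""
    return ANCHOR_RE.findall(hosts_yaml_text)


sample = """\
# Template matrices for case lists
.hera_cases_matrix: &hera_cases
  - caseName: ["C48_ATM"]

.orion_cases_matrix: &orion_cases
  - caseName: ["C48_ATM"]
"""

print(detect_hosts(sample))
```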

## Local Testing with generate_workflows.sh

The `generate_workflows.sh` script automatically respects the `skip_ci_on_hosts` configuration when selecting test cases to run locally. No additional changes are needed: the script uses the `get_host_case_list.py` utility, which reads the `skip_ci_on_hosts` lists directly from the case YAML files.

### Example Local Usage

```bash
# Run all GFS cases supported on the current machine
./dev/workflow/generate_workflows.sh -G /path/to/RUNTESTS

# Run specific cases (will skip unsupported ones with a warning)
./dev/workflow/generate_workflows.sh -y "C48_ATM C96_atm3DVar" /path/to/RUNTESTS
```

## CI Validation

The `test_ci_matrix_validation.py` unit test ensures that the matrices in `gitlab-ci-hosts.yml` remain consistent with the `skip_ci_on_hosts` tags in case files. This test will fail if:

- A host's matrix includes a case that has that host in its skip list
- A host's matrix is missing a case that should run on that host

Run the validation test:
```bash
pytest dev/ci/scripts/unittests/test_ci_matrix_validation.py -v
```
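
The two failure conditions amount to a set comparison between a host's matrix and the cases that do not skip that host. The sketch below illustrates the idea only; `check_matrix` is a hypothetical helper, not a function from `test_ci_matrix_validation.py`, and the data is invented for the example.

```python
def check_matrix(host: str, matrix_cases: list, case_skips: dict) -> dict:
    """Compare one host's matrix against the skip lists of all cases."""
    expected = {name for name, skipped in case_skips.items()
                if host not in skipped}
    actual = set(matrix_cases)
    return {
        # In the matrix, but the case skips this host (or has no case file).
        "unexpected": sorted(actual - expected),
        # Should run on this host, but is absent from the matrix.
        "missing": sorted(expected - actual),
    }


# Illustrative data: the matrix is stale relative to the skip lists.
case_skips = {"C48_ATM": [], "C96_atm3DVar": ["orion"], "C48_S2SW": []}
report = check_matrix("orion", ["C48_ATM", "C96_atm3DVar"], case_skips)
print(report)
```

Both lists being empty corresponds to a matrix that is in sync with the case files.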

## Architecture Benefits

This single source of truth approach provides several advantages:

1. **Consistency**: Both local testing and GitLab CI use the same configuration
2. **Maintainability**: Only one location to update when adding or modifying cases
3. **Validation**: Automated tests ensure matrices stay in sync with case configurations
4. **Clarity**: Easy to see which hosts support each test case
5. **Automation**: A single command regenerates the matrices for all hosts

## Troubleshooting

### Matrix validation test failing

If `test_ci_matrix_validation.py` fails, the matrices in `gitlab-ci-hosts.yml` are out of sync with the case configurations. Regenerate the matrices and re-run the test:

```bash
# Regenerate matrices
python dev/ci/scripts/utils/generate_host_case_matrix.py --update

# Re-run validation
pytest dev/ci/scripts/unittests/test_ci_matrix_validation.py -v
```

### Case not running on expected host

1. Check the `skip_ci_on_hosts` section in the case YAML
2. Verify the host name matches exactly (case-sensitive)
3. Ensure matrices were regenerated after modifying the case
4. Check GitLab CI pipeline rules in `gitlab-ci-hosts.yml`
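
Because host names are matched exactly, a skip entry such as `Orion` would not apply to `orion`. A quick check for this kind of mismatch might look like the sketch below; `find_suspect_hosts` is a hypothetical helper and the host list is supplied by the caller.

```python
def find_suspect_hosts(skip_list: list, known_hosts: list) -> dict:
    """Map each unrecognized skip entry to a likely intended host, or None."""
    known = set(known_hosts)
    by_lower = {host.lower(): host for host in known_hosts}
    suspects = {}
    for entry in skip_list:
        if entry in known:
            continue  # exact match, nothing to report
        # Suggest a known host that differs only in letter case, if any.
        suspects[entry] = by_lower.get(entry.lower())
    return suspects


known_hosts = ["hera", "orion", "hercules"]
print(find_suspect_hosts(["Orion", "hercules", "notahost"], known_hosts))
```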

### Script reports "Could not find case matrix section"

The script expects a specific structure in `gitlab-ci-hosts.yml`. Ensure:
- The comment line `# Template matrices for case lists` exists
- Matrix definitions follow immediately after
- The `# Host: ` section markers are present
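
A quick way to confirm that both markers are present is sketched below. It only checks for the two comment lines named above and does not reproduce the script's own parsing; `check_hosts_file_structure` is a hypothetical name.

```python
def check_hosts_file_structure(text: str) -> list:
    """Return a list of problems found; an empty list means both markers exist."""
    lines = text.splitlines()
    problems = []
    if not any(line.strip() == "# Template matrices for case lists"
               for line in lines):
        problems.append("missing '# Template matrices for case lists' comment")
    if not any(line.startswith("# Host: ") for line in lines):
        problems.append("missing '# Host: ' section marker")
    return problems


sample = """\
# Template matrices for case lists
.hera_cases_matrix: &hera_cases
  - caseName: ["C48_ATM"]

# Host: Hera - Standard Cases
setup_experiments-hera:
  extends: .setup_experiment_template
"""

print(check_hosts_file_structure(sample))
```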

## Adding a New Host Platform

To add support for a new computing platform:

1. **Add the host to GitLab CI configuration** (`dev/ci/gitlab-ci-hosts.yml`)
   - Add build, setup, and run job definitions
   - Include the new host in the appropriate sections

2. **Add an empty matrix definition** for the new host:
   ```yaml
   .newhost_cases_matrix: &newhost_cases
     - caseName: []
   ```

3. **Generate matrices** to populate cases for the new host:
   ```bash
   python dev/ci/scripts/utils/generate_host_case_matrix.py --update
   ```

4. **Update case files** if any should skip the new host:
   - Edit relevant case YAMLs to add the new host to their skip lists
   - Regenerate matrices again

## Related Files

- `dev/ci/cases/pr/*.yaml` - Individual test case configurations (source of truth)
- `dev/ci/gitlab-ci-hosts.yml` - GitLab CI pipeline with generated host matrices
- `dev/ci/scripts/utils/generate_host_case_matrix.py` - Matrix generation script
- `dev/ci/scripts/utils/get_host_case_list.py` - Utility to get cases for a host
- `dev/ci/scripts/unittests/test_ci_matrix_validation.py` - Validation tests
- `dev/workflow/generate_workflows.sh` - Local experiment setup script
23 changes: 19 additions & 4 deletions dev/ci/gitlab-ci-hosts.yml
@@ -21,21 +21,36 @@
# =======================================

# Template matrices for case lists
.hera_cases_matrix: &hera_cases
- caseName: ["C48_ATM", "C48_S2SW", "C48_S2SWA_gefs", "C48mx500_3DVarAOWCDA", "C48mx500_hybAOWCDA", "C96C48_hybatmDA", "C96C48_hybatmsnowDA", "C96C48_hybatmsoilDA", "C96C48_ufs_hybatmDA", "C96C48mx500_S2SW_cyc_gfs", "C96_atm3DVar", "C96_gcafs_cycled", "C96_gcafs_cycled_noDA", "C96mx100_S2S"]
# ==========================================================================
# Host Case Matrices
# ==========================================================================
#
# THIS SECTION IS AUTO-GENERATED by generate_host_case_matrix.py
# DO NOT EDIT MANUALLY - Update skip_ci_on_hosts in case YAML files instead
#
# To regenerate this section:
# python dev/ci/scripts/utils/generate_host_case_matrix.py --update
#
# Each matrix defines which test cases run on each computing platform.
# Cases are included unless they have the host in their skip_ci_on_hosts list.
#

.gaeac6_cases_matrix: &gaeac6_cases
- caseName: ["C48_ATM", "C48_S2SW", "C48_S2SWA_gefs", "C48mx500_3DVarAOWCDA", "C48mx500_hybAOWCDA", "C96C48_hybatmDA", "C96C48_hybatmsnowDA", "C96C48_hybatmsoilDA", "C96C48mx500_S2SW_cyc_gfs", "C96_atm3DVar", "C96_gcafs_cycled", "C96_gcafs_cycled_noDA", "C96mx100_S2S"]

.orion_cases_matrix: &orion_cases
- caseName: ["C48_ATM", "C48_S2SW", "C48_S2SWA_gefs", "C96C48_hybatmDA", "C96C48mx500_S2SW_cyc_gfs", "C96_atm3DVar", "C96mx100_S2S"]
.hera_cases_matrix: &hera_cases
- caseName: ["C48_ATM", "C48_S2SW", "C48_S2SWA_gefs", "C48mx500_3DVarAOWCDA", "C48mx500_hybAOWCDA", "C96C48_hybatmDA", "C96C48_hybatmsnowDA", "C96C48_hybatmsoilDA", "C96C48_ufs_hybatmDA", "C96C48mx500_S2SW_cyc_gfs", "C96_atm3DVar", "C96_gcafs_cycled", "C96_gcafs_cycled_noDA", "C96mx100_S2S"]

.hercules_cases_matrix: &hercules_cases
- caseName: ["C48_ATM", "C48_S2SW", "C48_S2SWA_gefs", "C48mx500_3DVarAOWCDA", "C48mx500_hybAOWCDA", "C96C48_hybatmDA", "C96C48mx500_S2SW_cyc_gfs", "C96_atm3DVar", "C96mx100_S2S"]

.orion_cases_matrix: &orion_cases
- caseName: ["C48_ATM", "C48_S2SW", "C48_S2SWA_gefs", "C96C48_hybatmDA", "C96C48mx500_S2SW_cyc_gfs", "C96_atm3DVar", "C96mx100_S2S"]

.ursa_cases_matrix: &ursa_cases
- caseName: ["C48_ATM", "C48_S2SW", "C48_S2SWA_gefs", "C48mx500_3DVarAOWCDA", "C48mx500_hybAOWCDA", "C96C48_hybatmDA", "C96C48_hybatmsnowDA", "C96C48_hybatmsoilDA", "C96C48_ufs_hybatmDA", "C96C48mx500_S2SW_cyc_gfs", "C96_atm3DVar", "C96mx100_S2S"]


# Host: Hera - Standard Cases
setup_experiments-hera:
extends: .setup_experiment_template