diff --git a/README.md b/README.md index 32466b8..f516f4e 100644 --- a/README.md +++ b/README.md @@ -9,14 +9,68 @@ Radiuss Shared CI allows project to share CI configuration for GitLab. ## Getting Started -This project is meant to be hosted in a Gitlab instance so that other projects -can import files from it to complete their CI configuration. +This project is meant to be hosted in a GitLab instance so that other projects +can use it to configure their CI pipelines. + +**As of v2025.12.0**, RADIUSS Shared CI is available as **GitLab CI Components** (requires GitLab 17.0+). This provides: + +- **Better versioning** through the GitLab CI/CD Catalog +- **Type-safe inputs** with validated parameters +- **Cleaner syntax** using `component:` instead of `include: project:` +- **No more copy-pasting** - all templates are reusable components User documentation is located here: [**RADIUSS Shared CI Docs**](https://radiuss-shared-ci.readthedocs.io/en/latest/). +### Quick Start with Components + +Add components to your `.gitlab-ci.yml`: + +```yaml +# You must define stages (components don't define them to allow customization) +stages: + - prerequisites + - build-and-test + - performance-measurements + +include: + # Base pipeline orchestration + - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/base-pipeline@v2025.12.0 + inputs: + github_project_name: "my-project" + github_project_org: "LLNL" + github_token: $GITHUB_TOKEN + + # Machine-specific pipeline (choose the machines you need) + - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/lassen-pipeline@v2025.12.0 + inputs: + job_cmd: "./scripts/build-and-test.sh" + job_alloc: "1 -W 30" + github_project_name: "my-project" + github_project_org: "LLNL" +``` + +See the [examples/](examples/) directory for complete configuration examples. 
+ +### Available Components + +- **base-pipeline** - Orchestration templates and machine availability checks +- **lassen-pipeline** - Lassen supercomputer (LSF scheduler) +- **dane-pipeline** - Dane supercomputer (SLURM scheduler) +- **matrix-pipeline** - Matrix supercomputer (SLURM scheduler) +- **corona-pipeline** - Corona supercomputer (flux scheduler) +- **tioga-pipeline** - Tioga supercomputer (flux scheduler) +- **tuolumne-pipeline** - Tuolumne supercomputer (flux scheduler) +- **performance-pipeline** - Performance measurement and GitHub reporting +- **utility-draft-pr-filter** - Skip CI on draft pull requests +- **utility-branch-skip** - Skip CI on non-PR branches + +### Legacy Include-Based Approach + +The traditional include-based approach (using `pipelines/*.yml` and `utilities/*.yml`) is still available for GitLab versions < 17.0 or for projects that haven't yet migrated. See the documentation for migration guidance. + ### Installing -This project requires no installation. +This project requires no installation. Components are consumed directly from the GitLab instance. ## Contributing diff --git a/catalog-info.yml b/catalog-info.yml new file mode 100644 index 0000000..ae8c4ea --- /dev/null +++ b/catalog-info.yml @@ -0,0 +1,47 @@ +--- +# GitLab CI/CD Catalog metadata for RADIUSS Shared CI +spec: + inputs: + # This catalog provides reusable components for RADIUSS projects + # See README.md for usage examples + components: + utility-draft-pr-filter: + description: > + Filters draft pull requests to skip CI execution until PR is marked ready. + Requires GitHub integration with GITHUB_TOKEN. + utility-branch-skip: + description: > + Skips CI execution on branches that are not associated with pull requests. + Requires GitHub integration with GITHUB_TOKEN. + base-pipeline: + description: > + Main pipeline setup component that orchestrates machine-specific child pipelines, + handles GitHub status reporting, and manages multi-machine CI execution. 
+ corona-pipeline: + description: > + Corona specific CI pipeline component that runs build and test jobs. + Uses flux scheduler with shared allocation optimization. + tioga-pipeline: + description: > + Tioga specific CI pipeline component that runs build and test jobs. + Uses flux scheduler with shared allocation optimization. + tuolumne-pipeline: + description: > + Tuolumne specific CI pipeline component that runs build and test jobs. + Uses flux scheduler with shared allocation optimization. + dane-pipeline: + description: > + Dane specific CI pipeline component that runs build and test jobs. + Uses SLURM scheduler with shared allocation optimization. + matrix-pipeline: + description: > + Matrix specific CI pipeline component that runs build and test jobs. + Uses SLURM scheduler with shared allocation optimization. + lassen-pipeline: + description: > + Lassen specific CI pipeline component that runs build and test jobs. + Uses LSF scheduler. + performance-pipeline: + description: > + Performance measurement pipeline for running and tracking performance tests + across LLNL supercomputers. diff --git a/customization/gitlab-ci.yml b/customization/gitlab-ci.yml index beac298..4b1f78b 100644 --- a/customization/gitlab-ci.yml +++ b/customization/gitlab-ci.yml @@ -37,12 +37,6 @@ variables: # Tells Gitlab to recursively update the submodules when cloning the project. GIT_SUBMODULE_STRATEGY: recursive -##### PROJECT VARIABLES -# We build the projects in the CI clone directory. -# Used in script/gitlab/build_and_test.sh script. -# TODO: add a clean-up mechanism. - BUILD_ROOT: ${CI_PROJECT_DIR} - ##### SHARED_CI CONFIGURATION # Required information about GitHub repository GITHUB_PROJECT_NAME: "..." 
diff --git a/docs/sphinx/user_guide/components_migration.rst b/docs/sphinx/user_guide/components_migration.rst new file mode 100644 index 0000000..1e64c3c --- /dev/null +++ b/docs/sphinx/user_guide/components_migration.rst @@ -0,0 +1,380 @@ +Migrating to GitLab CI Components +=================================== + +Starting with v2025.12.0, RADIUSS Shared CI provides GitLab CI Components as the +recommended way to consume shared CI configuration. This guide will help you migrate +from the legacy include-based approach to the new component-based architecture. + +Prerequisites +------------- + +* GitLab 17.0 or later +* Familiarity with your current CI setup +* Access to your project's ``.gitlab-ci.yml`` and ``.gitlab/`` directory + +Benefits of Components +---------------------- + +The component-based approach provides several advantages over the legacy include method: + +**Type-Safe Inputs** + Components validate input parameters, catching configuration errors early. + +**Better Versioning** + Components appear in the GitLab CI/CD Catalog with clear versioning. + +**Cleaner Syntax** + Use ``component:`` with named inputs instead of ``include: project:`` with file paths. + +**No Copy-Pasting** + Templates that previously required copying (like ``customization/gitlab-ci.yml``) + are now reusable components. + +**Improved Discoverability** + Browse available components in the GitLab CI/CD Catalog UI. + +Migration Steps +--------------- + +Step 1: Define Required Stages +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**IMPORTANT:** The base-pipeline component does NOT define stages to allow you to +add your own custom stages. You must define the required stages in your ``.gitlab-ci.yml``: + +.. 
code-block:: yaml + + # .gitlab-ci.yml + stages: + - prerequisites # Required for machine checks + - build-and-test # Required for build/test jobs + - performance-measurements # Required if using perf pipeline + # Add your custom stages here: + # - my-custom-stage + +Step 2: Update Main Pipeline File +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**Before (Legacy approach):** + +.. code-block:: yaml + + # .gitlab-ci.yml + include: + - project: 'lc-templates/id_tokens' + file: 'id_tokens.yml' + - project: 'radiuss/radiuss-shared-ci' + ref: 'v2025.07.0' + file: 'utilities/preliminary-ignore-draft-pr.yml' + - local: '.gitlab/subscribed-pipelines.yml' + + variables: + GITHUB_PROJECT_NAME: "my-project" + GITHUB_PROJECT_ORG: "LLNL" + JOB_CMD: + value: "./scripts/build-and-test.sh" + expand: false + +**After (Component-based):** + +.. code-block:: yaml + + # .gitlab-ci.yml + include: + - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/base-pipeline@v2025.12.0 + inputs: + github_project_name: "my-project" + github_project_org: "LLNL" + github_token: $GITHUB_TOKEN + + - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/utility-draft-pr-filter@v2025.12.0 + inputs: + github_token: $GITHUB_TOKEN + github_project_name: "my-project" + github_project_org: "LLNL" + + - local: '.gitlab/custom-variables.yml' + + variables: + JOB_CMD: + value: "./scripts/build-and-test.sh" + expand: false + +.. note:: + **File Split:** The legacy ``custom-jobs-and-variables.yml`` has been split into two files: + + * ``.gitlab/custom-variables.yml`` - Variables only (included in parent pipeline) + * ``.gitlab/custom-jobs.yml`` - Job templates only (included in child pipelines) + + This prevents duplication and makes it clear where each piece is used. + +Step 3: Update Machine Pipeline Triggers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**Before (in subscribed-pipelines.yml):** + +.. 
code-block:: yaml + + dane-build-and-test: + variables: + CI_MACHINE: "dane" + needs: [dane-up-check] + extends: [.build-and-test] + rules: + # Runs except if we explicitly deactivate dane by variable. + - if: '$ON_DANE == "OFF"' + when: never + - when: on_success + +**After (in .gitlab-ci.yml):** + +.. code-block:: yaml + + dane-build-and-test: + variables: + CI_MACHINE: "dane" + needs: [dane-up-check] + extends: [.build-and-test] + rules: + - if: '$ON_DANE == "OFF"' + when: never + - when: on_success + trigger: + include: + - local: '.gitlab/custom-jobs.yml' + - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/dane-pipeline@v2025.12.0 + inputs: + job_cmd: $JOB_CMD + shared_alloc: "--nodes=1 --exclusive --reservation=ci --time=30" + job_alloc: "--nodes=1 --reservation=ci" + github_project_name: $GITHUB_PROJECT_NAME + github_project_org: $GITHUB_PROJECT_ORG + - local: '.gitlab/jobs/dane.yml' + +Step 4: Update Custom Jobs (No Changes Required) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Your existing job definitions in ``.gitlab/jobs/.yml`` continue to work +without modification. They still extend the same templates: + +.. code-block:: yaml + + # .gitlab/jobs/lassen.yml (no changes needed) + gcc-build: + extends: .job_on_lassen + variables: + COMPILER: "gcc" + +Step 5: Split Custom Files +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If you were using the legacy ``custom-jobs-and-variables.yml`` it should be +split into two files, which are both optional: + +**Create** ``.gitlab/custom-jobs.yml`` (job templates only): + +.. code-block:: yaml + + # .gitlab/custom-jobs.yml + .custom_job: + before_script: + - echo "Setting up environment..." + + .custom_perf: + before_script: + - echo "Setting up performance environment..." + +**Create** ``.gitlab/custom-variables.yml`` (variables only): + +.. code-block:: yaml + + # .gitlab/custom-variables.yml + variables: + LASSEN_JOB_ALLOC: "1 -W 30" + DANE_SHARED_ALLOC: "-N 1 -p pdebug -t 30" + # ... etc + +.. 
note:: + These files have different purposes and are used in different parts of the + pipeline: + * ``custom-variables.yml`` is included in the parent pipeline to define + variables. This is simply a convenience to gather all allocation + information in one place. + * ``custom-jobs.yml`` is included in each child pipeline to populate the + customization templates if needed. Gathering templates in a file avoids + duplication across child pipelines, but the same result could be achieved + by defining them directly in each .gitlab/jobs/.yml file. + +Complete Example +---------------- + +See the ``examples/`` directory in the repository for complete migration examples: + +* ``examples/example-gitlab-ci.yml`` - Complete main CI file using components +* ``examples/example-custom-jobs.yml`` - Optional job templates (child pipelines) +* ``examples/example-jobs-lassen.yml`` - Machine-specific jobs + +Component Reference +------------------- + +Base Pipeline Component +^^^^^^^^^^^^^^^^^^^^^^^ + +**Component:** ``base-pipeline`` + +Provides orchestration templates for multi-machine pipelines. + +.. important:: + This component does NOT define stages. You must define the required stages + in your ``.gitlab-ci.yml`` to allow customization: + + .. 
code-block:: yaml + + stages: + - prerequisites + - build-and-test + - performance-measurements + # Add your custom stages here + +**Inputs:** + +* ``github_project_name`` (required) - GitHub project name +* ``github_project_org`` (required) - GitHub organization +* ``github_token`` (optional) - GitHub API token for status reporting + +**Exported Templates:** + +* ``.machine-check`` - Machine availability verification +* ``.build-and-test`` - Child pipeline trigger template +* ``.custom_job`` - Base template for job customization +* ``.custom_perf`` - Base template for performance jobs + +Machine Pipeline Components +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**Components:** +``lassen-pipeline``, ``dane-pipeline``, ``matrix-pipeline``, +``corona-pipeline``, ``tioga-pipeline``, ``tuolumne-pipeline`` + +Each provides machine-specific CI templates. + +**Common Inputs:** + +* ``job_cmd`` (required) - Build and test command +* ``github_project_name`` (required) - GitHub project name +* ``github_project_org`` (required) - GitHub organization +* ``llnl_service_user`` (optional) - LLNL service account + +**LSF-specific (Lassen):** + +* ``job_alloc`` - LSF allocation arguments (e.g., "1 -W 30") + +**SLURM/Flux-specific (Dane, Matrix, Corona, Tioga, Tuolumne):** + +* ``shared_alloc`` - Shared allocation args or "OFF" +* ``job_alloc`` - Per-job allocation args +* ``alloc_name`` - Name for shared allocation + +**Exported Templates:** + +* ``.custom_jobs`` - Customization template applied to every job +* ``.job_on_`` - Main job template for the machine +* ``.on_`` - Machine-specific rules +* ``._reproducer_init`` - Reproducer initialization +* ``._reproducer_vars`` - Reproducer variables (overridable) +* ``._reproducer_job`` - Reproducer job command + +Utility Components +^^^^^^^^^^^^^^^^^^ + +**utility-draft-pr-filter** + +Skips CI on draft pull requests. 
+ +**Inputs:** + +* ``github_token`` (required) +* ``github_project_name`` (required) +* ``github_project_org`` (required) +* ``always_run_pattern`` (optional) - Regex for branches that always run + +**utility-branch-skip** + +Skips CI on branches not associated with a PR. + +**Inputs:** Same as draft-pr-filter + +Performance Pipeline Component +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**Component:** ``performance-pipeline`` + +Provides templates for performance measurements across machines. + +**Inputs:** + +* ``job_cmd`` - Performance measurement command +* ``_perf_alloc`` (optional) - Per-job allocation args; specify one for each machine you want to use +* ``perf_processing_cmd`` - Results processing command +* ``perf_artifact_dir`` (optional) - Artifacts directory +* ``perf_results_file`` (optional) - Name of raw results file +* ``perf_processed_file`` (optional) - Name of processed results file +* ``github_token`` (optional) - For GitHub reporting +* ``github_project_name`` (optional) +* ``github_project_org`` (optional) + +**Exported Templates:** + +* ``.custom_perf`` - Customization template applied to every perf job +* ``.perf_on_`` - Performance job for each machine +* ``.results_processing`` - Processing job template +* ``.results_reporting`` - Reporting job template +* ``.convert_to_gh_benchmark`` - GitHub benchmark conversion +* ``.report_to_gh_benchmark`` - GitHub reporting + +Troubleshooting +--------------- + +Component Not Found +^^^^^^^^^^^^^^^^^^^ + +**Error:** ``Component 'radiuss/radiuss-shared-ci/lassen-pipeline' not found`` + +**Solution:** Ensure you're using GitLab 17.0+ and the components have been +published to the CI/CD Catalog on your GitLab instance. + +Input Validation Errors +^^^^^^^^^^^^^^^^^^^^^^^ + +**Error:** ``Required input 'github_project_name' not provided`` + +**Solution:** Check that all required inputs are specified in the ``inputs:`` section +of your component include. 
+ +Template Not Found in Child Pipeline +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**Error:** ``Job extends template that doesn't exist: .job_on_lassen`` + +**Solution:** Ensure the machine pipeline component is included in the child pipeline's +``trigger: include:`` section, not in the parent pipeline. + +Legacy Compatibility +-------------------- + +The legacy include-based approach (using ``pipelines/*.yml`` and ``utilities/*.yml``) +remains available and supported. Projects can continue using this approach if they: + +* Use a GitLab version < 17.0 +* Are not ready to migrate yet +* Require features not yet available in components + +Both approaches will only be maintained in parallel for this release. + +Getting Help +------------ + +* Check the ``examples/`` directory for working configurations +* Review component documentation in the GitLab CI/CD Catalog +* Consult the main documentation at https://radiuss-shared-ci.readthedocs.io +* Open an issue at https://github.com/LLNL/radiuss-shared-ci/issues diff --git a/docs/sphinx/user_guide/setup_ci.rst b/docs/sphinx/user_guide/setup_ci.rst index 0fe7ef5..cd26675 100644 --- a/docs/sphinx/user_guide/setup_ci.rst +++ b/docs/sphinx/user_guide/setup_ci.rst @@ -173,8 +173,6 @@ to your project. They are described in the following table: ``LLNL_SERVICE_USER`` Project specific Service User Account used in CI (optional but recommeded) ``CUSTOM_CI_BUILD_DIR`` If not using a service user, where to locate the CI working directories (prevent exceeding your disk quota) ``GIT_SUBMODULES_STRATEGY`` Controls strategy for the clone performed by GitLab. Consider ``recursive`` if you have submodules, otherwise comment it. - ``BUILD_ROOT`` Location (path) where the projects should be built. We provide a sensible default. 
- ``SHARED_CI_REF`` The reference (branch, tag) you would like to use in RADIUSS Shared CI repository ``GITHUB_PROJECT_NAME`` The Project name on GitHub, used to send status updates ``GITHUB_PROJECT_ORG`` The Project organization on GitHub, used to send status updates ``JOB_CMD`` The command that runs the build and test script. Lets you name and store that script however you like. @@ -236,7 +234,7 @@ details can be found in the file itself. Parameter Description ========================================== ========================================================================================================================== ``ALLOC_NAME`` Name of the shared allocation. Should be unique, our default should be fine. - ``_SHARED_ALLOC`` Parameters for the shared allocation. You may extend the resource and time. + ``_SHARED_ALLOC`` Optional: Parameters for the shared allocation. You may extend the resource and time. ``_JOB_ALLOC`` Parameters for the job allocation. You may extend the resource and time within the scope of the shared allocation. ``PROJECT__VARIANTS`` Global variants to be added to all the shared specs. ``PROJECT__DEPS`` Global dependencies to be added to all the shared specs. diff --git a/examples/example-custom-jobs.yml b/examples/example-custom-jobs.yml new file mode 100644 index 0000000..a1fadfe --- /dev/null +++ b/examples/example-custom-jobs.yml @@ -0,0 +1,55 @@ +############################################################################### +# Copyright (c) 2022-25, Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. +# +# SPDX-License-Identifier: (MIT) +############################################################################### + +# Example .gitlab/custom-jobs.yml +# +# This file defines JOB TEMPLATES ONLY. They are meant to override the default +# implementation from the child pipelines. 
+# This file should be included in CHILD pipelines (via trigger: include:) +# +# Place this file at .gitlab/custom-jobs.yml in your project. + +############################################################################### +# JOB CUSTOMIZATION TEMPLATES +############################################################################### + +# Custom job template - extend this to add project-specific setup +# Note: this is inspired by the .gitlab/custom-jobs.yml from Umpire. +.custom_job: + artifacts: + reports: + junit: junit.xml + name: "${CI_PROJECT_NAME}-${CI_MACHINE}-${CI_JOB_NAME}-${CI_PIPELINE_ID}" + paths: + - ./*.cmake + +# Custom performance job template +.custom_perf: + before_script: + - echo "Setting up performance test environment..." + # Add any performance-specific setup here + +############################################################################### +# REPRODUCER VARIABLES (Optional) +############################################################################### + +# Override reproducer variables for specific machines if needed +# This allows you to add project-specific environment variables +# that should appear in the reproducer output + +# Example from Umpire project: +.reproducer_vars: + script: + - | + echo -e " + # Required variables \n + export MODULE_LIST=\"${MODULE_LIST}\" \n + export SPEC=\"${SPEC//\"/\\\"}\" \n + # Allow to set job script for debugging (only this differs from CI) \n + export DEBUG_MODE=true \n + # Using the CI build cache is optional and requires a token. Set it like so: \n + # export REGISTRY_TOKEN=\"\" \n" diff --git a/examples/example-gitlab-ci.yml b/examples/example-gitlab-ci.yml new file mode 100644 index 0000000..c40d6b5 --- /dev/null +++ b/examples/example-gitlab-ci.yml @@ -0,0 +1,144 @@ +############################################################################### +# Copyright (c) 2022-25, Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. 
+# +# SPDX-License-Identifier: (MIT) +############################################################################### + +# Example .gitlab-ci.yml for consuming projects using GitLab CI Components +# +# This file demonstrates how to use the RADIUSS Shared CI components instead +# of the legacy include-based approach. +# +# MIGRATION: This replaces the customization/gitlab-ci.yml template + +############################################################################### +# STAGES +############################################################################### +# IMPORTANT: You must define stages yourself to allow customization. +# The following stages are REQUIRED by RADIUSS Shared CI components: + +stages: + - prerequisites # Required: machine availability checks + - build-and-test # Required: build and test jobs + - performance-measurements # Required: if using performance pipeline + # Add your custom stages here: + # - custom-stage-1 + # - custom-stage-2 + +############################################################################### +# VARIABLES +############################################################################### + +variables: + # Service user configuration + LLNL_SERVICE_USER: "" # Set to your service user if using one + + # GitHub integration (required for status reporting) + GITHUB_PROJECT_NAME: "my-project" + GITHUB_PROJECT_ORG: "LLNL" + # GITHUB_TOKEN should be set as a CI/CD variable in GitLab settings + + # Build configuration + GIT_SUBMODULE_STRATEGY: recursive + + # Job command - use non-expandable variables for flexibility + JOB_CMD: + value: "./scripts/build-and-test.sh" + expand: false + +############################################################################### +# INCLUDES +############################################################################### + +include: + # Base pipeline templates and utilities + - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/base-pipeline@v2025.12.0 + inputs: + github_project_name: 
$GITHUB_PROJECT_NAME + github_project_org: $GITHUB_PROJECT_ORG + github_token: $GITHUB_TOKEN + + # Optional: Draft PR filter + - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/utility-draft-pr-filter@v2025.12.0 + inputs: + github_token: $GITHUB_TOKEN + github_project_name: $GITHUB_PROJECT_NAME + github_project_org: $GITHUB_PROJECT_ORG + + # Local variables (used for component inputs and forwarded to child pipelines) + - local: '.gitlab/custom-variables.yml' + +############################################################################### +# MACHINE PIPELINES +############################################################################### + +# Dane (SLURM-based) +dane-up-check: + extends: [.dane, .machine-check] + +dane-build-and-test: + needs: [dane-up-check] + extends: [.dane, .build-and-test] + trigger: + include: + - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/dane-pipeline@v2025.12.0 + inputs: + job_cmd: $JOB_CMD + shared_alloc: "--nodes=1 --exclusive --reservation=ci --time=30" + job_alloc: "--nodes=1 --reservation=ci" + github_project_name: $GITHUB_PROJECT_NAME + github_project_org: $GITHUB_PROJECT_ORG + # OPTIONAL - local: '.gitlab/custom-jobs.yml' + - local: '.gitlab/jobs/dane.yml' + +# Corona (flux-based) +corona-up-check: + extends: [.corona, .machine-check] + +corona-build-and-test: + needs: [corona-up-check] + extends: [.corona, .build-and-test] + trigger: + include: + - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/corona-pipeline@v2025.12.0 + inputs: + job_cmd: $JOB_CMD + shared_alloc: "--nodes=1 --exclusive --time-limit=30m -o per-resource.count=2" + job_alloc: "--nodes=1 --begin-time=+5s" + github_project_name: $GITHUB_PROJECT_NAME + github_project_org: $GITHUB_PROJECT_ORG + # OPTIONAL - local: '.gitlab/custom-jobs.yml' + - local: '.gitlab/jobs/corona.yml' + +# Add similar blocks for: matrix, tioga, tuolumne +# (See full template for all machines) + 
+############################################################################### +# PERFORMANCE PIPELINE (Optional) +############################################################################### + +performance-measurements: + extends: [.performance-measurements] + rules: + # Add conditions for when to run performance tests + - if: '$CI_COMMIT_BRANCH == "main" || $CI_COMMIT_BRANCH == "develop"' + when: on_success + - when: manual + trigger: + include: + - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/performance-pipeline@v2025.12.0 + inputs: + job_cmd: "./scripts/run-benchmarks.sh" + dane_perf_alloc: "--nodes=1 --exclusive --reservation=ci --time=15" + matrix_perf_alloc: "--nodes=1 --exclusive --partition=pdebug --time=15" + corona_perf_alloc: "--nodes=1 --exclusive --time-limit=15m" + tioga_perf_alloc: "--nodes=1 --exclusive --time-limit=15m" + tuolumne_perf_alloc: "--nodes=1 --exclusive --time-limit=15m" + # No perf alloc information for lassen: the perf jobs will not run on lassen. + github_token: $GITHUB_TOKEN + github_project_name: $GITHUB_PROJECT_NAME + github_project_org: $GITHUB_PROJECT_ORG + # OPTIONAL - local: '.gitlab/custom-jobs.yml' + - local: '.gitlab/custom-jobs.yml' + - local: '.gitlab/jobs/performances.yml' diff --git a/examples/example-jobs-lassen.yml b/examples/example-jobs-lassen.yml new file mode 100644 index 0000000..209d311 --- /dev/null +++ b/examples/example-jobs-lassen.yml @@ -0,0 +1,49 @@ +############################################################################### +# Copyright (c) 2022-25, Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. +# +# SPDX-License-Identifier: (MIT) +############################################################################### + +# Example .gitlab/jobs/lassen.yml +# +# This file defines project-specific jobs for the Lassen machine. +# Place this file at .gitlab/jobs/lassen.yml in your project. 
+ +############################################################################### +# BUILD AND TEST JOBS +############################################################################### + +# Example job: GCC build +gcc-build: + extends: .job_on_lassen + variables: + COMPILER: "gcc" + COMPILER_VERSION: "8.3.1" + +# Example job: Clang build +clang-build: + extends: .job_on_lassen + variables: + COMPILER: "clang" + COMPILER_VERSION: "10.0.1" + +# Example job: XL build +xl-build: + extends: .job_on_lassen + variables: + COMPILER: "xl" + COMPILER_VERSION: "16.1.1" + # Mark as advanced job + # (only runs on main/develop unless ALL_TARGETS=ON) + ADVANCED_JOB: "ON" + +# Example job: CUDA build +cuda-build: + extends: .job_on_lassen + stage: jobs-stage-2 # Run in second stage + variables: + COMPILER: "gcc" + COMPILER_VERSION: "8.3.1" + ENABLE_CUDA: "ON" + CUDA_VERSION: "11.2.0" diff --git a/templates/base-pipeline/template.yml b/templates/base-pipeline/template.yml new file mode 100644 index 0000000..b3f2209 --- /dev/null +++ b/templates/base-pipeline/template.yml @@ -0,0 +1,204 @@ +--- +############################################################################## +# Copyright (c) 2022-25, Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. +# +# SPDX-License-Identifier: (MIT) +############################################################################## + +# GitLab CI Component: Base Pipeline +# +# This component provides templates and utilities for orchestrating +# multi-machine CI pipelines with child pipeline triggers, machine +# availability checks, and GitHub status reporting. 
+# +# Usage: +# include: +# - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/base-pipeline@v2025.12.0 +# inputs: +# github_project_name: "my-project" +# github_project_org: "LLNL" + +spec: + inputs: + github_project_name: + type: string + description: "GitHub project name for status reporting" + + github_project_org: + type: string + description: "GitHub organization name" + + github_token: + type: string + description: "GitHub personal access token" + default: "" + +--- +# Sets ID tokens for every job using `default:` +include: + - project: 'lc-templates/id_tokens' + file: 'id_tokens.yml' + +variables: + GITHUB_PROJECT_NAME: "$[[ inputs.github_project_name ]]" + GITHUB_PROJECT_ORG: "$[[ inputs.github_project_org ]]" + GITHUB_TOKEN: "$[[ inputs.github_token ]]" + +############################################################################## +# IMPORTANT: Stage Definition +# +# This component does NOT define stages to allow projects to customize their +# pipeline stages. Your .gitlab-ci.yml MUST include these required stages: +# +# stages: +# - prerequisites # Required for machine availability checks +# - build-and-test # Required for build-and-test jobs +# - performance-measurements # Required if using performance pipeline +# # Add your custom stages here as needed +# +# Jobs in this component reference these stage names. 
+############################################################################## + +############################################################################## +# TEMPLATES + +# Template to check if a machine is available +# Projects should extend this for each machine +.machine-check: + stage: prerequisites + tags: [shell, oslic] + variables: + GIT_STRATEGY: none + script: + - | + # Check if required variables are set for machine checking + if [ -z "$CI_MACHINE" ]; then + echo -e "\e[31mError: CI_MACHINE variable not set\e[0m" + exit 1 + fi + + # Check if lorenz status file exists + LORENZ_STATUS_FILE="/usr/global/tools/lorenz/data/loginnodeStatus" + if [ ! -f "$LORENZ_STATUS_FILE" ]; then + echo -e "\e[31mError: Lorenz status file not found: $LORENZ_STATUS_FILE\e[0m" + exit 1 + fi + + echo "Checking availability of machine: ${CI_MACHINE}" + + # Query machine status with error handling + NODES_UP=$(jq -r ".[\"${CI_MACHINE}\"].total_nodes_up // \"null\"" "$LORENZ_STATUS_FILE" 2>/dev/null) + JQ_EXIT_CODE=$? 
+ + if [ $JQ_EXIT_CODE -ne 0 ]; then + echo -e "\e[31mError: Failed to parse lorenz status file\e[0m" + echo "File content sample:" + head -n 5 "$LORENZ_STATUS_FILE" 2>/dev/null || echo "Cannot read file" + exit 1 + fi + + if [ "$NODES_UP" = "null" ]; then + echo -e "\e[31mError: Machine ${CI_MACHINE} not found in status file\e[0m" + echo "Available machines in status file:" + jq -r 'keys[]' "$LORENZ_STATUS_FILE" 2>/dev/null | head -10 || echo "Cannot list machines" + exit 1 + fi + + # Check if machine is available - report to GitHub + if [ "$NODES_UP" -eq 0 ]; then + echo -e "\e[31mMachine ${CI_MACHINE} is down (${NODES_UP} nodes up)\e[0m" + + # Report machine unavailability to GitHub + if [ -n "$[[ inputs.github_token ]]" ] && [ -n "$[[ inputs.github_project_org ]]" ] && [ -n "$[[ inputs.github_project_name ]]" ]; then + STATUS_RESPONSE=$(mktemp) + STATUS_HTTP_CODE=$(curl --retry 3 --retry-delay 5 --max-time 30 \ + --url "https://api.github.com/repos/$[[ inputs.github_project_org ]]/$[[ inputs.github_project_name ]]/statuses/${CI_COMMIT_SHA}" \ + --header 'Content-Type: application/json' \ + --header "authorization: Bearer $[[ inputs.github_token ]]" \ + --data "{ \"state\": \"failure\", \"target_url\": \"${CI_PIPELINE_URL}\", \"description\": \"GitLab: ${CI_MACHINE} down (0 nodes)\", \"context\": \"ci/gitlab/${CI_MACHINE}\" }" \ + --output "$STATUS_RESPONSE" \ + --write-out "%{http_code}" \ + --silent \ + --show-error) + + echo "GitHub Status API response code: $STATUS_HTTP_CODE" + + if [ "$STATUS_HTTP_CODE" -eq 201 ]; then + echo "Successfully reported machine down status to GitHub" + else + echo "Warning: Failed to report status to GitHub. 
HTTP status: $STATUS_HTTP_CODE" + echo "Response body:" + cat "$STATUS_RESPONSE" + fi + + rm -f "$STATUS_RESPONSE" + else + echo "Warning: GitHub reporting variables not set, skipping GitHub status update" + fi + + exit 1 + else + echo -e "\e[32m${CI_MACHINE} is available (${NODES_UP} nodes up)\e[0m" + fi + +# Template for triggering build-and-test child pipelines +# Projects should extend this for each machine +.build-and-test: + stage: build-and-test + trigger: + strategy: depend + forward: + pipeline_variables: true + +# Template for triggering perf measurements child pipelines +.performance-measurements: + stage: performance-measurements + trigger: + strategy: depend + forward: + pipeline_variables: true + +# Templates for each machine in top-level pipeline: +# This allows reducing duplication when defining the machine checks +# and sub-pipeline trigger jobs. +.dane: + variables: + CI_MACHINE: "dane" + rules: + # Runs except if we explicitly deactivate dane by variable. + - if: '$ON_DANE == "OFF"' + when: never + - when: on_success + +.matrix: + variables: + CI_MACHINE: "matrix" + rules: + - if: '$ON_MATRIX == "OFF"' + when: never + - when: on_success + +.corona: + variables: + CI_MACHINE: "corona" + rules: + - if: '$ON_CORONA == "OFF"' + when: never + - when: on_success + +.tioga: + variables: + CI_MACHINE: "tioga" + rules: + - if: '$ON_TIOGA == "OFF"' + when: never + - when: on_success + +.tuolumne: + variables: + CI_MACHINE: "tuolumne" + rules: + - if: '$ON_TUOLUMNE == "OFF"' + when: never + - when: on_success diff --git a/templates/corona-pipeline/template.yml b/templates/corona-pipeline/template.yml new file mode 100644 index 0000000..b24f04b --- /dev/null +++ b/templates/corona-pipeline/template.yml @@ -0,0 +1,179 @@ +--- +############################################################################## +# Copyright (c) 2022-25, Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. 
+# +# SPDX-License-Identifier: (MIT) +############################################################################## + +# GitLab CI Component: Corona Pipeline +# +# This component provides CI pipeline templates for running jobs on the +# Corona supercomputer using the flux scheduler with shared allocation support. +# +# Usage: +# include: +# - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/corona-pipeline@v2025.12.0 +# inputs: +# job_cmd: "./scripts/build-and-test.sh" +# shared_alloc: "-N 1 -t 30m" +# job_alloc: "-N 1 -t 10m" +# github_project_name: "my-project" +# github_project_org: "LLNL" + +spec: + inputs: + job_cmd: + type: string + description: "Command to execute for build and test" + + shared_alloc: + type: string + description: "flux alloc arguments for shared allocation (or OFF to disable)" + default: "" + + job_alloc: + type: string + description: "flux batch arguments for individual jobs" + + github_project_name: + type: string + description: "GitHub project name for status reporting" + + github_project_org: + type: string + description: "GitHub organization name" + + llnl_service_user: + type: string + description: "LLNL service user account for CI" + default: "" + + alloc_name: + type: string + description: "Name for shared resource allocation" + default: "ALLOC_${CI_PIPELINE_ID}" + +--- +# Sets ID tokens for every job using `default:` +include: + - project: 'lc-templates/id_tokens' + file: 'id_tokens.yml' + +variables: + JOB_CMD: "$[[ inputs.job_cmd | expand_vars ]]" + CORONA_SHARED_ALLOC: "$[[ inputs.shared_alloc ]]" + CORONA_JOB_ALLOC: "$[[ inputs.job_alloc ]]" + GITHUB_PROJECT_NAME: "$[[ inputs.github_project_name ]]" + GITHUB_PROJECT_ORG: "$[[ inputs.github_project_org ]]" + LLNL_SERVICE_USER: "$[[ inputs.llnl_service_user ]]" + ALLOC_NAME: "$[[ inputs.alloc_name ]]" + +stages: + - is-machine-up + - allocate-resources + - jobs-stage-1 + - jobs-stage-2 + - jobs-stage-3 + - release-resources + 
+############################################################################## +# UTILITIES + +.on_corona: + tags: + - shell + - corona + rules: + # Runs except if we explicitly deactivate corona by variable + - if: '$ON_CORONA == "OFF"' + when: never + # Advanced jobs can only run on main, develop or if ALL_TARGETS is ON + - if: '$ADVANCED_JOB == "ON" && $CI_COMMIT_BRANCH != "main" && $CI_COMMIT_BRANCH != "develop" && $ALL_TARGETS != "ON"' + when: never + # Do not allocate resource if the required variable is set to OFF + - if: '$CI_JOB_NAME =~ /resources/ && $CORONA_SHARED_ALLOC == "OFF"' + when: never + # We should always release resources allocated in the pipeline + - if: '$CI_JOB_NAME =~ /release_resources/' + when: always + # Default: run on success + - when: on_success + +# Custom job template - override this to create project-specific setup +.custom_job: + variables: + TEMPLATE_CANNOT_BE_EMPTY: "true" + +.corona_reproducer_init: + script: + - | + echo -e "\e[7;32m### CI job ${CI_JOB_ID} reproducer on corona (${SYS_TYPE}) \e[0m" + if [[ -n "${LLNL_SERVICE_USER}" ]]; then echo -e "xsu ${LLNL_SERVICE_USER}"; fi + echo -e " + working_dir=\"/usr/workspace/\${USER}/${GITHUB_PROJECT_NAME}/${CI_JOB_ID}-\$(date +%s)\" \n + mkdir -p \${working_dir} && cd \${working_dir} \n + git clone https://github.com/${GITHUB_PROJECT_ORG}/${GITHUB_PROJECT_NAME}.git --single-branch --depth=1 \n + cd ${GITHUB_PROJECT_NAME} \n + git fetch origin --depth=1 ${CI_COMMIT_SHA} \n + git checkout ${CI_COMMIT_SHA} \n + git submodule update --init --recursive \n" + +.corona_reproducer_vars: + script: + - | + echo -e "# Define project specific variables here if any." 
+ #echo -e "export =\"\${}\"" + +.corona_reproducer_job: + script: + - | + echo -e "flux watch \$(flux batch -o output.stdout.type=kvs ${CORONA_JOB_ALLOC} ${JOB_CMD} )" + echo -e "\e[7;32m### End of reproducer\e[0m" + +.corona_job_command: + script: + - ${PROXY} flux watch $( echo -e ${JOB_CMD} | xargs ${PROXY} flux batch -o output.stdout.type=kvs ${CORONA_JOB_ALLOC} ) + +.job_on_corona: + extends: [.custom_job, .on_corona] + stage: jobs-stage-1 + script: + # Allocation information + - | + ALLOC_ID=$(flux jobs --name="${ALLOC_NAME}" -n -o "{id}") + echo -e "[Information]: Shared allocation ID = ${ALLOC_ID}" + PROXY="$( [[ -n "${ALLOC_ID}" ]] && echo "flux proxy ${ALLOC_ID}" || echo "" )" + # Print a reproducer + - !reference [.corona_reproducer_init, script] + - !reference [.corona_reproducer_vars, script] + - !reference [.corona_reproducer_job, script] + # The actual launch command + - !reference [.corona_job_command, script] + +############################################################################## +# RESOURCE MANAGEMENT JOBS + +# In pre-build phase, allocate a node for builds +allocate_resources: + variables: + GIT_STRATEGY: none + extends: .on_corona + stage: allocate-resources + script: + - | + set -x + flux --parent alloc ${CORONA_SHARED_ALLOC} --job-name=${ALLOC_NAME} --bg + +# In post-build phase, deallocate resources +# This runs even on failure using the rule in .on_corona +release_resources: + variables: + GIT_STRATEGY: none + extends: .on_corona + stage: release-resources + script: + - | + set -x + export URI=$(flux jobs -o "{id} {name}" | grep ${ALLOC_NAME} | awk '{print $1}') + ([[ -n "${URI}" ]] && flux cancel ${URI} || exit 0) diff --git a/templates/dane-pipeline/template.yml b/templates/dane-pipeline/template.yml new file mode 100644 index 0000000..46abb28 --- /dev/null +++ b/templates/dane-pipeline/template.yml @@ -0,0 +1,174 @@ +--- +############################################################################## +# Copyright (c) 2022-25, 
Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. +# +# SPDX-License-Identifier: (MIT) +############################################################################## + +# GitLab CI Component: Dane Pipeline +# +# This component provides CI pipeline templates for running jobs on the +# Dane supercomputer using the SLURM scheduler with shared allocation support. +# +# Usage: +# include: +# - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/dane-pipeline@v2025.12.0 +# inputs: +# job_cmd: "./scripts/build-and-test.sh" +# shared_alloc: "-N 1 -p pdebug -t 30" +# job_alloc: "-n 1" +# github_project_name: "my-project" +# github_project_org: "LLNL" + +spec: + inputs: + job_cmd: + type: string + description: "Command to execute for build and test" + + shared_alloc: + type: string + description: "SLURM salloc arguments for shared allocation (or OFF to disable)" + default: "" + + job_alloc: + type: string + description: "SLURM srun arguments for individual jobs" + + github_project_name: + type: string + description: "GitHub project name for status reporting" + + github_project_org: + type: string + description: "GitHub organization name" + + llnl_service_user: + type: string + description: "LLNL service user account for CI" + default: "" + + alloc_name: + type: string + description: "Name for shared resource allocation" + default: "ALLOC_${CI_PIPELINE_ID}" + +--- +# Sets ID tokens for every job using `default:` +include: + - project: 'lc-templates/id_tokens' + file: 'id_tokens.yml' + +variables: + JOB_CMD: "$[[ inputs.job_cmd | expand_vars ]]" + DANE_SHARED_ALLOC: "$[[ inputs.shared_alloc ]]" + DANE_JOB_ALLOC: "$[[ inputs.job_alloc ]]" + GITHUB_PROJECT_NAME: "$[[ inputs.github_project_name ]]" + GITHUB_PROJECT_ORG: "$[[ inputs.github_project_org ]]" + LLNL_SERVICE_USER: "$[[ inputs.llnl_service_user ]]" + ALLOC_NAME: "$[[ inputs.alloc_name ]]" + +stages: + - is-machine-up + - allocate-resources + - 
jobs-stage-1 + - jobs-stage-2 + - jobs-stage-3 + - release-resources + +############################################################################## +# UTILITIES + +.on_dane: + tags: + - shell + - dane + rules: + # Runs except if we explicitly deactivate dane by variable + - if: '$ON_DANE == "OFF"' + when: never + # Advanced jobs can only run on main, develop or if ALL_TARGETS is ON + - if: '$ADVANCED_JOB == "ON" && $CI_COMMIT_BRANCH != "main" && $CI_COMMIT_BRANCH != "develop" && $ALL_TARGETS != "ON"' + when: never + # Do not allocate resource if the required variable is set to OFF + - if: '$CI_JOB_NAME =~ /resources/ && $DANE_SHARED_ALLOC == "OFF"' + when: never + # We should always release resources allocated in the pipeline + - if: '$CI_JOB_NAME =~ /release_resources/' + when: always + # Default: run on success + - when: on_success + +# Custom job template - override this to create project-specific setup +.custom_job: + variables: + TEMPLATE_CANNOT_BE_EMPTY: "true" + +.dane_reproducer_init: + script: + - | + echo -e "\e[7;32m### CI job ${CI_JOB_ID} reproducer on dane (${SYS_TYPE}) \e[0m" + if [[ -n "${LLNL_SERVICE_USER}" ]]; then echo -e "xsu ${LLNL_SERVICE_USER}"; fi + echo -e " + working_dir=\"/usr/workspace/\${USER}/${GITHUB_PROJECT_NAME}/${CI_JOB_ID}-\$(date +%s)\" \n + mkdir -p \${working_dir} && cd \${working_dir} \n + git clone https://github.com/${GITHUB_PROJECT_ORG}/${GITHUB_PROJECT_NAME}.git --single-branch --depth=1 \n + cd ${GITHUB_PROJECT_NAME} \n + git fetch origin --depth=1 ${CI_COMMIT_SHA} \n + git checkout ${CI_COMMIT_SHA} \n + git submodule update --init --recursive \n" + +.dane_reproducer_vars: + script: + - | + echo -e "# Define project specific variables here if any." 
+ #echo -e "export =\"\${}\"" + +.dane_reproducer_job: + script: + - | + echo -e "srun ${DANE_JOB_ALLOC} ${JOB_CMD}" + echo -e "\e[7;32m### End of reproducer\e[0m" + +.dane_job_command: + script: + - echo -e ${JOB_CMD} | xargs srun $( [[ -n "${JOBID}" ]] && echo "--jobid=${JOBID}" ) ${DANE_JOB_ALLOC} + +.job_on_dane: + extends: [.custom_job, .on_dane] + stage: jobs-stage-1 + script: + # Allocation information + - echo -e "### Allocation name is ${ALLOC_NAME}" + - export JOBID=$(squeue -h --name=${ALLOC_NAME} --format=%A) + - echo -e "### Job ID is ${JOBID}" + # Print a reproducer + - !reference [.dane_reproducer_init, script] + - !reference [.dane_reproducer_vars, script] + - !reference [.dane_reproducer_job, script] + # The actual launch command + - !reference [.dane_job_command, script] + +############################################################################## +# RESOURCE MANAGEMENT JOBS + +# In pre-build phase, allocate a node for builds +allocate_resources: + variables: + GIT_STRATEGY: none + extends: .on_dane + stage: allocate-resources + script: + - salloc ${DANE_SHARED_ALLOC} --no-shell --job-name=${ALLOC_NAME} + +# In post-build phase, deallocate resources +# This runs even on failure using the rule in .on_dane +release_resources: + variables: + GIT_STRATEGY: none + extends: .on_dane + stage: release-resources + script: + - export JOBID=$(squeue -h --name=${ALLOC_NAME} --format=%A) + - ([[ -n "${JOBID}" ]] && scancel ${JOBID} || exit 0) diff --git a/templates/lassen-pipeline/template.yml b/templates/lassen-pipeline/template.yml new file mode 100644 index 0000000..79e0695 --- /dev/null +++ b/templates/lassen-pipeline/template.yml @@ -0,0 +1,129 @@ +--- +############################################################################## +# Copyright (c) 2022-25, Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. 
+# +# SPDX-License-Identifier: (MIT) +############################################################################## + +# GitLab CI Component: Lassen Pipeline +# +# This component provides CI pipeline templates for running jobs on the +# Lassen supercomputer using the LSF scheduler. +# +# Usage: +# include: +# - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/lassen-pipeline@v2025.12.0 +# inputs: +# job_cmd: "./scripts/build-and-test.sh" +# job_alloc: "1 -W 30" +# github_project_name: "my-project" +# github_project_org: "LLNL" + +spec: + inputs: + job_cmd: + type: string + description: "Command to execute for build and test" + + job_alloc: + type: string + description: "LSF lalloc arguments for job resource allocation" + + github_project_name: + type: string + description: "GitHub project name for status reporting" + + github_project_org: + type: string + description: "GitHub organization name" + + llnl_service_user: + type: string + description: "LLNL service user account for CI" + default: "" + +--- +# Sets ID tokens for every job using `default:` +include: + - project: 'lc-templates/id_tokens' + file: 'id_tokens.yml' + +variables: + JOB_CMD: "$[[ inputs.job_cmd | expand_vars ]]" + LASSEN_JOB_ALLOC: "$[[ inputs.job_alloc ]]" + GITHUB_PROJECT_NAME: "$[[ inputs.github_project_name ]]" + GITHUB_PROJECT_ORG: "$[[ inputs.github_project_org ]]" + LLNL_SERVICE_USER: "$[[ inputs.llnl_service_user ]]" + +stages: + - is-machine-up + - jobs-stage-1 + - jobs-stage-2 + - jobs-stage-3 + +############################################################################## +# UTILITIES + +.on_lassen: + tags: + - shell + - lassen + rules: + # Runs except if we explicitly deactivate lassen by variable + - if: '$ON_LASSEN == "OFF"' + when: never + # Advanced jobs can only run on main, develop or if ALL_TARGETS is ON + - if: '$ADVANCED_JOB == "ON" && $CI_COMMIT_BRANCH != "main" && $CI_COMMIT_BRANCH != "develop" && $ALL_TARGETS != "ON"' + when: never + # Default: run on success + - 
when: on_success + +# Custom job template - override this to create project-specific setup +.custom_job: + variables: + TEMPLATE_CANNOT_BE_EMPTY: "true" + +.lassen_reproducer_init: + script: + - | + echo -e "\e[7;32m### CI job ${CI_JOB_ID} reproducer on lassen (${SYS_TYPE}) \e[0m" + if [[ -n "${LLNL_SERVICE_USER}" ]]; then echo -e "xsu ${LLNL_SERVICE_USER}"; fi + echo -e " + working_dir=\"/usr/workspace/\${USER}/${GITHUB_PROJECT_NAME}/${CI_JOB_ID}-\$(date +%s)\" \n + mkdir -p \${working_dir} && cd \${working_dir} \n + git clone https://github.com/${GITHUB_PROJECT_ORG}/${GITHUB_PROJECT_NAME}.git --single-branch --depth=1 \n + cd ${GITHUB_PROJECT_NAME} \n + git fetch origin --depth=1 ${CI_COMMIT_SHA} \n + git checkout ${CI_COMMIT_SHA} \n + git submodule update --init --recursive \n" + +# Projects can override this to define custom variables in the reproducer +.lassen_reproducer_vars: + script: + - | + echo -e "# Define project specific variables here if any." + #echo -e "export =\"\${}\"" + +.lassen_reproducer_job: + script: + - | + echo -e "lalloc ${LASSEN_JOB_ALLOC} ${JOB_CMD}" + echo -e "\e[7;32m### End of reproducer\e[0m" + +.lassen_job_command: + script: + - echo -e ${JOB_CMD} | xargs lalloc ${LASSEN_JOB_ALLOC} + +# Main job template for Lassen +# Projects extend this template to define their specific jobs +.job_on_lassen: + extends: [.custom_job, .on_lassen] + stage: jobs-stage-1 + script: + # Print a reproducer + - !reference [.lassen_reproducer_init, script] + - !reference [.lassen_reproducer_vars, script] + - !reference [.lassen_reproducer_job, script] + # The actual launch command + - !reference [.lassen_job_command, script] diff --git a/templates/matrix-pipeline/template.yml b/templates/matrix-pipeline/template.yml new file mode 100644 index 0000000..dc6df27 --- /dev/null +++ b/templates/matrix-pipeline/template.yml @@ -0,0 +1,174 @@ +--- +############################################################################## +# Copyright (c) 2022-25, Lawrence 
Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. +# +# SPDX-License-Identifier: (MIT) +############################################################################## + +# GitLab CI Component: Matrix Pipeline +# +# This component provides CI pipeline templates for running jobs on the +# Matrix supercomputer using the SLURM scheduler with shared allocation support. +# +# Usage: +# include: +# - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/matrix-pipeline@v2025.12.0 +# inputs: +# job_cmd: "./scripts/build-and-test.sh" +# shared_alloc: "-N 1 -p pdebug -t 30" +# job_alloc: "-n 1" +# github_project_name: "my-project" +# github_project_org: "LLNL" + +spec: + inputs: + job_cmd: + type: string + description: "Command to execute for build and test" + + shared_alloc: + type: string + description: "SLURM salloc arguments for shared allocation (or OFF to disable)" + default: "" + + job_alloc: + type: string + description: "SLURM srun arguments for individual jobs" + + github_project_name: + type: string + description: "GitHub project name for status reporting" + + github_project_org: + type: string + description: "GitHub organization name" + + llnl_service_user: + type: string + description: "LLNL service user account for CI" + default: "" + + alloc_name: + type: string + description: "Name for shared resource allocation" + default: "ALLOC_${CI_PIPELINE_ID}" + +--- +# Sets ID tokens for every job using `default:` +include: + - project: 'lc-templates/id_tokens' + file: 'id_tokens.yml' + +variables: + JOB_CMD: "$[[ inputs.job_cmd | expand_vars ]]" + MATRIX_SHARED_ALLOC: "$[[ inputs.shared_alloc ]]" + MATRIX_JOB_ALLOC: "$[[ inputs.job_alloc ]]" + GITHUB_PROJECT_NAME: "$[[ inputs.github_project_name ]]" + GITHUB_PROJECT_ORG: "$[[ inputs.github_project_org ]]" + LLNL_SERVICE_USER: "$[[ inputs.llnl_service_user ]]" + ALLOC_NAME: "$[[ inputs.alloc_name ]]" + +stages: + - is-machine-up + - allocate-resources + - 
jobs-stage-1 + - jobs-stage-2 + - jobs-stage-3 + - release-resources + +############################################################################## +# UTILITIES + +.on_matrix: + tags: + - shell + - matrix + rules: + # Runs except if we explicitly deactivate matrix by variable + - if: '$ON_MATRIX == "OFF"' + when: never + # Advanced jobs can only run on main, develop or if ALL_TARGETS is ON + - if: '$ADVANCED_JOB == "ON" && $CI_COMMIT_BRANCH != "main" && $CI_COMMIT_BRANCH != "develop" && $ALL_TARGETS != "ON"' + when: never + # Do not allocate resource if the required variable is set to OFF + - if: '$CI_JOB_NAME =~ /resources/ && $MATRIX_SHARED_ALLOC == "OFF"' + when: never + # We should always release resources allocated in the pipeline + - if: '$CI_JOB_NAME =~ /release_resources/' + when: always + # Default: run on success + - when: on_success + +# Custom job template - override this to create project-specific setup +.custom_job: + variables: + TEMPLATE_CANNOT_BE_EMPTY: "true" + +.matrix_reproducer_init: + script: + - | + echo -e "\e[7;32m### CI job ${CI_JOB_ID} reproducer on matrix (${SYS_TYPE}) \e[0m" + if [[ -n "${LLNL_SERVICE_USER}" ]]; then echo -e "xsu ${LLNL_SERVICE_USER}"; fi + echo -e " + working_dir=\"/usr/workspace/\${USER}/${GITHUB_PROJECT_NAME}/${CI_JOB_ID}-\$(date +%s)\" \n + mkdir -p \${working_dir} && cd \${working_dir} \n + git clone https://github.com/${GITHUB_PROJECT_ORG}/${GITHUB_PROJECT_NAME}.git --single-branch --depth=1 \n + cd ${GITHUB_PROJECT_NAME} \n + git fetch origin --depth=1 ${CI_COMMIT_SHA} \n + git checkout ${CI_COMMIT_SHA} \n + git submodule update --init --recursive \n" + +.matrix_reproducer_vars: + script: + - | + echo -e "# Define project specific variables here if any." 
+ #echo -e "export =\"\${}\"" + +.matrix_reproducer_job: + script: + - | + echo -e "srun ${MATRIX_JOB_ALLOC} ${JOB_CMD}" + echo -e "\e[7;32m### End of reproducer\e[0m" + +.matrix_job_command: + script: + - echo -e ${JOB_CMD} | xargs srun $( [[ -n "${JOBID}" ]] && echo "--jobid=${JOBID}" ) ${MATRIX_JOB_ALLOC} + +.job_on_matrix: + extends: [.custom_job, .on_matrix] + stage: jobs-stage-1 + script: + # Allocation information + - echo -e "### Allocation name is ${ALLOC_NAME}" + - export JOBID=$(squeue -h --name=${ALLOC_NAME} --format=%A) + - echo -e "### Job ID is ${JOBID}" + # Print a reproducer + - !reference [.matrix_reproducer_init, script] + - !reference [.matrix_reproducer_vars, script] + - !reference [.matrix_reproducer_job, script] + # The actual launch command + - !reference [.matrix_job_command, script] + +############################################################################## +# RESOURCE MANAGEMENT JOBS + +# In pre-build phase, allocate a node for builds +allocate_resources: + variables: + GIT_STRATEGY: none + extends: .on_matrix + stage: allocate-resources + script: + - salloc ${MATRIX_SHARED_ALLOC} --no-shell --job-name=${ALLOC_NAME} + +# In post-build phase, deallocate resources +# This runs even on failure using the rule in .on_matrix +release_resources: + variables: + GIT_STRATEGY: none + extends: .on_matrix + stage: release-resources + script: + - export JOBID=$(squeue -h --name=${ALLOC_NAME} --format=%A) + - ([[ -n "${JOBID}" ]] && scancel ${JOBID} || exit 0) diff --git a/templates/performance-pipeline/template.yml b/templates/performance-pipeline/template.yml new file mode 100644 index 0000000..86af64a --- /dev/null +++ b/templates/performance-pipeline/template.yml @@ -0,0 +1,338 @@ +--- +############################################################################## +# Copyright (c) 2025, Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. 
+# +# SPDX-License-Identifier: (MIT) +############################################################################## + +# GitLab CI Component: Performance Pipeline +# +# This component provides templates for running performance measurements +# across LLNL supercomputers, processing results, and reporting to GitHub. +# +# Usage: +# include: +# - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/performance-pipeline@v2025.12.0 +# inputs: +# job_cmd: "./scripts/run-benchmarks.sh" +# perf_processing_cmd: "./scripts/convert-to-gh-benchmark.py" +# _perf_alloc: + +spec: + inputs: + job_cmd: + type: string + description: "Command to execute for performance measurements" + + dane_perf_alloc: + type: string + description: "SLURM srun arguments for job allocation on Dane" + default: "" + + matrix_perf_alloc: + type: string + description: "SLURM srun arguments for job allocation on Matrix" + default: "" + + corona_perf_alloc: + type: string + description: "Flux alloc arguments for job allocation on Corona" + default: "" + + tioga_perf_alloc: + type: string + description: "Flux alloc arguments for job allocation on Tioga" + default: "" + + tuolumne_perf_alloc: + type: string + description: "Flux alloc arguments for job allocation on Tuolumne" + default: "" + + lassen_perf_alloc: + type: string + description: "Lalloc arguments for job allocation on Lassen" + default: "" + + perf_processing_cmd: + type: string + description: "Command to process performance results" + default: "" + + perf_artifact_dir: + type: string + description: "Directory for performance artifacts" + default: "performance-results" + + perf_results_file: + type: string + description: "Name of raw results file" + default: "benchmark_results.json" + + perf_processed_file: + type: string + description: "Name of processed results file" + default: "processed_results.json" + + github_token: + type: string + description: "GitHub token for reporting results" + default: "" + + github_project_name: + type: string + 
description: "GitHub project name" + default: "" + + github_project_org: + type: string + description: "GitHub organization name" + default: "" + +--- +# Sets ID tokens for every job using `default:` +include: + - project: 'lc-templates/id_tokens' + file: 'id_tokens.yml' + +variables: + PERF_ARTIFACT_DIR: "$[[ inputs.perf_artifact_dir ]]" + PERF_RESULTS_FILE: "$[[ inputs.perf_results_file ]]" + PERF_PROCESSED_FILE: "$[[ inputs.perf_processed_file ]]" + JOB_CMD: "$[[ inputs.job_cmd | expand_vars ]]" + PERF_PROCESSING_CMD: "$[[ inputs.perf_processing_cmd ]]" + GITHUB_TOKEN: "$[[ inputs.github_token ]]" + GITHUB_PROJECT_NAME: "$[[ inputs.github_project_name ]]" + GITHUB_PROJECT_ORG: "$[[ inputs.github_project_org ]]" + +stages: + - perf-runs + - perf-processing + - perf-reporting + +############################################################################## +# PERFORMANCE JOB TEMPLATES (machine-specific) + +.perf_job: + artifacts: + paths: + - $PERF_ARTIFACT_DIR/ + +# Custom job template specific to performance pipeline. +# Override this to create project-specific setup. 
+.custom_perf: + variables: + TEMPLATE_CANNOT_BE_EMPTY: "true" + +.on_dane: + tags: + - shell + - dane + rules: + - if: '$ON_DANE == "OFF"' + when: never + - if: '$[[ inputs.dane_perf_alloc ]] == ""' + when: never + - when: on_success + +.on_matrix: + tags: + - shell + - matrix + rules: + - if: '$ON_MATRIX == "OFF"' + when: never + - if: '$[[ inputs.matrix_perf_alloc ]] == ""' + when: never + - when: on_success + +.on_corona: + tags: + - shell + - corona + rules: + - if: '$ON_CORONA == "OFF"' + when: never + - if: '$[[ inputs.corona_perf_alloc ]] == ""' + when: never + - when: on_success + +.on_tioga: + tags: + - shell + - tioga + rules: + - if: '$ON_TIOGA == "OFF"' + when: never + - if: '$[[ inputs.tioga_perf_alloc ]] == ""' + when: never + - when: on_success + +.on_tuolumne: + tags: + - shell + - tuolumne + rules: + - if: '$ON_TUOLUMNE == "OFF"' + when: never + - if: '$[[ inputs.tuolumne_perf_alloc ]] == ""' + when: never + - when: on_success + +.on_lassen: + tags: + - shell + - lassen + rules: + - if: '$ON_LASSEN == "OFF"' + when: never + - if: '$[[ inputs.lassen_perf_alloc ]] == ""' + when: never + - when: on_success + +.perf_on_dane: + extends: [.perf_job, .on_dane, .custom_perf] + stage: perf-runs + script: + - export PERF_ARTIFACT_DIR=${PERF_ARTIFACT_DIR}/${CI_JOB_NAME} + - mkdir -p ${PERF_ARTIFACT_DIR} + - srun $[[ inputs.dane_perf_alloc ]] ${JOB_CMD} + +.perf_on_matrix: + extends: [.perf_job, .on_matrix, .custom_perf] + stage: perf-runs + script: + - export PERF_ARTIFACT_DIR=${PERF_ARTIFACT_DIR}/${CI_JOB_NAME} + - mkdir -p ${PERF_ARTIFACT_DIR} + - srun $[[ inputs.matrix_perf_alloc ]] ${JOB_CMD} + +.perf_on_corona: + extends: [.perf_job, .on_corona, .custom_perf] + stage: perf-runs + script: + - export PERF_ARTIFACT_DIR=${PERF_ARTIFACT_DIR}/${CI_JOB_NAME} + - mkdir -p ${PERF_ARTIFACT_DIR} + - flux run $[[ inputs.corona_perf_alloc ]] ${JOB_CMD} + +.perf_on_tioga: + extends: [.perf_job, .on_tioga, .custom_perf] + stage: perf-runs + script: + - export 
PERF_ARTIFACT_DIR=${PERF_ARTIFACT_DIR}/${CI_JOB_NAME} + - mkdir -p ${PERF_ARTIFACT_DIR} + - flux run $[[ inputs.tioga_perf_alloc ]] ${JOB_CMD} + +.perf_on_tuolumne: + extends: [.perf_job, .on_tuolumne, .custom_perf] + stage: perf-runs + script: + - export PERF_ARTIFACT_DIR=${PERF_ARTIFACT_DIR}/${CI_JOB_NAME} + - mkdir -p ${PERF_ARTIFACT_DIR} + - flux run $[[ inputs.tuolumne_perf_alloc ]] ${JOB_CMD} + +.perf_on_lassen: + extends: [.perf_job, .on_lassen, .custom_perf] + stage: perf-runs + script: + - export PERF_ARTIFACT_DIR=${PERF_ARTIFACT_DIR}/${CI_JOB_NAME} + - mkdir -p ${PERF_ARTIFACT_DIR} + - lalloc $[[ inputs.lassen_perf_alloc ]] ${JOB_CMD} + +############################################################################## +# PROCESSING AND REPORTING TEMPLATES + +# Minimal templates for performance processing and reporting jobs +.results_processing: + stage: perf-processing + tags: + - shell + - oslic + artifacts: + paths: + - $PERF_ARTIFACT_DIR/ + variables: + GIT_SUBMODULE_STRATEGY: none + +.results_reporting: + stage: perf-reporting + tags: + - shell + - oslic + variables: + GIT_STRATEGY: none + GIT_SUBMODULE_STRATEGY: none + +# Convert to GitHub Benchmark templates: +# Projects provide their own ${PERF_PROCESSING_CMD} script to convert +# the data to GitHub benchmark format. +.convert_to_gh_benchmark: + extends: .results_processing + script: + - | + for dir in ${PERF_ARTIFACT_DIR}/*/; do + if [ -n "${PERF_PROCESSING_CMD}" ]; then + if [ -d "$dir" ]; then + echo "Processing results in ${dir}" + cd ${dir} + ${PERF_PROCESSING_CMD} "${PERF_RESULTS_FILE}" "${PERF_PROCESSED_FILE}" + cd - + fi + else + echo "[Warning] Unless processing is not required, PERF_PROCESSING_CMD should be defined." 
+ fi + done + +.caliper_to_gh_benchmark: + extends: .convert_to_gh_benchmark + before_script: + - | + python3 -m venv caliper-env + source caliper-env/bin/activate + pip install caliper-reader + +# Report performance results to GitHub +.report_to_gh_benchmark: + extends: .results_reporting + script: + - | + for file in ${PERF_ARTIFACT_DIR}/*/${PERF_PROCESSED_FILE}; do + if [ -f "$file" ]; then + # Check if required variables are set + if [ -z "$GITHUB_TOKEN" ] || [ -z "$GITHUB_PROJECT_ORG" ] || [ -z "$GITHUB_PROJECT_NAME" ]; then + echo "Error: Required GitHub variables not set" + exit 1 + fi + + # Send job name as the benchmark name + BENCHMARK_NAME=$(basename "$(dirname "$file")") + echo "Sending benchmark results to GitHub for ${BENCHMARK_NAME}..." + BENCHMARK_DATA=$(base64 -w 0 "$file") + RESPONSE_BODY=$(mktemp) + HTTP_CODE=$(curl --retry 3 --retry-delay 5 --retry-connrefused --max-time 30 -X POST \ + --url "https://api.github.com/repos/${GITHUB_PROJECT_ORG}/${GITHUB_PROJECT_NAME}/actions/workflows/benchmark.yml/dispatches" \ + --header "Authorization: token $GITHUB_TOKEN" \ + --header "Accept: application/vnd.github.v3+json" \ + --header "Content-Type: application/json" \ + --data "{ \"ref\": \"${CI_COMMIT_REF_NAME}\", \"inputs\": { \"benchmark_name\": \"${BENCHMARK_NAME}\", \"benchmark_data\": \"${BENCHMARK_DATA}\" } }" \ + --output "$RESPONSE_BODY" \ + --write-out "%{http_code}" \ + --silent \ + --show-error + ) + + echo "GitHub API response code: $HTTP_CODE" + if [ "$HTTP_CODE" -eq 204 ]; then + echo "Successfully triggered GitHub workflow for ${BENCHMARK_NAME}" + else + echo "Failed to trigger GitHub workflow for ${BENCHMARK_NAME}. HTTP status: $HTTP_CODE" + echo "Response body:" + cat "$RESPONSE_BODY" + rm -f "$RESPONSE_BODY" + exit 1 + fi + rm -f "$RESPONSE_BODY" + else + echo "$file not found, skipping GitHub API integration." 
+ fi + done diff --git a/templates/tioga-pipeline/template.yml b/templates/tioga-pipeline/template.yml new file mode 100644 index 0000000..18b8d0b --- /dev/null +++ b/templates/tioga-pipeline/template.yml @@ -0,0 +1,179 @@ +--- +############################################################################## +# Copyright (c) 2022-25, Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. +# +# SPDX-License-Identifier: (MIT) +############################################################################## + +# GitLab CI Component: Tioga Pipeline +# +# This component provides CI pipeline templates for running jobs on the +# Tioga supercomputer using the flux scheduler with shared allocation support. +# +# Usage: +# include: +# - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/tioga-pipeline@v2025.12.0 +# inputs: +# job_cmd: "./scripts/build-and-test.sh" +# shared_alloc: "-N 1 -t 30m" +# job_alloc: "-N 1 -t 10m" +# github_project_name: "my-project" +# github_project_org: "LLNL" + +spec: + inputs: + job_cmd: + type: string + description: "Command to execute for build and test" + + shared_alloc: + type: string + description: "flux alloc arguments for shared allocation (or OFF to disable)" + default: "" + + job_alloc: + type: string + description: "flux batch arguments for individual jobs" + + github_project_name: + type: string + description: "GitHub project name for status reporting" + + github_project_org: + type: string + description: "GitHub organization name" + + llnl_service_user: + type: string + description: "LLNL service user account for CI" + default: "" + + alloc_name: + type: string + description: "Name for shared resource allocation" + default: "ALLOC_${CI_PIPELINE_ID}" + +--- +# Sets ID tokens for every job using `default:` +include: + - project: 'lc-templates/id_tokens' + file: 'id_tokens.yml' + +variables: + JOB_CMD: "$[[ inputs.job_cmd | expand_vars ]]" + TIOGA_SHARED_ALLOC: "$[[ 
inputs.shared_alloc ]]" + TIOGA_JOB_ALLOC: "$[[ inputs.job_alloc ]]" + GITHUB_PROJECT_NAME: "$[[ inputs.github_project_name ]]" + GITHUB_PROJECT_ORG: "$[[ inputs.github_project_org ]]" + LLNL_SERVICE_USER: "$[[ inputs.llnl_service_user ]]" + ALLOC_NAME: "$[[ inputs.alloc_name ]]" + +stages: + - is-machine-up + - allocate-resources + - jobs-stage-1 + - jobs-stage-2 + - jobs-stage-3 + - release-resources + +############################################################################## +# UTILITIES + +.on_tioga: + tags: + - shell + - tioga + rules: + # Runs except if we explicitly deactivate tioga by variable + - if: '$ON_TIOGA == "OFF"' + when: never + # Advanced jobs can only run on main, develop or if ALL_TARGETS is ON + - if: '$ADVANCED_JOB == "ON" && $CI_COMMIT_BRANCH != "main" && $CI_COMMIT_BRANCH != "develop" && $ALL_TARGETS != "ON"' + when: never + # Do not allocate resource if the required variable is set to OFF + - if: '$CI_JOB_NAME =~ /resources/ && $TIOGA_SHARED_ALLOC == "OFF"' + when: never + # We should always release resources allocated in the pipeline + - if: '$CI_JOB_NAME =~ /release_resources/' + when: always + # Default: run on success + - when: on_success + +# Custom job template - override this to create project-specific setup +.custom_job: + variables: + TEMPLATE_CANNOT_BE_EMPTY: "true" + +.tioga_reproducer_init: + script: + - | + echo -e "\e[7;32m### CI job ${CI_JOB_ID} reproducer on tioga (${SYS_TYPE}) \e[0m" + if [[ -n "${LLNL_SERVICE_USER}" ]]; then echo -e "xsu ${LLNL_SERVICE_USER}"; fi + echo -e " + working_dir=\"/usr/workspace/\${USER}/${GITHUB_PROJECT_NAME}/${CI_JOB_ID}-\$(date +%s)\" \n + mkdir -p \${working_dir} && cd \${working_dir} \n + git clone https://github.com/${GITHUB_PROJECT_ORG}/${GITHUB_PROJECT_NAME}.git --single-branch --depth=1 \n + cd ${GITHUB_PROJECT_NAME} \n + git fetch origin --depth=1 ${CI_COMMIT_SHA} \n + git checkout ${CI_COMMIT_SHA} \n + git submodule update --init --recursive \n" + +.tioga_reproducer_vars: + 
script: + - | + echo -e "# Define project specific variables here if any." + #echo -e "export =\"\${}\"" + +.tioga_reproducer_job: + script: + - | + echo -e "flux watch \$(flux batch -o output.stdout.type=kvs ${TIOGA_JOB_ALLOC} ${JOB_CMD})" + echo -e "\e[7;32m### End of reproducer\e[0m" + +.tioga_job_command: + script: + - ${PROXY} flux watch $( echo -e ${JOB_CMD} | xargs ${PROXY} flux batch -o output.stdout.type=kvs ${TIOGA_JOB_ALLOC} ) + +.job_on_tioga: + extends: [.custom_job, .on_tioga] + stage: jobs-stage-1 + script: + # Allocation information + - | + ALLOC_ID=$(flux jobs --name="${ALLOC_NAME}" -n -o "{id}") + echo -e "[Information]: Shared allocation ID = ${ALLOC_ID}" + PROXY="$( [[ -n "${ALLOC_ID}" ]] && echo "flux proxy ${ALLOC_ID}" || echo "" )" + # Print a reproducer + - !reference [.tioga_reproducer_init, script] + - !reference [.tioga_reproducer_vars, script] + - !reference [.tioga_reproducer_job, script] + # The actual launch command + - !reference [.tioga_job_command, script] + +############################################################################## +# RESOURCE MANAGEMENT JOBS + +# In pre-build phase, allocate a node for builds +allocate_resources: + variables: + GIT_STRATEGY: none + extends: .on_tioga + stage: allocate-resources + script: + - | + set -x + flux --parent alloc ${TIOGA_SHARED_ALLOC} --job-name="${ALLOC_NAME}" --bg + +# In post-build phase, deallocate resources +# This runs even on failure using the rule in .on_tioga +release_resources: + variables: + GIT_STRATEGY: none + extends: .on_tioga + stage: release-resources + script: + - | + set -x + export URI=$(flux jobs -o "{id} {name}" | grep "${ALLOC_NAME}" | awk '{print $1}') + ([[ -n "${URI}" ]] && flux cancel ${URI} || exit 0) diff --git a/templates/tuolumne-pipeline/template.yml b/templates/tuolumne-pipeline/template.yml new file mode 100644 index 0000000..c5625ce --- /dev/null +++ b/templates/tuolumne-pipeline/template.yml @@ -0,0 +1,179 @@ +--- 
+############################################################################## +# Copyright (c) 2022-25, Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. +# +# SPDX-License-Identifier: (MIT) +############################################################################## + +# GitLab CI Component: Tuolumne Pipeline +# +# This component provides CI pipeline templates for running jobs on the +# Tuolumne supercomputer using the flux scheduler with shared allocation support. +# +# Usage: +# include: +# - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/tuolumne-pipeline@v2025.12.0 +# inputs: +# job_cmd: "./scripts/build-and-test.sh" +# shared_alloc: "-N 1 -t 30m" +# job_alloc: "-N 1 -t 10m" +# github_project_name: "my-project" +# github_project_org: "LLNL" + +spec: + inputs: + job_cmd: + type: string + description: "Command to execute for build and test" + + shared_alloc: + type: string + description: "flux alloc arguments for shared allocation (or OFF to disable)" + default: "" + + job_alloc: + type: string + description: "flux batch arguments for individual jobs" + + github_project_name: + type: string + description: "GitHub project name for status reporting" + + github_project_org: + type: string + description: "GitHub organization name" + + llnl_service_user: + type: string + description: "LLNL service user account for CI" + default: "" + + alloc_name: + type: string + description: "Name for shared resource allocation" + default: "ALLOC_${CI_PIPELINE_ID}" + +--- +# Sets ID tokens for every job using `default:` +include: + - project: 'lc-templates/id_tokens' + file: 'id_tokens.yml' + +variables: + JOB_CMD: "$[[ inputs.job_cmd | expand_vars ]]" + TUOLUMNE_SHARED_ALLOC: "$[[ inputs.shared_alloc ]]" + TUOLUMNE_JOB_ALLOC: "$[[ inputs.job_alloc ]]" + GITHUB_PROJECT_NAME: "$[[ inputs.github_project_name ]]" + GITHUB_PROJECT_ORG: "$[[ inputs.github_project_org ]]" + LLNL_SERVICE_USER: "$[[ 
inputs.llnl_service_user ]]" + ALLOC_NAME: "$[[ inputs.alloc_name ]]" + +stages: + - is-machine-up + - allocate-resources + - jobs-stage-1 + - jobs-stage-2 + - jobs-stage-3 + - release-resources + +############################################################################## +# UTILITIES + +.on_tuolumne: + tags: + - shell + - tuolumne + rules: + # Runs except if we explicitly deactivate tuolumne by variable + - if: '$ON_TUOLUMNE == "OFF"' + when: never + # Advanced jobs can only run on main, develop or if ALL_TARGETS is ON + - if: '$ADVANCED_JOB == "ON" && $CI_COMMIT_BRANCH != "main" && $CI_COMMIT_BRANCH != "develop" && $ALL_TARGETS != "ON"' + when: never + # Do not allocate resource if the required variable is set to OFF + - if: '$CI_JOB_NAME =~ /resources/ && $TUOLUMNE_SHARED_ALLOC == "OFF"' + when: never + # We should always release resources allocated in the pipeline + - if: '$CI_JOB_NAME =~ /release_resources/' + when: always + # Default: run on success + - when: on_success + +# Custom job template - override this to create project-specific setup +.custom_job: + variables: + TEMPLATE_CANNOT_BE_EMPTY: "true" + +.tuolumne_reproducer_init: + script: + - | + echo -e "\e[7;32m### CI job ${CI_JOB_ID} reproducer on tuolumne (${SYS_TYPE}) \e[0m" + if [[ -n "${LLNL_SERVICE_USER}" ]]; then echo -e "xsu ${LLNL_SERVICE_USER}"; fi + echo -e " + working_dir=\"/usr/workspace/\${USER}/${GITHUB_PROJECT_NAME}/${CI_JOB_ID}-\$(date +%s)\" \n + mkdir -p \${working_dir} && cd \${working_dir} \n + git clone https://github.com/${GITHUB_PROJECT_ORG}/${GITHUB_PROJECT_NAME}.git --single-branch --depth=1 \n + cd ${GITHUB_PROJECT_NAME} \n + git fetch origin --depth=1 ${CI_COMMIT_SHA} \n + git checkout ${CI_COMMIT_SHA} \n + git submodule update --init --recursive \n" + +.tuolumne_reproducer_vars: + script: + - | + echo -e "# Define project specific variables here if any." 
+ #echo -e "export =\"\${}\"" + +.tuolumne_reproducer_job: + script: + - | + echo -e "flux watch \$(flux batch -o output.stdout.type=kvs ${TUOLUMNE_JOB_ALLOC} ${JOB_CMD})" + echo -e "\e[7;32m### End of reproducer\e[0m" + +.tuolumne_job_command: + script: + - ${PROXY} flux watch $( echo -e ${JOB_CMD} | xargs ${PROXY} flux batch -o output.stdout.type=kvs ${TUOLUMNE_JOB_ALLOC} ) + +.job_on_tuolumne: + extends: [.custom_job, .on_tuolumne] + stage: jobs-stage-1 + script: + # Allocation information + - | + ALLOC_ID=$(flux jobs --name="${ALLOC_NAME}" -n -o "{id}") + echo -e "[Information]: Shared allocation ID = ${ALLOC_ID}" + PROXY="$( [[ -n "${ALLOC_ID}" ]] && echo "flux proxy ${ALLOC_ID}" || echo "" )" + # Print a reproducer + - !reference [.tuolumne_reproducer_init, script] + - !reference [.tuolumne_reproducer_vars, script] + - !reference [.tuolumne_reproducer_job, script] + # The actual launch command + - !reference [.tuolumne_job_command, script] + +############################################################################## +# RESOURCE MANAGEMENT JOBS + +# In pre-build phase, allocate a node for builds +allocate_resources: + variables: + GIT_STRATEGY: none + extends: .on_tuolumne + stage: allocate-resources + script: + - | + set -x + flux --parent alloc ${TUOLUMNE_SHARED_ALLOC} --job-name="${ALLOC_NAME}" --bg + +# In post-build phase, deallocate resources +# This runs even on failure using the rule in .on_tuolumne +release_resources: + variables: + GIT_STRATEGY: none + extends: .on_tuolumne + stage: release-resources + script: + - | + set -x + export URI=$(flux jobs -o "{id} {name}" | grep "${ALLOC_NAME}" | awk '{print $1}') + ([[ -n "${URI}" ]] && flux cancel ${URI} || exit 0) diff --git a/templates/utility-branch-skip/template.yml b/templates/utility-branch-skip/template.yml new file mode 100644 index 0000000..deff94d --- /dev/null +++ b/templates/utility-branch-skip/template.yml @@ -0,0 +1,137 @@ +--- 
+############################################################################### +# Copyright (c) 2022-25, Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. +# +# SPDX-License-Identifier: (MIT) +############################################################################### + +# GitLab CI Component: Branch Skip (Not a PR) +# +# This component skips CI execution for branches that are not associated +# with a GitHub pull request, unless the branch matches the always-run pattern. +# +# Usage: +# include: +# - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/utility-branch-skip@v2025.12.0 +# inputs: +# github_token: $GITHUB_TOKEN +# github_project_name: "my-project" +# github_project_org: "LLNL" +# always_run_pattern: "^develop$|^main$|^master$|^v[0-9.]*$" + +spec: + inputs: + github_token: + type: string + description: "GitHub personal access token for API authentication" + + github_project_name: + type: string + description: "GitHub project/repository name" + + github_project_org: + type: string + description: "GitHub organization name" + + always_run_pattern: + type: string + description: "Regex pattern for branches that should always run CI" + default: "^develop$|^main$|^master$|^v[0-9.]*$" + +--- +# Branch PR check job +ignore-branches-not-a-pr: + stage: .pre + tags: [shell, oslic] + variables: + GIT_STRATEGY: none + script: + - | + # If the current branch is not in the always run pattern + echo "" + echo "### Branch PR filter parameters ###" + echo "# CI_COMMIT_BRANCH = ${CI_COMMIT_BRANCH}" + echo "# ALWAYS_RUN_PATTERN = $[[ inputs.always_run_pattern ]]" + echo "# CI_COMMIT_SHA = ${CI_COMMIT_SHA}" + echo "" + description="GitLab: Pull Request vetted for testing" + status="success" + return_code=0 + + # CI_COMMIT_BRANCH is only empty for tags, we always run CI on tags. + if [[ ! 
"${CI_COMMIT_BRANCH}" =~ $[[ inputs.always_run_pattern ]] && -n "${CI_COMMIT_BRANCH}" ]]; + then + # Check if required variables are set + if [ -z "$[[ inputs.github_token ]]" ] || [ -z "$[[ inputs.github_project_org ]]" ] || [ -z "$[[ inputs.github_project_name ]]" ]; then + echo "Error: Required GitHub variables not set" + exit 1 + fi + + # Query GitHub GraphQL API with error handling + GRAPHQL_RESPONSE=$(mktemp) + GRAPHQL_HTTP_CODE=$(curl --retry 3 --retry-delay 5 --max-time 30 \ + --header "authorization: Bearer $[[ inputs.github_token ]]" \ + --header "Content-Type: application/json" \ + -X POST \ + --data "{ \"query\": \"query { repository(name: \\\"$[[ inputs.github_project_name ]]\\\", owner: \\\"$[[ inputs.github_project_org ]]\\\") { pullRequests(last: 1, headRefName: \\\"${CI_COMMIT_BRANCH}\\\") { nodes { number } } } }\" }" \ + --output "$GRAPHQL_RESPONSE" \ + --write-out "%{http_code}" \ + --silent \ + --show-error \ + https://api.github.com/graphql) + + echo "GitHub GraphQL API response code: $GRAPHQL_HTTP_CODE" + + if [ "$GRAPHQL_HTTP_CODE" -eq 200 ]; then + echo "Successfully queried GitHub API" + # Check if the response contains valid data and if there's a PR + if jq -e '(.data.repository.pullRequests.nodes // []) | length == 0' "$GRAPHQL_RESPONSE" > /dev/null 2>&1; then + echo "Branch ${CI_COMMIT_BRANCH} is not associated with a Pull Request, skipping CI" + description="GitLab: skipped branch not a Pull Request" + status="failure" + return_code=1 + else + echo "Branch ${CI_COMMIT_BRANCH} is associated with a Pull Request, proceeding with CI" + fi + else + echo "Failed to query GitHub GraphQL API. 
HTTP status: $GRAPHQL_HTTP_CODE" + echo "Response body:" + cat "$GRAPHQL_RESPONSE" + rm -f "$GRAPHQL_RESPONSE" + exit 1 + fi + + rm -f "$GRAPHQL_RESPONSE" + fi + + # Report status to GitHub with error handling + STATUS_RESPONSE=$(mktemp) + STATUS_HTTP_CODE=$(curl --retry 3 --retry-delay 5 --max-time 30 \ + --url "https://api.github.com/repos/$[[ inputs.github_project_org ]]/$[[ inputs.github_project_name ]]/statuses/${CI_COMMIT_SHA}" \ + --header 'Content-Type: application/json' \ + --header "authorization: Bearer $[[ inputs.github_token ]]" \ + --data "{ \"state\": \"${status}\", \"target_url\": \"${CI_PIPELINE_URL}\", \"description\": \"${description}\", \"context\": \"ci/gitlab/skipped-is-not-a-pr\" }" \ + --output "$STATUS_RESPONSE" \ + --write-out "%{http_code}" \ + --silent \ + --show-error) + + echo "GitHub Status API response code: $STATUS_HTTP_CODE" + + if [ "$STATUS_HTTP_CODE" -eq 201 ]; then + echo "Successfully reported status to GitHub" + else + echo "Failed to report status to GitHub. HTTP status: $STATUS_HTTP_CODE" + echo "Response body:" + cat "$STATUS_RESPONSE" + rm -f "$STATUS_RESPONSE" + exit 1 + fi + + rm -f "$STATUS_RESPONSE" + exit ${return_code} + rules: + - if: $CI_PIPELINE_SOURCE == "web" + when: never + - when: on_success diff --git a/templates/utility-draft-pr-filter/template.yml b/templates/utility-draft-pr-filter/template.yml new file mode 100644 index 0000000..c8324c2 --- /dev/null +++ b/templates/utility-draft-pr-filter/template.yml @@ -0,0 +1,142 @@ +--- +############################################################################### +# Copyright (c) 2022-25, Lawrence Livermore National Security, LLC and RADIUSS +# project contributors. See the COPYRIGHT file for details. +# +# SPDX-License-Identifier: (MIT) +############################################################################### + +# GitLab CI Component: Draft PR Filter +# +# This component skips CI execution for draft pull requests on GitHub. 
+# It queries the GitHub GraphQL API to check if the PR is in draft state +# and fails the pipeline if it is, unless the branch matches the always-run pattern. +# +# Usage: +# include: +# - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/utility-draft-pr-filter@v2025.12.0 +# inputs: +# github_token: $GITHUB_TOKEN +# github_project_name: "my-project" +# github_project_org: "LLNL" +# always_run_pattern: "^develop$|^main$|^master$|^v[0-9.]*$" + +spec: + inputs: + github_token: + type: string + description: "GitHub personal access token for API authentication" + + github_project_name: + type: string + description: "GitHub project/repository name" + + github_project_org: + type: string + description: "GitHub organization name" + + always_run_pattern: + type: string + description: "Regex pattern for branches that should always run CI" + default: "^develop$|^main$|^master$|^v[0-9.]*$" + +--- +# Draft PR filter job +ignore-draft-pr: + stage: .pre + tags: [shell, oslic] + variables: + GIT_STRATEGY: none + script: + - | + # If the current branch is not in the always run pattern + echo "" + echo "### Draft filter parameters ###" + echo "# CI_COMMIT_BRANCH = ${CI_COMMIT_BRANCH}" + echo "# GITHUB_PROJECT_NAME = $[[ inputs.github_project_name ]]" + echo "# GITHUB_PROJECT_ORG = $[[ inputs.github_project_org ]]" + echo "# ALWAYS_RUN_PATTERN = $[[ inputs.always_run_pattern ]]" + echo "# CI_COMMIT_SHA = ${CI_COMMIT_SHA}" + echo "" + description="GitLab: Pull Request vetted for testing" + status="success" + return_code=0 + + # CI_COMMIT_BRANCH is only empty for tags, we always run CI on tags. + if [[ ! 
"${CI_COMMIT_BRANCH}" =~ $[[ inputs.always_run_pattern ]] && -n "${CI_COMMIT_BRANCH}" ]]; + then + # Check if required variables are set + if [ -z "$[[ inputs.github_token ]]" ] || [ -z "$[[ inputs.github_project_org ]]" ] || [ -z "$[[ inputs.github_project_name ]]" ]; then + echo "Error: Required GitHub variables not set" + exit 1 + fi + + # Query GitHub GraphQL API with error handling + GRAPHQL_RESPONSE=$(mktemp) + GRAPHQL_HTTP_CODE=$(curl --retry 3 --retry-delay 5 --max-time 30 \ + --header "authorization: Bearer $[[ inputs.github_token ]]" \ + --header "Content-Type: application/json" \ + -X POST \ + --data "{ \"query\": \"query { repository(name: \\\"$[[ inputs.github_project_name ]]\\\", owner: \\\"$[[ inputs.github_project_org ]]\\\") { pullRequests(last: 1, headRefName: \\\"${CI_COMMIT_BRANCH}\\\") { nodes { number, isDraft } } } }\" }" \ + --output "$GRAPHQL_RESPONSE" \ + --write-out "%{http_code}" \ + --silent \ + --show-error \ + https://api.github.com/graphql) + + echo "GitHub GraphQL API response code: $GRAPHQL_HTTP_CODE" + + if [ "$GRAPHQL_HTTP_CODE" -eq 200 ]; then + echo "Successfully queried GitHub API" + # Check if the response contains valid data + if ! jq -e '.data.repository.pullRequests.nodes[0]' "$GRAPHQL_RESPONSE" > /dev/null 2>&1; then + echo "Warning: No pull request found for branch ${CI_COMMIT_BRANCH}" + elif jq -e '.data.repository.pullRequests.nodes[0].isDraft == true' "$GRAPHQL_RESPONSE" > /dev/null; then + description="GitLab: skipped draft Pull Request" + status="failure" + return_code=1 + echo "Pull Request is in draft state, skipping CI" + else + echo "Pull Request is not a draft, proceeding with CI" + fi + else + echo "Failed to query GitHub GraphQL API. 
HTTP status: $GRAPHQL_HTTP_CODE" + echo "Response body:" + cat "$GRAPHQL_RESPONSE" + rm -f "$GRAPHQL_RESPONSE" + exit 1 + fi + + rm -f "$GRAPHQL_RESPONSE" + fi + + # Report status to GitHub with error handling + STATUS_RESPONSE=$(mktemp) + STATUS_HTTP_CODE=$(curl --retry 3 --retry-delay 5 --max-time 30 \ + --url "https://api.github.com/repos/$[[ inputs.github_project_org ]]/$[[ inputs.github_project_name ]]/statuses/${CI_COMMIT_SHA}" \ + --header 'Content-Type: application/json' \ + --header "authorization: Bearer $[[ inputs.github_token ]]" \ + --data "{ \"state\": \"${status}\", \"target_url\": \"${CI_PIPELINE_URL}\", \"description\": \"${description}\", \"context\": \"ci/gitlab/skipped-draft-pr\" }" \ + --output "$STATUS_RESPONSE" \ + --write-out "%{http_code}" \ + --silent \ + --show-error) + + echo "GitHub Status API response code: $STATUS_HTTP_CODE" + + if [ "$STATUS_HTTP_CODE" -eq 201 ]; then + echo "Successfully reported status to GitHub" + else + echo "Failed to report status to GitHub. HTTP status: $STATUS_HTTP_CODE" + echo "Response body:" + cat "$STATUS_RESPONSE" + rm -f "$STATUS_RESPONSE" + exit 1 + fi + + rm -f "$STATUS_RESPONSE" + exit ${return_code} + rules: + - if: $CI_PIPELINE_SOURCE == "web" + when: never + - when: on_success