[codex] Check UDF skill markdown license headers#15129

Draft
WilliamK112 wants to merge 1 commit into NVIDIA:main from WilliamK112:codex/check-skill-license-headers

Conversation

@WilliamK112


Fixes #15112.

Description

This adds skills/udf-*/*.md to the license-header workflow include patterns so changes to UDF skill markdown files are checked for a current-year NVIDIA copyright/license header. The pattern is scoped to UDF skill content and avoids broad *.md coverage, which would include repository docs that do not use skill front matter.

Validation:

  • git diff --check
  • Parsed .github/workflows/license-header-check.yml with Ruby YAML
  • Locally emulated the license-header action glob for all skills/udf-* markdown files
  • Confirmed skills/README.md and skills/docs/dev/VERSIONS.md are not matched by the new pattern

Checklists

Documentation

  • Updated for new or modified user-facing features or behaviors
  • No user-facing change

Testing

  • Added or modified tests to cover new code paths
  • Covered by existing tests
    (Please provide the names of the existing tests in the PR description.)
  • Not required

Performance

  • Tests ran and results are added in the PR description
  • Issue filed with a link in the PR description
  • Not required

Signed-off-by: WilliamK112 <164879897+WilliamK112@users.noreply.github.com>
@greptile-apps

greptile-apps Bot commented Jun 24, 2026

Contributor

Greptile Summary

This PR adds skills/udf-*/*.md to the included_file_patterns list in the license-header workflow so that UDF skill markdown files are checked for a current-year NVIDIA copyright header on every PR, and removes a trailing blank line at the end of the file.

  • The glob skills/udf-*/*.md matches exactly one directory level, leaving markdown files nested in subdirectories (e.g. skills/udf-convert-to-cuda/references/*.md) outside the check — those files currently have headers, but the gap exists for future additions.
  • The seven SKILL.md files encode copyright in YAML front matter (metadata.spdx-file-copyright-text) rather than in an HTML comment block; whether the NVIDIA/spark-rapids-common/license-header-check action recognises this format needs confirmation to avoid CI failures on future PRs that touch those files.

Confidence Score: 4/5

The workflow change is minimal and targeted; the main risk is that SKILL.md files use a non-comment copyright format that may not satisfy the action, which could silently skip enforcement or cause CI noise on future unrelated PRs.

The glob addition is correct for the top-level udf-* markdown files and the trailing-newline cleanup is harmless. The open question is whether the license-header action handles the YAML front-matter copyright format used by SKILL.md files — if it does not, every subsequent PR touching a SKILL.md will fail the check unexpectedly.

.github/workflows/license-header-check.yml — confirm the action accepts YAML front-matter copyright and consider whether the pattern should cover subdirectory markdown files.

Important Files Changed

Filename Overview
.github/workflows/license-header-check.yml Adds skills/udf-*/*.md to the license-header-check include list and removes a trailing blank line; pattern covers only immediate children of udf-* directories, leaving subdirectory markdown files (references/*.md) unmatched.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[PR opened / synchronized] --> B[license-header-check workflow]
    B --> C{File matches included_file_patterns?}
    C -->|*.yml, *.scala, *.py, etc.| D[Check comment-style header]
    C -->|skills/udf-*/*.md NEW| E{File type}
    C -->|No match| F[Skip]
    E -->|CUDF_MICROBENCHMARKS.md| G[HTML comment SPDX header]
    E -->|SKILL.md| H[YAML front matter metadata.spdx-file-copyright-text]
    G --> I[Action validates header]
    H --> J{Action supports YAML front matter?}
    J -->|Yes| I
    J -->|Unknown| K[Potential CI failure]
    I --> L{Header present and current year?}
    L -->|Yes| M[Pass]
    L -->|No| N[Fail]
    C -->|skills/udf-*/references/*.md| F

Reviews (1): Last reviewed commit: "Check UDF skill markdown license headers"

*.xml,
*.properties,
*.scala,
skills/udf-*/*.md,


P2 Subdirectory markdown files are not matched by this pattern

skills/udf-*/*.md only matches files exactly one level deep inside each udf-* directory — a bare * does not cross path separators. Four existing markdown files (skills/udf-convert-to-cuda/references/JNI_CUDA_GUIDE.md, NATIVE_BUILD_ENV.md, skills/udf-convert-to-cudf/references/RAPIDS_UDF.md, skills/udf-optimize-cudf/references/OPTIMIZATION_PATTERNS.md) all happen to have headers already, so there is no immediate CI failure, but any future markdown file added under a references/ or other nested directory would silently escape the check. Using skills/udf-*/**/*.md would cover the full tree if that is the intent.
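The difference between the two patterns can be sketched in a few lines of Python. This is only an illustration: it assumes the action's globbing keeps a bare * within one path segment and lets **/ span zero or more directories, as described above, and that has not been checked against the action's source. glob_to_regex and matches are hypothetical helpers written for this example, not part of the action.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob into a regex (assumed semantics, see lead-in).

    '*'   matches any run of characters except '/'
    '?'   matches one character except '/'
    '**/' matches zero or more whole directory segments
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:[^/]+/)*")
            i += 3
        elif pattern[i] == "*":
            parts.append(r"[^/]*")
            i += 1
        elif pattern[i] == "?":
            parts.append(r"[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def matches(path: str, pattern: str) -> bool:
    """Return True if the whole path matches the glob pattern."""
    return glob_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    paths = [
        "skills/udf-benchmark/SKILL.md",
        "skills/udf-convert-to-cuda/references/JNI_CUDA_GUIDE.md",
        "skills/README.md",
        "skills/docs/dev/VERSIONS.md",
    ]
    for pattern in ("skills/udf-*/*.md", "skills/udf-*/**/*.md"):
        for path in paths:
            print(pattern, path, matches(path, pattern))
```

Under these assumed semantics, the one-level pattern skips the references/ file while the ** form picks it up, and neither pattern touches skills/README.md or skills/docs/dev/VERSIONS.md.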

*.xml,
*.properties,
*.scala,
skills/udf-*/*.md,


P1 SKILL.md front-matter copyright may not satisfy the license-header action

The SKILL.md files (e.g. skills/udf-benchmark/SKILL.md) embed copyright only in YAML front matter under metadata.spdx-file-copyright-text, not in a comment block. Every non-SKILL markdown file under skills/udf-* (e.g. CUDF_MICROBENCHMARKS.md) uses an <!-- SPDX-FileCopyrightText: … --> HTML comment. If NVIDIA/spark-rapids-common/license-header-check looks for a comment-style header (as it does for .yml, .scala, .py, etc.), all seven SKILL.md files will fail the check the moment any of them is touched in a PR, potentially blocking unrelated work. Has the action been confirmed (locally or via its source) to accept the metadata.spdx-file-copyright-text YAML front-matter field in SKILL.md files as a valid copyright header, or does it only recognise comment-block formats?
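One way to see which files depend on the answer is to classify each markdown file by where its copyright text lives. The Python sketch below is a rough illustration, not the action's logic: classify_header is a hypothetical helper, the regexes are assumptions about what each format looks like, and the two sample documents are invented for the example rather than copied from the repository.

```python
import re

# YAML front matter: a block delimited by '---' lines at the very top of the file.
FRONT_MATTER = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)
# An indented 'spdx-file-copyright-text' key, i.e. nested under another key such as 'metadata'.
FRONT_MATTER_FIELD = re.compile(r"^[ \t]+spdx-file-copyright-text:", re.MULTILINE)
# An HTML comment that contains an SPDX-FileCopyrightText tag.
HTML_COMMENT_HEADER = re.compile(r"<!--.*?SPDX-FileCopyrightText:.*?-->", re.DOTALL)


def classify_header(text: str) -> str:
    """Return 'front-matter', 'html-comment', or 'none' for a markdown document."""
    front_matter = FRONT_MATTER.match(text)
    if front_matter and FRONT_MATTER_FIELD.search(front_matter.group(1)):
        return "front-matter"
    if HTML_COMMENT_HEADER.search(text):
        return "html-comment"
    return "none"


# Invented sample documents (hypothetical content, for illustration only).
SKILL_SAMPLE = (
    "---\n"
    "name: udf-example\n"
    "metadata:\n"
    "  spdx-file-copyright-text: Copyright (c) 2026, NVIDIA CORPORATION.\n"
    "---\n"
    "# Example skill\n"
)
REFERENCE_SAMPLE = (
    "<!--\n"
    "SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION.\n"
    "-->\n"
    "# Example guide\n"
)

if __name__ == "__main__":
    print(classify_header(SKILL_SAMPLE))
    print(classify_header(REFERENCE_SAMPLE))
    print(classify_header("# No header here\n"))
```

Files classified as front-matter are the ones at risk if the action only recognises comment-block headers. Running a helper like this over skills/udf-*/ before merging would list them without waiting for a CI failure.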



Development

Successfully merging this pull request may close these issues.

Add a license header check for skills

2 participants