Verify GH action tag/SHA combinations#356
Conversation
|
This is in a very early stage and meant to just gather feedback and opinions about the approach. |
|
@raboof do you think it's worth tackling this one? |
|
This looks helpful to me. Does it actually share code with |
It does use the load_actions and a data class. But nothing that presents moving the code to a separate .py file. |
af91fe6 to
185416b
Compare
|
Made some progress on this one. Moved the code to a separate source file and added it to the The four remaining ones are because the The warnings are:
|
6a82461 to
a1796d0
Compare
e8ef2e7 to
832e95b
Compare
|
I've rebased the branch. The changes to I've made the |
|
Thanks for the rebase and scoping this down, @snazy — the approach is solid and the test coverage on the happy paths is nice. I think the check is worth having. Two things I'd want sorted before approving, plus a few smaller items: 🔴 Blockers
🟡 Non-blocking, worth a look
Nothing here is structural — happy to approve once the two blockers are addressed. |
|
Thanks for the review! Pushed another commit to address the comments. |
potiuk
left a comment
There was a problem hiding this comment.
Approving — this closes a real gap: nothing previously checked that the SHA↔tag pairs in actions.yml are genuine, and a forged or typo'd pair would have gone unnoticed. The design fits the gateway conventions well (reuses ActionsYAML/update_actions/update_patterns, extends RefDetails), the permissions are correctly read-only + token-gated, and test coverage is solid. Verified locally that it merges cleanly against main (no conflicts — just needs a rebase to clear the mergeable: UNKNOWN).
A few non-blocking nits I'd like addressed before merge:
pip install ruyaml→uv(check_action_tags.yml): the rest of the repo standardizes onuv run/uvx(seepytest.yml,update.yml); barepipskips the# /// scriptdependency pinning.uvx --with ruyaml(asrun_action_tags.py's own docstring suggests) would match convention. There's also a trailing space on that line that pre-commit may flag.- Rate-limit / transient-error resilience: the check issues several sequential
api.github.comcalls per ref, and a single403/429currently hard-fails the whole run. WithGITHUB_TOKEN's 5000/hr it should fit, but a transient secondary-limit would red the check with no retry/backoff. Not blocking — just flagging thatignore_gh_api_errorsis the only mitigation today. - Minor: the live-API tests in
test_action_tags.pyassert on specific upstream SHAs (e.g.setup-uvv7.1.2), which are brittle if upstream re-tags — worth a comment noting they may need updating.
Thanks for tackling this — it's a genuinely useful guardrail.
e70a4b2 to
0a1365a
Compare
|
The remaining Zizmor CI failure is unrelated to this change, created #918 to address those separately. |
|
@snazy Could you rebase this onto latest |
This change introduces a new function `verify_actions` to validate the contents against GitHub.
TL;DR
The function verifies that the SHAs specified in `actions.yml` exist in the GH repo.
Also ensures that the SHA exists on the Git tag, if the `tag` attribute is specified.
The rest of the (currently spaghetti code) function is a lot of output and error(failure) and warning collection.
Although it issues quite a few GH API requests, the rate limiter should not kick in (with an authenticated GH token).
I opted to rely on the HTTP/1.1 `urllib.request` stuff, which has no connection-reuse. The alternative would have been to add a dependency.
The algorithm roughly works like this, for each action specified in `actions.yml`:
* Issue a warning and stop, if the name is like `OWNER/*` ("wildcard" repository).
Can't verify Git SHAs in this case.
* Issue a warning and stop, if the name is like `docker:*` (not implemented)
* Issue an error and stop, if the name doesn't start with an `OWNER/REPO` pattern.
* Each expired entry is just skipped
* If there is a wildcard reference and a SHA reference, issue an error.
Then, for each reference for an action:
* If no `tag` is specified, let GH resolve the commit SHA.
Emit a warning to add the value of the `tag` attribute, if the SHA can be resolved.
Otherwise, emit an error.
* If `tag` is specified:
* Add the SHA to the set of requested-shas-by-tag
* Call GH's "matching-refs" endpoint for the 'tag' value
* Emit en error, if the object type is not a tag or commit.
* Also resolve 'tag' object types to 'commit' object types.
* Add each returned SHA to the set of valid-shas-by-tag.
* For each "requested tag" verify that the sets of valid and requested shas intersect. If not, emit an error.
1. `matlab-actions/run-tests`: the tag `v3.1.0` has been moved on Apr 14, 2026 (initially added on Apr 12 via apache#695) 2. `dtolnay/rust-toolchain`: `stable` is a branch and cannot be validated as a tag
be89f09 to
9980d7d
Compare
|
@potiuk done |
|
Re-reviewed after the rebase — still LGTM, merging. A few non-blocking follow-up nits for whenever (none worth holding this up):
Thanks for pushing this through — the branch-vs-tag fallback is a nice touch. |
This change introduces a new function
verify_actionsto validate the contents against GitHub.TL;DR
The function verifies that the SHAs specified in
actions.ymlexist in the GH repo. Also ensures that the SHA exists on the Git tag, if thetagattribute is specified. The rest of the function is a lot of output and error(failure) and warning collection.Although it issues quite a few GH API requests, the rate limiter should not kick in (with an authenticated GH token, GH workflows have a limit of 15k requests). I opted to rely on the HTTP/1.1
urllib.requeststuff, which has no connection-reuse. The alternative would have been to add a dependency.The algorithm roughly works like this, for each action specified in
actions.yml:OWNER/*("wildcard" repository). Can't verify Git SHAs in this case.docker:*(not implemented)OWNER/REPOpattern.Then, for each reference for an action:
tagis specified, let GH resolve the commit SHA. Emit a warning to add the value of thetagattribute, if the SHA can be resolved. Otherwise, emit an error.tagis specified:Fixes #110