Some of this is already covered by https://github.com/ROCm/TheRock/blob/main/build_tools/packaging/how_to_do_release.md. We could also make that page more discoverable.
Topics I would find useful:
- What to expect from nightly releases (which jobs run and trigger other jobs)
- Which code is used at a pinned version vs. unpinned
- What to do in the case of failures (what retries can help with and what they can't)
- Who to ask for help (and what is safe or risky to do without coordinating)
- How to promote a release (see https://github.com/ROCm/TheRock/blob/main/build_tools/packaging/how_to_do_release.md)
- How to run an out-of-band / partial release, such as building/testing/releasing a new PyTorch version for an already released ROCm version
- How to run a patch release
- How to yank a release (nightly or otherwise)
For some ideas, see the docs I wrote on this for another project: https://iree.dev/developers/general/release-management/