GitHub workflows improvements #259

Open
2 of 5 tasks
sarahyurick opened this issue Sep 24, 2024 · 7 comments

@sarahyurick
Collaborator

sarahyurick commented Sep 24, 2024

There are a couple of GitHub Actions I want to add to NeMo Curator:

  • GPU CI (already have Add GPU CI/CD #253 open for this)
  • Auto label PRs with the gpuci label if the PR is (1) created by a user with write access and (2) modifying at least 1 Python file
  • Automate Changelog build
  • Automate PyPI releases
  • Display suggested style changes when the pre-commit CI fails (e.g., with Ruff)

After the first one is merged, I can open separate PRs to add the others.
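The two conditions for auto-labeling could be checked with something like the sketch below. It is only an illustration: the function name `should_add_gpuci_label` and the set of permission strings are assumptions, not part of any existing workflow in this repo, and the permission value would have to come from whatever the workflow queries about the PR author.

```python
from typing import Iterable

# Assumption: permission levels are plain strings such as "admin", "maintain",
# "write", "read", or "none". Anything in WRITE_LEVELS counts as write access.
WRITE_LEVELS = {"admin", "maintain", "write"}


def should_add_gpuci_label(author_permission: str, changed_files: Iterable[str]) -> bool:
    """Return True when (1) the author has write access and
    (2) at least one changed file is a Python file."""
    has_write_access = author_permission.lower() in WRITE_LEVELS
    touches_python = any(path.endswith(".py") for path in changed_files)
    return has_write_access and touches_python


# Hypothetical example inputs
print(should_add_gpuci_label("write", ["nemo_curator/utils.py", "README.md"]))
print(should_add_gpuci_label("read", ["nemo_curator/utils.py"]))
print(should_add_gpuci_label("admin", ["docs/index.rst"]))
```

Only the first call satisfies both conditions; the second fails the access check and the third touches no Python file.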

@sarahyurick sarahyurick added the enhancement New feature or request label Sep 24, 2024
@sarahyurick sarahyurick self-assigned this Sep 24, 2024
@sarahyurick
Collaborator Author

Note from Oliver about item 2 (auto-labeling PRs with the gpuci label):
"
This file didn’t work because using secrets.GITHUB_TOKEN is not permitted to trigger other workflows. GH disallows that to prevent spam. You would need to use your personal access token (PAT). However, since we’re in a public environment, you would need to limit that to a protected branch (via environments). It’s still doable (from branch main, react to all issue change events), but a bit more tricky. As long as the PAT is not exposed to all branches, this is fine.
"

@sarahyurick
Collaborator Author

sarahyurick commented Sep 24, 2024

Additional tasks outside of GitHub:

@sarahyurick sarahyurick removed the enhancement New feature or request label Sep 24, 2024
@sarahyurick
Collaborator Author

Suggestion from @praateekmahajan: parametrizing the GPU tests with commonly used clients.

  • get_client(cluster_type="gpu", protocol="ucx")
  • get_client(cluster_type="gpu", protocol="tcp")
  • Maybe even set_torch_to_use_rmm/enable_spilling (we shouldn't try all the permutations, just the ones where we expect different behavior)

Depending on how many GPUs the node has, we could even try a multi-GPU setup.
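One way to keep the test matrix small is to build the list of client configurations explicitly rather than taking a full product. The sketch below is a rough idea only: `client_configs` and `config_id` are hypothetical helpers, and treating `enable_spilling` as a keyword argument of `get_client` is an assumption.

```python
from typing import Dict, List, Optional


def client_configs(extra_options: Optional[List[Dict]] = None) -> List[Dict]:
    """Build keyword-argument dicts for the client under test.

    The two base configs come from the suggestion above. Extra options are
    layered on the first base config only, so the matrix stays small instead
    of covering every permutation.
    """
    base = [{"cluster_type": "gpu", "protocol": p} for p in ("ucx", "tcp")]
    extras = [dict(base[0], **opt) for opt in (extra_options or [])]
    return base + extras


def config_id(cfg: Dict) -> str:
    """Readable test id, e.g. 'cluster_type=gpu-protocol=tcp'."""
    return "-".join(f"{key}={value}" for key, value in sorted(cfg.items()))


# Assumption: enable_spilling is accepted as a keyword argument.
for cfg in client_configs([{"enable_spilling": True}]):
    print(config_id(cfg))
```

The resulting list could be passed to `pytest.mark.parametrize("client_kwargs", client_configs(...), ids=config_id)`, with a fixture calling `get_client(**client_kwargs)`.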

@sarahyurick
Collaborator Author

Regarding the gpuCI label:

  • Always have it run (without having to re-add the label) for Nvidians/RAPIDS members
  • For non-verified users we could use /okay to test comments
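The trigger policy above might reduce to a small decision function like the following. This is a sketch under assumptions: the name `gpu_ci_should_run` is hypothetical, and requiring the `/okay to test` comment to come from a trusted member is an assumed rule that the issue does not spell out.

```python
from typing import Iterable, Tuple

APPROVAL_COMMAND = "/okay to test"


def gpu_ci_should_run(author_is_trusted: bool,
                      comments: Iterable[Tuple[bool, str]]) -> bool:
    """Decide whether GPU CI runs for a PR.

    comments is an iterable of (commenter_is_trusted, body) pairs.
    Trusted authors always get a run; otherwise a trusted commenter
    must have posted the approval command (assumed rule).
    """
    if author_is_trusted:
        return True
    return any(
        trusted and body.strip().lower().startswith(APPROVAL_COMMAND)
        for trusted, body in comments
    )


# Hypothetical comment threads
print(gpu_ci_should_run(True, []))
print(gpu_ci_should_run(False, [(False, "/okay to test")]))
print(gpu_ci_should_run(False, [(True, "/okay to test")]))
```

In this sketch an untrusted user cannot approve their own PR by posting the command themselves.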

@sarahyurick
Collaborator Author

Another check: unused imports.

@sarahyurick
Collaborator Author

Another: #488.
