Thanks for your interest in contributing to Megatron-Bridge!
You can either follow the steps below to set up the environment from scratch, or use the NeMo Framework container, which provides a pre-built environment and makes these steps unnecessary.
Please see these instructions for installing cuDNN for your target platform. You can check if CUDA toolkit and cuDNN are installed with:
dpkg -l | grep 'cuda-toolkit'
dpkg -l | grep 'cudnn.*cuda'
Megatron-Bridge uses uv for package management.
You can configure uv with the following commands:
uv sync --only-group build # Installs build dependencies required by TransformerEngine
uv sync
For containerized development, use our Dockerfile to build your own container. There are three flavors: INFERENCE_FRAMEWORK=inframework, INFERENCE_FRAMEWORK=trtllm, and INFERENCE_FRAMEWORK=vllm:
docker build \
-f docker/Dockerfile.ci \
-t megatron-bridge \
.
Start your container:
docker run --rm -it -w /workdir -v $(pwd):/workdir \
--entrypoint bash \
--gpus all \
megatron-bridge
We use pytest for writing both unit and functional tests.
Unit tests aim to test functions in isolation. They generally do not depend on artifacts like Hugging Face checkpoints or larger datasets. The one exception is a small toy dataset consisting of tokenizers.
Unit tests are stored at tests/unit_tests. Please add your test to an existing folder, or create a new one if none matches.
Functional tests are integration tests that perform model training or operate on larger artifacts. We use pytest for writing these. In some cases, it may be desirable to run your test (or parts of it) in a subprocess to avoid process contamination. We use subprocess.run for this inside the pytest function. Please add your test to one of the predefined folders. If none of the folders fits semantically, please reach out to @nvidia-nemo/automation in your PR for consultation.
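As an illustration of the subprocess approach, here is a minimal sketch of a pytest function that runs its workload in a fresh Python interpreter. The helper name `run_isolated`, its timeout default, and the inline snippet are hypothetical; the snippet stands in for a real training entry point and is not part of Megatron-Bridge.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a fresh interpreter and capture its output.

    Hypothetical helper for illustration only.
    """
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # let the test inspect the return code instead of raising
    )


def test_workload_runs_in_subprocess():
    # Stand-in for a real training script; state created here cannot leak
    # into the pytest process because it lives in a separate interpreter.
    result = run_isolated("print('step 1 done')")
    assert result.returncode == 0, result.stderr
    assert "step 1 done" in result.stdout
```

Capturing stderr and including it in the assertion message makes failures inside the child process visible in the pytest report.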
We use uv for managing dependencies. For reproducible builds, our project tracks the generated uv.lock file in the repository.
On a weekly basis, the CI attempts an update of the lock file to test against upstream dependencies.
New required dependencies can be added by uv add $DEPENDENCY.
New optional dependencies can be added by uv add --optional --extra $EXTRA $DEPENDENCY.
EXTRA refers to the subgroup of extra-dependencies to which you're adding the new dependency.
Example: For adding a TRT-LLM specific dependency, run uv add --optional --extra trtllm $DEPENDENCY.
Alternatively, the pyproject.toml file can also be modified directly.
Adding a new dependency will update uv's lock file. Please check this into your branch:
git add uv.lock pyproject.toml
git commit -m "build: Adding dependencies"
git push
We use ruff for linting and formatting. CI does not auto-fix linting and formatting issues, but most issues can be fixed by running the following commands:
uv run ruff check --fix .
uv run ruff format .
Note: If ruff is missing, please follow the installation guide.
Important: All new key features (e.g., enabling a new inference-optimized library or a new deployment option) must include a documentation update (either a new doc or an update to an existing one). This documentation update should:
- Explain the motivation and purpose of the feature
- Outline the technical approach and architecture
- Provide clear usage examples and instructions for users
- Document internal implementation details where appropriate
This ensures that all significant changes are well-thought-out and properly documented for future reference. Comprehensive documentation serves two critical purposes:
- User Adoption: Helps users understand how to effectively use the library's features in their projects
- Developer Extensibility: Enables developers to understand the internal architecture and implementation details, making it easier to modify, extend, or adapt the code for their specific use cases
Quality documentation is essential for both the usability of Megatron-Bridge and its ability to be customized by the community.
- Follow the existing code style and conventions
- Write tests for new features
- Update documentation to reflect your changes
- Ensure all tests pass before submitting a PR
- Do not add arbitrary defaults for configs; be as explicit as possible.
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- To sign off on a commit you simply use the --signoff (or -s) option when committing your changes:
  git commit -s -m "Add cool feature."
  This will append the following to your commit message:
  Signed-off-by: Your Name <your.email@example.com>
- Full text of the DCO:
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.

Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or

(b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or

(c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.

(d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.
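To check locally that a commit message carries the sign-off trailer described above, something like the following sketch may help. The function `has_signoff` and the regular expression are assumptions made for illustration; this is not the check the project's CI runs, and it only tests for the presence of a well-formed trailer line.

```python
import re

# Matches a trailer line such as "Signed-off-by: Jane Doe <jane@example.com>".
# Hypothetical pattern: it only requires a name and an address in angle brackets.
SIGNOFF_RE = re.compile(r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE)


def has_signoff(commit_message: str) -> bool:
    """Return True if the message contains at least one sign-off trailer."""
    return SIGNOFF_RE.search(commit_message) is not None
```

The message of the latest commit can be obtained with `git log -1 --format=%B` and passed to `has_signoff` as a string.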
There are two ways to trigger CI tests on your pull request:
If your GitHub user is configured to use signed commits, CI tests will run automatically when you push commits to your pull request.
Note: Signed commits are different from signing off on commits (which uses the -s flag mentioned in the Signing Your Work section).
If you don't have signed commits set up, you can still trigger CI tests manually by commenting on your pull request:
/ok to test <commit-SHA>
For example:
/ok to test a1b2c3d4e5f6
Important: You'll need to add this comment for each new commit you push to ensure CI tests run on the latest changes.
You can find the commit SHA in several ways:
- View your pull request's commit history on GitHub
- Run git log --oneline -1 in your local repository
- Check the commit details in your Git client
Please see our documentation for a detailed guide on contributing new models.