feat: context parallelism #85
Open
jaisw7 wants to merge 44 commits into master from feat/cp
d20c724 feat: context parallelism (jaisw7)
694b1a2 fix: CP integration test uses monkeypatch for env-independent default… (jaisw7)
f463b6f fix: add CP environment verification step in workflow (jaisw7)
2f925a4 fix: explicitly set DNET_CP_ENABLED=true in .env after make init (jaisw7)
cfd46e6 feat: implement actual gRPC ring communication, address Copilot revie… (jaisw7)
1f317f3 feat: replace mock ring communicator with real gRPC implementation (jaisw7)
b9152e3 fix: update test_cp_ring_comm.py to use real gRPC instead of deleted … (jaisw7)
26c0641 feat: add ContextParallelSettings to .env.example generation (jaisw7)
0e301b1 fix: use macOS sed syntax (sed -i '') in CI workflow (jaisw7)
a95ac19 feat: implement strategy selection based on DNET_CP_ENABLED config (jaisw7)
63ec20c feat: wire up CPAdapter in shard.py based on DNET_CP_ENABLED config (jaisw7)
c7e95b6 feat: complete implementation plan compliance (jaisw7)
99f7e78 fix: convert cross-proto imports to relative imports in pb2 files (jaisw7)
1dac040 fix: filter out manager nodes from shards passed to topology solver (jaisw7)
d01ac5d fix: robustly filter shards in CPTopologySolver (jaisw7)
013669a fix: implement CPAdapter execution loop to prevent deadlock (jaisw7)
7ccbba5 refactor: Dynamic CP rank assignment and validation (jaisw7)
0cf14a4 fix(cp): register CPRingService on existing gRPC server, add model co… (jaisw7)
1dff741 feat(cp): add CP launch script for multi-shard model loading (jaisw7)
fad866e fix(cp): use DnetDeviceProperties from dnet_p2p for typed access (jaisw7)
d3433e5 feat(api): add /v1/settings endpoint and typed CP utilities (jaisw7)
53eeeb3 fix(scripts): add sys.path hack to resolve local imports (jaisw7)
c33aa61 fix(api): remove incorrect validator for boolean logprobs field (jaisw7)
448a7b9 test(inference): remove obsolete logprobs validation test (jaisw7)
0d81794 fix(scripts): improve error reporting in stress test (jaisw7)
72458d4 feat(cp): plumbing for max_position_embeddings override (jaisw7)
762a4b2 fix(inference): increase token timeout to 1h for long context tests (jaisw7)
81f30b9 test(shard): fix mock object to include max_position_embeddings (jaisw7)
2d68dac fix(test): increase client timeout to 3600s (jaisw7)
7f80c2d fix(shard): slice logits to last token to avoid OOM on long context (jaisw7)
5bba4e5 fix(cp): ensure only last CP rank samples to prevent race/OOM (jaisw7)
5f6fd78 fix(test): update mocks with CP rank fields to fix regression (jaisw7)
1a6bf75 fix(shard): revert CP rank check to resolve hang on small contexts (jaisw7)
e0f10b1 feat(cp): implement API multi-rank broadcast and restore CP rank check (jaisw7)
4ca9758 fix(cp): send decode tokens only to last rank to avoid empty chunks (jaisw7)
7c87c7b fix(tests): fix 3 failing tests from CP changes (jaisw7)
2fc3b25 fix: remove dnet-tui submodule and add to gitignore (jaisw7)
1a3389c fix(cp): broadcast full tokens to all ranks for ring attention (jaisw7)
c6feb9b fix(needle-in-haystack): disable thinking for tests (jaisw7)
dc63fab fix(needle-in-haystack): include instructions to use non-thinking mod… (jaisw7)
23bf72f feat(cp): implement true ring attention with 1/N memory scaling (jaisw7)
4fe4ab6 fix(cp): prevent recursion error in wrapper and injection (jaisw7)
35980fd fix: numerical stability in distributed kv-cache (jaisw7)
477fbee cleanup debug statements (jaisw7)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,109 @@ | ||
| name: CP Integration Tests | ||
| | ||
| on: | ||
|   workflow_dispatch: | ||
|     inputs: | ||
|       model_filter: | ||
|         description: 'Model filter for tests (e.g. "qwen")' | ||
|         required: false | ||
|         default: '' | ||
|   pull_request: | ||
|     paths: | ||
|       - 'src/dnet/core/cp/**' | ||
|       - 'src/dnet/shard/adapters/context_parallel.py' | ||
|       - 'src/dnet/api/strategies/context_parallel.py' | ||
|       - 'tests/integration/test_cp_*.py' | ||
| | ||
| concurrency: | ||
|   group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }} | ||
|   cancel-in-progress: true | ||
| | ||
| jobs: | ||
|   cp-integration-tests: | ||
|     runs-on: mac2.metal | ||
|     timeout-minutes: 60 | ||
|     env: | ||
|       PROJECT_ROOT: ${{ github.workspace }} | ||
|       PYTHONPATH: src | ||
|       DNET_CP_ENABLED: 'true' | ||
| | ||
|     steps: | ||
|       - name: Checkout | ||
|         uses: actions/checkout@v4 | ||
|         with: | ||
|           submodules: recursive | ||
| | ||
|       - name: Setup Environment | ||
|         uses: ./.github/actions/setup-env | ||
|         with: | ||
|           python_version: '3.12' | ||
| | ||
|       - name: Enable CP in .env | ||
|         run: | | ||
|           # Force DNET_CP_ENABLED=true in .env file (overrides default) | ||
|           # Note: macOS sed requires -i '' for in-place edit | ||
|           if grep -q "^DNET_CP_ENABLED=" .env 2>/dev/null; then | ||
|             sed -i '' 's/^DNET_CP_ENABLED=.*/DNET_CP_ENABLED=true/' .env | ||
|           else | ||
|             echo "DNET_CP_ENABLED=true" >> .env | ||
|           fi | ||
|           echo "Updated .env:" | ||
|           grep DNET_CP_ .env || echo "No DNET_CP_ settings found" | ||
| | ||
|       - name: Ensure compatible gRPC/protobuf versions | ||
|         run: | | ||
|           uv pip install --upgrade "grpcio>=1.75.1" "protobuf>=6.31.1" | ||
| | ||
|       - name: Run CP unit tests | ||
|         run: | | ||
|           uv run pytest tests/subsystems/test_cp_*.py -v --tb=short | ||
| | ||
|       - name: Kill processes on required ports | ||
|         run: | | ||
|           for port in 8080 8081 58080 58081; do | ||
|             lsof -ti:$port | xargs kill -9 2>/dev/null || true | ||
|           done | ||
|           sleep 2 | ||
| | ||
|       - name: Verify CP environment | ||
|         run: | | ||
|           echo "DNET_CP_ENABLED=${DNET_CP_ENABLED}" | ||
|           if [ "$DNET_CP_ENABLED" != "true" ]; then | ||
|             echo "::error::DNET_CP_ENABLED is not set to true" | ||
|             exit 1 | ||
|           fi | ||
| | ||
|       - name: Start shard server | ||
|         uses: ./.github/actions/start-shard | ||
|         with: | ||
|           http_port: '8081' | ||
|           grpc_port: '58081' | ||
| | ||
|       - name: Start API server | ||
|         uses: ./.github/actions/start-api | ||
|         with: | ||
|           http_port: '8080' | ||
|           grpc_port: '58080' | ||
| | ||
|       - name: Run integration tests | ||
|         run: | | ||
|           sleep 10 # Wait for servers to initialize | ||
|           echo "Running tests with DNET_CP_ENABLED=${DNET_CP_ENABLED}" | ||
|           if [ -n "${{ github.event.inputs.model_filter }}" ]; then | ||
|             uv run pytest tests/integration/test_model_catalog.py -v -x -k "${{ github.event.inputs.model_filter }}" --tb=short | ||
|           else | ||
|             uv run pytest tests/integration/test_model_catalog.py -v -x --tb=short | ||
|           fi | ||
| | ||
|       - name: Cleanup servers | ||
|         if: always() | ||
|         uses: ./.github/actions/cleanup-servers | ||
| | ||
|       - name: Show logs on failure | ||
|         if: failure() | ||
|         run: | | ||
|           echo "=== Shard logs ===" | ||
|           cat shard.log 2>/dev/null || echo "(no shard log)" | ||
|           echo "" | ||
|           echo "=== API logs ===" | ||
|           cat api.log 2>/dev/null || echo "(no API log)" | ||
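
Review note on the "Enable CP in .env" step: `sed -i ''` is the BSD/macOS form, so the step as written is tied to the macOS runner. If the same logic ever needs to run on a Linux runner too, one option is to avoid `sed -i` entirely. The following is a minimal sketch under that assumption; `set_env_key` is a hypothetical helper name, not something this PR defines, and unlike the in-place `sed` it moves an existing key to the end of the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): set KEY=VALUE in an env file,
# replacing any existing assignment of KEY or appending a new one.
# Uses only grep, printf, cat and mktemp, so it behaves the same with
# BSD (macOS) and GNU userlands.
set_env_key() {
  local key="$1" value="$2" file="${3:-.env}"
  local tmp
  tmp="$(mktemp)"
  # Copy every line except existing assignments of KEY.
  # grep exits non-zero when nothing is left to print, hence "|| true".
  if [ -f "$file" ]; then
    grep -v "^${key}=" "$file" > "$tmp" || true
  fi
  # Append the new assignment (an existing key therefore moves to the end).
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back through redirection so the target file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

In the workflow step this would be called as `set_env_key DNET_CP_ENABLED true .env`. The key is used as a regular-expression prefix, so the sketch assumes plain identifiers such as `DNET_CP_ENABLED`.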
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -47,3 +47,4 @@ repacked_models/* | ||
| # Env files | ||
| *.env* | ||
| !.env*.example | ||
| dnet-tui/ | ||