Open
44 commits
d20c724
feat: context parallelism
jaisw7 Jan 2, 2026
694b1a2
fix: CP integration test uses monkeypatch for env-independent default…
jaisw7 Jan 2, 2026
f463b6f
fix: add CP environment verification step in workflow
jaisw7 Jan 2, 2026
2f925a4
fix: explicitly set DNET_CP_ENABLED=true in .env after make init
jaisw7 Jan 2, 2026
cfd46e6
feat: implement actual gRPC ring communication, address Copilot revie…
jaisw7 Jan 2, 2026
1f317f3
feat: replace mock ring communicator with real gRPC implementation
jaisw7 Jan 2, 2026
b9152e3
fix: update test_cp_ring_comm.py to use real gRPC instead of deleted …
jaisw7 Jan 2, 2026
26c0641
feat: add ContextParallelSettings to .env.example generation
jaisw7 Jan 2, 2026
0e301b1
fix: use macOS sed syntax (sed -i '') in CI workflow
jaisw7 Jan 2, 2026
a95ac19
feat: implement strategy selection based on DNET_CP_ENABLED config
jaisw7 Jan 2, 2026
63ec20c
feat: wire up CPAdapter in shard.py based on DNET_CP_ENABLED config
jaisw7 Jan 2, 2026
c7e95b6
feat: complete implementation plan compliance
jaisw7 Jan 2, 2026
99f7e78
fix: convert cross-proto imports to relative imports in pb2 files
jaisw7 Jan 3, 2026
1dac040
fix: filter out manager nodes from shards passed to topology solver
jaisw7 Jan 3, 2026
d01ac5d
fix: robustly filter shards in CPTopologySolver
jaisw7 Jan 3, 2026
013669a
fix: implement CPAdapter execution loop to prevent deadlock
jaisw7 Jan 3, 2026
7ccbba5
refactor: Dynamic CP rank assignment and validation
jaisw7 Jan 3, 2026
0cf14a4
fix(cp): register CPRingService on existing gRPC server, add model co…
jaisw7 Jan 3, 2026
1dff741
feat(cp): add CP launch script for multi-shard model loading
jaisw7 Jan 3, 2026
fad866e
fix(cp): use DnetDeviceProperties from dnet_p2p for typed access
jaisw7 Jan 3, 2026
d3433e5
feat(api): add /v1/settings endpoint and typed CP utilities
jaisw7 Jan 3, 2026
53eeeb3
fix(scripts): add sys.path hack to resolve local imports
jaisw7 Jan 3, 2026
c33aa61
fix(api): remove incorrect validator for boolean logprobs field
jaisw7 Jan 3, 2026
448a7b9
test(inference): remove obsolete logprobs validation test
jaisw7 Jan 3, 2026
0d81794
fix(scripts): improve error reporting in stress test
jaisw7 Jan 3, 2026
72458d4
feat(cp): plumbing for max_position_embeddings override
jaisw7 Jan 3, 2026
762a4b2
fix(inference): increase token timeout to 1h for long context tests
jaisw7 Jan 3, 2026
81f30b9
test(shard): fix mock object to include max_position_embeddings
jaisw7 Jan 3, 2026
2d68dac
fix(test): increase client timeout to 3600s
jaisw7 Jan 3, 2026
7f80c2d
fix(shard): slice logits to last token to avoid OOM on long context
jaisw7 Jan 3, 2026
5bba4e5
fix(cp): ensure only last CP rank samples to prevent race/OOM
jaisw7 Jan 3, 2026
5f6fd78
fix(test): update mocks with CP rank fields to fix regression
jaisw7 Jan 3, 2026
1a6bf75
fix(shard): revert CP rank check to resolve hang on small contexts
jaisw7 Jan 3, 2026
e0f10b1
feat(cp): implement API multi-rank broadcast and restore CP rank check
jaisw7 Jan 3, 2026
4ca9758
fix(cp): send decode tokens only to last rank to avoid empty chunks
jaisw7 Jan 3, 2026
7c87c7b
fix(tests): fix 3 failing tests from CP changes
jaisw7 Jan 3, 2026
2fc3b25
fix: remove dnet-tui submodule and add to gitignore
jaisw7 Jan 3, 2026
1a3389c
fix(cp): broadcast full tokens to all ranks for ring attention
jaisw7 Jan 3, 2026
c6feb9b
fix(needle-in-haystack): disable thinking for tests
jaisw7 Jan 3, 2026
dc63fab
fix(needle-in-haystack): include instructions to use non-thinking mod…
jaisw7 Jan 3, 2026
23bf72f
feat(cp): implement true ring attention with 1/N memory scaling
jaisw7 Jan 3, 2026
4fe4ab6
fix(cp): prevent recursion error in wrapper and injection
jaisw7 Jan 3, 2026
35980fd
fix: numerical stability in distributed kv-cache
jaisw7 Jan 5, 2026
477fbee
cleanup debug statements
jaisw7 Jan 6, 2026
12 changes: 12 additions & 0 deletions .env.example
@@ -82,6 +82,18 @@ DNET_KV_GROUP_SIZE=64
# KV cache TTL in seconds
DNET_KV_TTL_S=30.0

# === Context Parallelism ===
# Enable context parallelism mode
DNET_CP_ENABLED=false
# Ring attention algorithm (auto, pass_kv, pass_q, ring_reduce)
DNET_CP_ALGORITHM=auto
# Minimum context length to enable CP (below this, single-device)
DNET_CP_MIN_CONTEXT_FOR_CP=32768
# Minimum new tokens to prefer pass_kv over pass_q
DNET_CP_MIN_TOKENS_FOR_PASS_KV=256
# Overlap between chunks for sliding window attention
DNET_CP_CHUNK_OVERLAP=0

# === gRPC ===
# Max gRPC message length
DNET_GRPC_MAX_MESSAGE_LENGTH=67108864
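
The comments on the new `DNET_CP_*` keys imply a small decision procedure: stay on a single device below the context threshold, honor an explicit algorithm, and otherwise choose between `pass_kv` and `pass_q` by the number of new tokens. The sketch below is one possible reading of those comments, not the PR's `ContextParallelSettings` implementation. The class `CPSettings`, the helpers `load_cp_settings` and `choose_algorithm`, the `"single_device"` label, and the assumption that `auto` never selects `ring_reduce` are all hypothetical.

```python
import os
from dataclasses import dataclass

# Algorithm names taken from the .env.example comment.
ALGORITHMS = {"auto", "pass_kv", "pass_q", "ring_reduce"}


@dataclass(frozen=True)
class CPSettings:
    """Hypothetical mirror of the DNET_CP_* keys, with the defaults shown above."""
    enabled: bool = False
    algorithm: str = "auto"
    min_context_for_cp: int = 32768
    min_tokens_for_pass_kv: int = 256
    chunk_overlap: int = 0


def load_cp_settings(env=None) -> CPSettings:
    """Build settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    algorithm = env.get("DNET_CP_ALGORITHM", "auto").strip().lower()
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown DNET_CP_ALGORITHM: {algorithm!r}")
    return CPSettings(
        enabled=env.get("DNET_CP_ENABLED", "false").strip().lower() == "true",
        algorithm=algorithm,
        min_context_for_cp=int(env.get("DNET_CP_MIN_CONTEXT_FOR_CP", "32768")),
        min_tokens_for_pass_kv=int(env.get("DNET_CP_MIN_TOKENS_FOR_PASS_KV", "256")),
        chunk_overlap=int(env.get("DNET_CP_CHUNK_OVERLAP", "0")),
    )


def choose_algorithm(settings: CPSettings, context_len: int, new_tokens: int) -> str:
    """Pick an execution mode for one request (assumed semantics, see lead-in)."""
    # CP disabled or context too short: run on a single device.
    if not settings.enabled or context_len < settings.min_context_for_cp:
        return "single_device"
    # An explicit algorithm is used as configured.
    if settings.algorithm != "auto":
        return settings.algorithm
    # "auto": prefer pass_kv once enough new tokens arrive, else pass_q.
    if new_tokens >= settings.min_tokens_for_pass_kv:
        return "pass_kv"
    return "pass_q"
```

Under these assumptions, a 40,000-token request with 512 new tokens and `DNET_CP_ENABLED=true` would resolve to `pass_kv`, while the same request with a single new token (a decode step) would resolve to `pass_q`.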
109 changes: 109 additions & 0 deletions .github/workflows/cp-integration-tests.yml
@@ -0,0 +1,109 @@
name: CP Integration Tests

on:
  workflow_dispatch:
    inputs:
      model_filter:
        description: 'Model filter for tests (e.g. "qwen")'
        required: false
        default: ''
  pull_request:
    paths:
      - 'src/dnet/core/cp/**'
      - 'src/dnet/shard/adapters/context_parallel.py'
      - 'src/dnet/api/strategies/context_parallel.py'
      - 'tests/integration/test_cp_*.py'

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:
  cp-integration-tests:
    runs-on: mac2.metal
    timeout-minutes: 60
    env:
      PROJECT_ROOT: ${{ github.workspace }}
      PYTHONPATH: src
      DNET_CP_ENABLED: 'true'

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          submodules: recursive

      - name: Setup Environment
        uses: ./.github/actions/setup-env
        with:
          python_version: '3.12'

      - name: Enable CP in .env
        run: |
          # Force DNET_CP_ENABLED=true in .env file (overrides default)
          # Note: macOS sed requires -i '' for in-place edit
          if grep -q "^DNET_CP_ENABLED=" .env 2>/dev/null; then
            sed -i '' 's/^DNET_CP_ENABLED=.*/DNET_CP_ENABLED=true/' .env
          else
            echo "DNET_CP_ENABLED=true" >> .env
          fi
          echo "Updated .env:"
          grep DNET_CP_ .env || echo "No DNET_CP_ settings found"

      - name: Ensure compatible gRPC/protobuf versions
        run: |
          uv pip install --upgrade "grpcio>=1.75.1" "protobuf>=6.31.1"

      - name: Run CP unit tests
        run: |
          uv run pytest tests/subsystems/test_cp_*.py -v --tb=short

      - name: Kill processes on required ports
        run: |
          for port in 8080 8081 58080 58081; do
            lsof -ti:$port | xargs kill -9 2>/dev/null || true
          done
          sleep 2

      - name: Verify CP environment
        run: |
          echo "DNET_CP_ENABLED=${DNET_CP_ENABLED}"
          if [ "$DNET_CP_ENABLED" != "true" ]; then
            echo "::error::DNET_CP_ENABLED is not set to true"
            exit 1
          fi

      - name: Start shard server
        uses: ./.github/actions/start-shard
        with:
          http_port: '8081'
          grpc_port: '58081'

      - name: Start API server
        uses: ./.github/actions/start-api
        with:
          http_port: '8080'
          grpc_port: '58080'

      - name: Run integration tests
        run: |
          sleep 10  # Wait for servers to initialize
          echo "Running tests with DNET_CP_ENABLED=${DNET_CP_ENABLED}"
          if [ -n "${{ github.event.inputs.model_filter }}" ]; then
            uv run pytest tests/integration/test_model_catalog.py -v -x -k "${{ github.event.inputs.model_filter }}" --tb=short
          else
            uv run pytest tests/integration/test_model_catalog.py -v -x --tb=short
          fi

      - name: Cleanup servers
        if: always()
        uses: ./.github/actions/cleanup-servers

      - name: Show logs on failure
        if: failure()
        run: |
          echo "=== Shard logs ==="
          cat shard.log 2>/dev/null || echo "(no shard log)"
          echo ""
          echo "=== API logs ==="
          cat api.log 2>/dev/null || echo "(no API log)"
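
The "Run integration tests" step waits a fixed `sleep 10` for the shard and API servers to come up. A possible alternative, sketched here under assumptions and not part of this PR, is to poll the four ports used by the workflow until they accept TCP connections. The helper name `wait_for_ports` and its parameters are hypothetical, and an open port only shows that a process is listening, not that a model has finished loading.

```python
import socket
import time


def wait_for_ports(ports, host="127.0.0.1", timeout_s=60.0, interval_s=0.5):
    """Poll until every port accepts a TCP connection or the timeout expires.

    Returns the list of ports that were still closed at the deadline
    (an empty list means all ports became reachable).
    """
    deadline = time.monotonic() + timeout_s
    pending = list(ports)
    while pending:
        still_closed = []
        for port in pending:
            try:
                # A successful connect means something is listening on the port.
                with socket.create_connection((host, port), timeout=interval_s):
                    pass
            except OSError:
                still_closed.append(port)
        pending = still_closed
        if not pending:
            break
        if time.monotonic() >= deadline:
            return pending
        time.sleep(interval_s)
    return []


if __name__ == "__main__":
    # Ports taken from the workflow above: API and shard HTTP/gRPC.
    closed = wait_for_ports([8080, 8081, 58080, 58081])
    if closed:
        raise SystemExit(f"ports not ready: {closed}")
```

Run as a script before pytest, this would exit non-zero with the list of unreachable ports, so a server that never started fails fast instead of surfacing later as a test timeout.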
1 change: 1 addition & 0 deletions .gitignore
@@ -47,3 +47,4 @@ repacked_models/*
# Env files
*.env*
!.env*.example
dnet-tui/