chore(qwen3): docs/dead-code cleanup — stale records, deleted-tool references, unused consts #304

Draft
odysa wants to merge 2 commits into xiaguan:main from odysa:chore/qwen3-docs-dead-code-cleanup

@odysa commented Jun 8, 2026

Closes #248.

  • Remove dead code: HIDDEN_SIZE/INTERMEDIATE_SIZE consts, probe_model(), and the now-orphaned ModelInfo
  • Collapse model-crate.md + kernels-crate.md into a slim crate-layout.md; rewrite tp-design.md around the implemented TP runtime; lift the KV admission lessons from #85 (bug: QPS=2 vLLM bench can hang Qwen3-4B serving under KV cache pressure) into lessons/kv-full-lifetime-reservation.md; update index.md and surviving doc cross-references
  • No behavior change. New docs carry no shell commands; the one corrected command (submodule init) was run and verified

Draft pending a CUDA-host clippy/test run (pegainfer-qwen3-4b can't compile locally without nvcc). CPU-reachable crates (pegainfer-engine, pegainfer-vllm-frontend) check clean.

odysa added 2 commits June 7, 2026 19:15
…ferences, unused consts

Closes xiaguan#248.

Dead code:
- drop unused pub consts HIDDEN_SIZE/INTERMEDIATE_SIZE from batch_decode_trace.rs
- drop uncalled probe_model() from pegainfer-qwen3-4b and the now-orphaned
  ModelInfo from pegainfer-engine (server inlines its own detection; qwen35's
  pair was removed in xiaguan#258)

Docs:
- collapse model-crate.md + kernels-crate.md (obsolete crates/ layout, deleted
  qwen3_kernel_snapshot bench) into a slim crate-layout.md describing the crate
  that exists; load-bearing split-K facts and CUPTI/bench gotchas kept
- rewrite tp-design.md around the implemented controller/worker runtime and
  promote the 3 real open items (TP correctness coverage, vocab-parallel
  embedding/lm_head, TP CUDA-graph)
- lift the issue-xiaguan#85 KV admission lessons into
  lessons/kv-full-lifetime-reservation.md and delete kv-pressure-hang.md
- update index.md rows and the qwen3/qwen35 roadmap cleanup ledgers
…es/ paths

Follow-up to the xiaguan#248 deletions — surviving docs no longer point at
deleted files or pre-workspace-refactor paths:

- kernel-op-reports.md: model-crate.md/kernels-crate.md references now
  route to crate-layout.md (with at-the-time naming kept for history),
  crates/ prefixes dropped, and the stale qwen3_kernel_snapshot check
  line rewritten to point at the Step 5 report-bin commands
- deepseek-v4/kernel-paths.md, qwen35/model-crate.md: same rerouting
- deepseek-v4/pplx-ep-integration.md: pplx wrapper path corrected to
  pegainfer-comm/crates/pegainfer-comm-p2p-all-to-all/
- the corrected submodule init command was run and verified locally

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request performs a documentation and dead-code cleanup for the Qwen3-4B model. It consolidates Qwen3-4B crate documentation into a single layout file, updates the tensor parallelism design document to reflect the implemented state, and extracts KV pressure lessons into a dedicated lesson file. Additionally, it removes unused dead code, including the probe_model function, the ModelInfo struct, and unused constants in the Qwen3-4B crate. There are no review comments, so no further feedback is provided.
