ROCm docs style IA and style guide updates #50

Draft
mattwill-amd wants to merge 2 commits into main from rocm-docs-review

mattwill-amd commented Jun 23, 2026

Edited and restructured docs according to our Style Guide and Diataxis information architecture guidance.

Comment thread docs/how-to/benchmarking/benchmark.md
sinarafati-amd previously approved these changes Jun 23, 2026

sinarafati-amd left a comment

Collaborator

Overall looks good. Left a few comments to be addressed before merging.

Comment thread docs/index.rst Outdated
Comment thread docs/what-is-magpie.rst Outdated
Comment thread docs/how-to/benchmarking/benchmark.md
@mattwill-amd mattwill-amd marked this pull request as draft June 29, 2026 12:30
Comment on lines +41 to +43
# Standalone gap analysis on existing traces
python -m Magpie benchmark gap-analysis --trace-dir results/benchmark_vllm_<timestamp>/

According to Cursor, there is no `gap-analysis` subcommand. `Magpie/main.py` implements standalone gap analysis via `--trace-dir` on `benchmark`.

Suggested fix:

python -m Magpie benchmark --trace-dir results/benchmark_vllm_<timestamp>/
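
The difference between the two invocations can be illustrated with a minimal `argparse` sketch. This is a hypothetical stand-in, not the contents of `Magpie/main.py`: it assumes the CLI is built with `argparse` and that `--trace-dir` is an option on `benchmark`, as the comment above describes. The helper names `build_parser` and `accepts` are invented for illustration.

```python
import argparse
import contextlib
import io


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser layout; an assumption, not Magpie's real code."""
    parser = argparse.ArgumentParser(prog="Magpie")
    modes = parser.add_subparsers(dest="mode", required=True)
    benchmark = modes.add_parser("benchmark")
    # Standalone gap analysis is modelled as a flag on `benchmark`,
    # not as a nested `gap-analysis` subcommand.
    benchmark.add_argument("--trace-dir", default=None)
    return parser


def accepts(parser: argparse.ArgumentParser, argv: list) -> bool:
    """Return True if argv parses cleanly, False if argparse rejects it."""
    try:
        # argparse reports errors on stderr and then raises SystemExit.
        with contextlib.redirect_stderr(io.StringIO()):
            parser.parse_args(argv)
        return True
    except SystemExit:
        return False
```

Under this layout, `accepts(build_parser(), ["benchmark", "--trace-dir", "results/run/"])` succeeds, while adding `gap-analysis` after `benchmark` is rejected as an unrecognized argument. Running the real CLI with `--help` is the reliable way to confirm which form it supports.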


Magpie's benchmark mode runs end-to-end performance tests against LLM inference frameworks—vLLM, SGLang, and Atom—and collects throughput and latency metrics in a structured JSON report. Benchmarks can run inside a Docker container, directly on the host, or on a remote Ray cluster, and optionally capture torch profiler traces for deeper analysis with TraceLens and gap analysis. Use this mode to measure inference performance on AMD Instinct™ GPUs and identify the GPU kernels that dominate runtime.

Review these topics for more information:

Suggested change (original line, then replacement):
Review these topics for more information:
For more information, see the following topics:


## Benchmark report

The primary summary file is **`benchmark_report.json`** in the run workspace (see `WorkspaceManager.save_report`). It aggregates throughput, latency, and optional `gap_analysis` / `tracelens_analysis` sections. A typical shape (abbreviated, with `...` marking elided values):

It is not clear what "see `WorkspaceManager.save_report`" is directing the user to.

}
```
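
For readers of the report section, a short sketch of consuming the file may be more useful than a pointer to internal code. The example below assumes that `gap_analysis` and `tracelens_analysis` appear as top-level keys when present. The quoted text names them as optional sections but does not confirm their nesting, and the helper names are hypothetical.

```python
import json
from pathlib import Path

# Optional sections named in the documentation text; their position at the
# top level of the JSON object is an assumption.
OPTIONAL_SECTIONS = ("gap_analysis", "tracelens_analysis")


def load_report(run_dir):
    """Read benchmark_report.json from a run workspace directory."""
    report_path = Path(run_dir) / "benchmark_report.json"
    return json.loads(report_path.read_text(encoding="utf-8"))


def optional_sections_present(report):
    """Map each optional section name to whether the report contains it."""
    return {name: name in report for name in OPTIONAL_SECTIONS}
```

A caller would pass the run workspace path to `load_report` and then check `optional_sections_present(report)` before reading profiling results, since those sections exist only when profiling was enabled.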

## More info

Not sure why we need both More Info and Related Sources.

- [Automatic GPU selection in Magpie's benchmark mode](automatic-gpu.md) — how Magpie picks idle GPUs before launching and how to override or disable selection
- [Persistent server reuse (local) in Magpie's benchmark mode](persistent-server-reuse.md) — keep a server alive across runs to avoid model reload overhead
- [Profiling options in Magpie's benchmark mode](profiling-options.md) — configure torch profiler, TraceLens, and gap analysis
- [Analyze and compare kernels with Magpie](../analyze-compare.md) — kernel evaluation modes (orthogonal to Benchmark)

Suggested change (original line, then replacement):
- [Analyze and compare kernels with Magpie](../analyze-compare.md) — kernel evaluation modes (orthogonal to Benchmark)
- [Analyze and compare kernels with Magpie](../analyze-compare.md) — kernel evaluation modes independent of benchmark mode

Comment on lines +167 to +172
python -m Magpie benchmark gap-analysis \
--trace-dir results/benchmark_vllm_<timestamp>/torch_trace \
--start-pct 50 --end-pct 80 \
--top-k 15 \
--categories kernel gpu \
--ignore-categories gpu_user_annotation

Is there a gap-analysis sub-command?


# Magpie troubleshooting

This topic covers errors and debugging techniques. Each section presents symptoms and their solutions in a table so you can quickly find the issue you're seeing. For benchmark configuration problems not listed here, enable verbose logging with `--log-level DEBUG` and check the output before filing a bug report.

`api-reference.md` line 29 mentions `--verbose` / `-v` for debug logging, while the later Config settings section discusses logging at line 162 without listing any levels or options. From the API reference, it is not clear what the correct approach is for enabling detailed logging when debugging.
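
One way the docs could resolve this is to state a single precedence rule for the two flags. The sketch below is purely illustrative and does not describe Magpie's actual behavior: it assumes both `--verbose` / `-v` and `--log-level` exist on the same parser, that an explicit `--log-level` wins, and that the default level is `INFO`. The function name `resolve_log_level` is hypothetical.

```python
import argparse
import logging


def resolve_log_level(argv):
    """Hypothetical rule: derive one logging level from two overlapping flags."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument(
        "--log-level",
        default=None,
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
    )
    args = parser.parse_args(argv)
    # Assumed precedence: an explicit --log-level overrides -v.
    if args.log_level is not None:
        return getattr(logging, args.log_level)
    return logging.DEBUG if args.verbose else logging.INFO
```

Whatever rule the implementation follows, documenting it once in the API reference and linking to it from the troubleshooting topic would remove the ambiguity.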

3 participants