Closed
Component
Harnesses
Describe the bug
In v0.4.0 of GuideLLM, `--rate` can take a list of values (see example file). The output of `guidellm benchmark` looks like this:
```json
{
  "args": {
    "target": "http://infra-llmdbench-inference-gateway.jchen.svc.cluster.local:80/qwen-qwen3-0-6b",
    "data": [
      "{'prefix_tokens': 2048, 'prefix_count': 32, 'prompt_tokens': 256, 'output_tokens': 256}"
    ],
    "profile": "constant",
    "rate": [
      2,
      5,
      8,
      10,
      12,
      15,
      20
    ],
    "backend": "openai_http",
    "backend_kwargs": null,
    "model": "Qwen/Qwen3-0.6B",
    "processor": null,
    "processor_args": null,
    "data_args": [],
    "data_samples": -1,
    "data_column_mapper": "generative_column_mapper",
    "data_request_formatter": "chat_completions",
    "data_collator": "generative",
    "data_sampler": null,
    "data_num_workers": null,
    "dataloader_kwargs": null,
    "random_seed": 42,
    "output_path": "/requests/guidellm_1761780814-setup_inf_sche_kv_yaml-run_prompt_tokens_100REPLACE_COMMAoutput_tokens_1000REPLACE_COMMAprefix_tokens_2048_llm-d-6b-base/results.json",
    "output_formats": [
      "console",
      "json"
    ],
    "sample_requests": 10,
    "warmup": null,
    "cooldown": null,
    "prefer_response_metrics": true,
    "max_seconds": 50,
    "max_requests": null,
    "max_errors": null,
    "max_error_rate": null,
    "max_global_error_rate": null
  },
  "benchmarks": [...]
}
```

where `.benchmarks` is a list of performance results for each "stage"/rate. However, the harness run currently produces only one benchmark report. We should parse this JSON and produce one report file per benchmark.
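A minimal sketch of the splitting step, assuming only the top-level layout shown above (an `args` object and a `benchmarks` list with one entry per rate/stage). The per-report file-naming convention and the shape of each split file are hypothetical, not part of GuideLLM:

```python
import json
from pathlib import Path


def split_benchmark_reports(results_path, out_dir):
    """Split a GuideLLM results.json into one report file per benchmark.

    Assumes the top-level structure shown in the issue: "args" plus a
    "benchmarks" list, one entry per rate/stage. The output naming
    (results_rate_<rate>.json) is a hypothetical convention.
    """
    report = json.loads(Path(results_path).read_text())
    args = report["args"]
    rates = args.get("rate") or []
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for i, benchmark in enumerate(report["benchmarks"]):
        # Label each report with its rate when the two lists line up;
        # otherwise fall back to the stage index.
        label = f"rate_{rates[i]}" if i < len(rates) else f"stage_{i}"
        out_path = out_dir / f"results_{label}.json"
        out_path.write_text(
            json.dumps({"args": args, "benchmark": benchmark}, indent=2)
        )
        written.append(out_path)
    return written
```

Each split file keeps the shared `args` alongside its single benchmark entry, so downstream mapping to the universal benchmark report can treat every file uniformly.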
Steps to reproduce
Based on #478.
Additional context or screenshots
Where can we find the schema of the GuideLLM output so we can map it to the universal benchmark report?