anyscale/custom-router-api-benchmarks

Ray Serve Custom Router API Benchmarks

A benchmarking suite for comparing request router implementations in Ray's LLM serving system. It compares Power-of-2 routers against Prefix-Aware routers across a range of replica counts.
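For readers new to the two strategies, the following is a minimal sketch of the ideas behind them, not the Ray Serve implementation. The function names, the character-level prefix match, and the `min_match` threshold are all assumptions made for illustration. Power-of-2 samples two replicas and picks the less loaded one; the prefix-aware variant prefers a replica that has already seen a prompt sharing a long prefix (and so may have it cached), falling back to Power-of-2 otherwise.

```python
import random


def pick_power_of_two(queue_lengths, rng=random):
    """Sample two distinct replicas and return the index of the less loaded one."""
    if not queue_lengths:
        raise ValueError("need at least one replica")
    if len(queue_lengths) == 1:
        return 0
    a, b = rng.sample(range(len(queue_lengths)), 2)
    return a if queue_lengths[a] <= queue_lengths[b] else b


def common_prefix_len(a, b):
    """Number of leading characters shared by strings a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pick_prefix_aware(prompt, seen_by_replica, queue_lengths, min_match=16, rng=random):
    """Prefer the replica whose earlier prompts share the longest prefix.

    seen_by_replica[i] is a list of prompts previously routed to replica i
    (a hypothetical stand-in for a real prefix cache index). If no replica
    matches at least min_match characters, fall back to Power-of-2.
    """
    best_replica, best_len = None, 0
    for idx, seen in enumerate(seen_by_replica):
        for old in seen:
            n = common_prefix_len(prompt, old)
            if n > best_len:
                best_replica, best_len = idx, n
    if best_replica is not None and best_len >= min_match:
        return best_replica
    return pick_power_of_two(queue_lengths, rng)
```

The trade-off the benchmark explores follows from this shape: prefix affinity can improve cache reuse, while Power-of-2 only balances load, so the two can behave differently as the replica count grows.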

Project Structure

custom-router-api-benchmarks/
├── scripts/
│   ├── engine_metrics.py      # Metrics collection and parsing from Ray /metrics endpoint
│   ├── sweep_replicas.py      # Main replica scaling benchmark script
│   └── visualize_replica_sweep.py  # Visualization and analysis of results
└── results/
    ├── replica_sweep/        # Current benchmark results (JSON + raw metrics)
    └── visualizations/       # Generated plots and charts
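As a rough illustration of what parsing a `/metrics` endpoint involves, here is a small sketch that reads Prometheus text-format output into `(name, labels, value)` tuples. It is not the code in `engine_metrics.py`; the function name `parse_metrics` and the metric names in the usage note are hypothetical, and the regular expressions cover only the common `name{label="value"} number` line shape.

```python
import re

# name, optional {labels}, value, optional timestamp
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$')
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text-format lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore lines this simplified pattern does not cover
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

For example, feeding it the line `example_requests_total{replica="r1"} 12` (a made-up metric) yields one tuple with the name, a `{"replica": "r1"}` label dict, and the value as a float.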

Usage

Run Benchmark

cd scripts/
python sweep_replicas.py

Generate Visualizations

cd scripts/
python visualize_replica_sweep.py

Requirements

  • Ray cluster, see k8s install steps here
  • Docker image: rayproject/ray-llm:nightly-py311-cu128
  • Ray nightly wheel

Set the following environment variables in an Anyscale Service runtime_env for optimal performance:

  • ANYSCALE_RAY_SERVE_THROUGHPUT_OPT=1
  • RAYLLM_ROUTER_TO_MODEL_REPLICA_RATIO=8
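One way to express these settings is as a Ray `runtime_env` dictionary with an `env_vars` entry, sketched below. The helper name `build_runtime_env` is hypothetical, and how the resulting dictionary is attached to an Anyscale Service config is not covered here; the only assumption about Ray itself is that `env_vars` maps string names to string values.

```python
def build_runtime_env(env_vars):
    """Return a runtime_env dict, converting every value to a string."""
    return {"env_vars": {name: str(value) for name, value in env_vars.items()}}


# The two variables listed above, with the values from this README.
runtime_env = build_runtime_env(
    {
        "ANYSCALE_RAY_SERVE_THROUGHPUT_OPT": 1,
        "RAYLLM_ROUTER_TO_MODEL_REPLICA_RATIO": 8,
    }
)
```

Converting values to strings up front avoids type errors, since environment variables are always strings by the time a process reads them.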
