[Feat.] Refactor llm_inference/run.py to use ParallelInferenceManager with batch inference by yl231 · Pull Request #59 · RouteWorks/RouterArena

yl231 · 2026-01-16T01:46:40Z

Replace sequential processing with parallel batch inference infrastructure
Maintain compatibility with origin/main: same CLI interface and functionality
Add --num-runs parameter support (default: 1) to match batch_inference.py
Use ParallelInferenceManager for efficient parallel processing with workers
Add README_PARALLEL.md documentation

- Replace sequential processing with parallel batch inference infrastructure
- Maintain compatibility with origin/main: same CLI interface and functionality
- Add --num-runs parameter support (default: 1) to match batch_inference.py
- Use ParallelInferenceManager for efficient parallel processing with workers
- Fix mypy type annotations: Tuple return types and Optional parameters
- Add README_PARALLEL.md documentation
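The move from sequential to parallel processing described above can be illustrated with a minimal sketch. This is not the PR's ParallelInferenceManager: `build_parser`, `run_parallel`, the `--num-workers` flag, and the `infer` callable are hypothetical names chosen for illustration, and only the `--num-runs` flag with its default of 1 comes from the description.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def build_parser() -> argparse.ArgumentParser:
    """CLI sketch: --num-runs defaults to 1, as stated in the PR description."""
    parser = argparse.ArgumentParser(description="Parallel inference sketch")
    parser.add_argument("--num-runs", type=int, default=1)
    # Hypothetical flag, not confirmed by the PR text.
    parser.add_argument("--num-workers", type=int, default=4)
    return parser


def run_parallel(
    prompts: List[str],
    infer: Callable[[str], str],
    num_workers: int,
    num_runs: int,
) -> Dict[Tuple[int, int], str]:
    """Run `infer` on every prompt, `num_runs` times, using a thread pool."""
    # One task per (run index, prompt index) pair.
    tasks = [
        (run, idx, prompt)
        for run in range(num_runs)
        for idx, prompt in enumerate(prompts)
    ]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map returns results in the same order as the submitted tasks.
        outputs = list(pool.map(lambda task: infer(task[2]), tasks))
    return {(run, idx): out for (run, idx, _), out in zip(tasks, outputs)}


if __name__ == "__main__":
    args = build_parser().parse_args()
    results = run_parallel(["hello", "world"], str.upper, args.num_workers, args.num_runs)
    print(results)
```

A thread pool is a reasonable fit when each call mostly waits on a remote API; the actual manager may organize its workers differently.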

yl231 · 2026-01-16T01:47:27Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a significant and well-executed refactoring of the LLM inference process, replacing the previous sequential logic with a robust parallel batch inference system. The new ParallelInferenceManager and batch_inference.py script are excellent additions that will greatly improve performance and maintainability. The changes are well-structured, and the new documentation is very helpful. My review includes suggestions to enhance cross-platform compatibility, simplify some of the concurrency logic for better clarity, and improve error handling and efficiency in a few areas.

- Replace sequential processing with parallel batch inference infrastructure
- Maintain compatibility with origin/main: same CLI interface and functionality
- Add --num-runs parameter support (default: 1) to match batch_inference.py
- Use ParallelInferenceManager for efficient parallel processing with workers
- Fix mypy type annotations: Tuple return types and Optional parameters
- Add README_PARALLEL.md documentation

Code quality improvements:
- Replace fcntl with cross-platform filelock for Windows compatibility
- Use threading.local() to cache ModelInference instances per thread
- Remove redundant periodic cache consolidation (save_single_result handles it)
- Remove os.chdir() in batch_inference.py, use absolute paths instead
- Add logging for exceptions when converting model names (no silent failures)
- Document clear_failed_entries as utility function (not auto-called)
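The per-thread caching mentioned in the commit message can be sketched as follows. The `ModelInference` class here is a stand-in written for this example (the real class in the repository is not shown in this conversation), and `get_model` and `infer_cached` are hypothetical helper names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ModelInference:
    """Stand-in for the repository's class; counts how many instances exist."""

    instances_created = 0
    _count_lock = threading.Lock()

    def __init__(self) -> None:
        with ModelInference._count_lock:
            ModelInference.instances_created += 1

    def infer(self, prompt: str) -> str:
        return prompt.upper()


# Each thread sees its own attributes on this object.
_thread_state = threading.local()


def get_model() -> ModelInference:
    """Return this thread's cached instance, creating it on first use."""
    model = getattr(_thread_state, "model", None)
    if model is None:
        model = ModelInference()
        _thread_state.model = model
    return model


def infer_cached(prompt: str) -> str:
    return get_model().infer(prompt)


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=3) as pool:
        print(list(pool.map(infer_cached, ["a", "b", "c", "d"])))
    # At most one instance per worker thread is created.
    print(ModelInference.instances_created)
```

With this pattern the number of instances is bounded by the number of worker threads rather than by the number of requests, and no instance is shared across threads.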

- Call clear_failed_entries automatically before processing each model
- Clear failed entries before processing so they can be retried
- Update function documentation to reflect automatic usage
- Log number of failed entries cleared for visibility
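A rough sketch of the clear-then-retry idea in this commit message follows. The cache layout (a dict keyed by entry ID, with a boolean `success` field) and the function signature are assumptions made for illustration; the conversation does not show how the repository stores results or marks failures.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


def clear_failed_entries(cache: Dict[str, Dict[str, Any]]) -> int:
    """Remove entries not marked successful so they are retried on the next run.

    Assumption: an entry counts as failed when its "success" field is missing
    or falsy. The real criterion in the repository may differ.
    """
    failed_keys = [key for key, entry in cache.items() if not entry.get("success", False)]
    for key in failed_keys:
        del cache[key]
    # Log the count for visibility, as the commit message describes.
    logger.info("Cleared %d failed entries", len(failed_keys))
    return len(failed_keys)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    demo = {"q1": {"success": True}, "q2": {"success": False}}
    clear_failed_entries(demo)
    print(sorted(demo))
```

Calling such a function before processing each model means a previously failed query is treated as uncached and gets re-issued.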

jiarong0907

LGTM. One comment: you can simply name it README, since it lives in a different folder.

yl231 added 2 commits January 15, 2026 19:41

fix format

b1418ca

gemini-code-assist Bot reviewed Jan 16, 2026

yl231 added 4 commits January 15, 2026 20:34

Format code: fix line length and whitespace

c98f592

fix format

854dcea

yl231 requested a review from jiarong0907 January 16, 2026 02:44

jiarong0907 approved these changes Jan 16, 2026

renamed readme

f4440b2

yl231 merged commit a94b93f into main Jan 16, 2026
10 checks passed

yl231 deleted the feature/run-py-batch-inference-compatibility branch January 16, 2026 19:24

yl231 merged 7 commits into main from feature/run-py-batch-inference-compatibility

2 participants
