[Feat.] Refactor llm_inference/run.py to use ParallelInferenceManager with batch inference #59

Merged
yl231 merged 7 commits into main from feature/run-py-batch-inference-compatibility on Jan 16, 2026

yl231 commented Jan 16, 2026

Contributor
  • Replace sequential processing with parallel batch inference infrastructure
  • Maintain compatibility with origin/main: same CLI interface and functionality
  • Add --num-runs parameter support (default: 1) to match batch_inference.py
  • Use ParallelInferenceManager for efficient parallel processing with workers
  • Add README_PARALLEL.md documentation
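
As a rough illustration of the change described above, the sketch below fans prompts out across worker threads and repeats each one `--num-runs` times. It is not the PR's ParallelInferenceManager: `run_inference`, `run_batch`, `parse_args`, and the `--workers` flag are hypothetical names, and the model call is a stub. Only the `--num-runs` flag and its default of 1 come from the description.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def run_inference(prompt: str, run_idx: int) -> str:
    """Hypothetical stand-in for a real model call."""
    return f"{prompt}::run{run_idx}"


def run_batch(
    prompts: List[str], num_runs: int = 1, workers: int = 4
) -> Dict[Tuple[str, int], str]:
    """Run every (prompt, run index) pair on a thread pool and collect results."""
    tasks = [(p, r) for p in prompts for r in range(num_runs)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in task order, so zip pairs each task
        # with its own output even though the calls run concurrently.
        outputs = pool.map(lambda task: run_inference(*task), tasks)
        return dict(zip(tasks, outputs))


def parse_args(argv: List[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    # Flag name and default taken from the PR description.
    parser.add_argument("--num-runs", type=int, default=1)
    # Hypothetical flag; the real CLI may configure workers differently.
    parser.add_argument("--workers", type=int, default=4)
    return parser.parse_args(argv)
```

Threads suit this kind of workload when the per-prompt work is dominated by waiting on a remote API; whether the PR uses threads or processes internally is not stated in the description.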

yl231 added 2 commits January 15, 2026 19:41
- Replace sequential processing with parallel batch inference infrastructure
- Maintain compatibility with origin/main: same CLI interface and functionality
- Add --num-runs parameter support (default: 1) to match batch_inference.py
- Use ParallelInferenceManager for efficient parallel processing with workers
- Fix mypy type annotations: Tuple return types and Optional parameters
- Add README_PARALLEL.md documentation
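
The mypy item above refers to two common annotation fixes. A minimal sketch, assuming a hypothetical helper `split_model_name` that is not part of the PR: a parameter that defaults to `None` is declared `Optional[...]`, and a function returning a pair declares a `Tuple[...]` return type.

```python
from typing import Optional, Tuple


def split_model_name(
    name: str, default_provider: Optional[str] = None
) -> Tuple[str, str]:
    """Hypothetical helper: split 'provider/model' into (provider, model)."""
    if "/" in name:
        provider, model = name.split("/", 1)
        return provider, model
    # default_provider may be None, so it is annotated Optional[str];
    # recent mypy versions reject `default_provider: str = None` by default.
    return (default_provider or "unknown", name)
```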
yl231 commented Jan 16, 2026

Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a significant and well-executed refactoring of the LLM inference process, replacing the previous sequential logic with a robust parallel batch inference system. The new ParallelInferenceManager and batch_inference.py script are excellent additions that will greatly improve performance and maintainability. The changes are well-structured, and the new documentation is very helpful. My review includes suggestions to enhance cross-platform compatibility, simplify some of the concurrency logic for better clarity, and improve error handling and efficiency in a few areas.

Comment thread llm_inference/parallel_inference.py Outdated
Comment thread llm_inference/README_PARALLEL.md Outdated
Comment thread llm_inference/batch_inference.py Outdated
Comment thread llm_inference/parallel_inference.py
Comment thread llm_inference/parallel_inference.py Outdated
Comment thread llm_inference/parallel_inference.py Outdated
Comment thread llm_inference/run.py
yl231 added 4 commits January 15, 2026 20:34
Code quality improvements:
- Replace fcntl with cross-platform filelock for Windows compatibility
- Use threading.local() to cache ModelInference instances per thread
- Remove redundant periodic cache consolidation (save_single_result handles it)
- Remove os.chdir() in batch_inference.py, use absolute paths instead
- Add logging for exceptions when converting model names (no silent failures)
- Document clear_failed_entries as utility function (not auto-called)
- Call clear_failed_entries automatically before processing each model
- Clear failed entries before processing so they can be retried
- Update function documentation to reflect automatic usage
- Log number of failed entries cleared for visibility
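
Two of the items above, per-thread caching with `threading.local()` and clearing failed entries before a retry, can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the `ModelInference` class here is a stub, `get_model` is a hypothetical helper, and the cache layout (a dict whose entries carry a `"status"` field) is invented for the example.

```python
import logging
import threading
from typing import Dict

logger = logging.getLogger(__name__)

# Each thread sees its own attributes on this object.
_thread_state = threading.local()


class ModelInference:
    """Stub; the real class is assumed to be expensive to construct."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name


def get_model(model_name: str) -> ModelInference:
    """Return the calling thread's cached instance, creating it on first use."""
    cache = getattr(_thread_state, "models", None)
    if cache is None:
        cache = {}
        _thread_state.models = cache
    if model_name not in cache:
        cache[model_name] = ModelInference(model_name)
    return cache[model_name]


def clear_failed_entries(results: Dict[str, dict]) -> int:
    """Remove entries marked failed (assumed layout) so they get retried."""
    failed = [key for key, entry in results.items() if entry.get("status") == "failed"]
    for key in failed:
        del results[key]
    # Report how many entries were dropped, for visibility.
    logger.info("Cleared %d failed entries", len(failed))
    return len(failed)
```

Caching per thread avoids sharing one instance across workers without needing a lock around it, at the cost of one instance per worker thread.
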
@yl231 yl231 requested a review from jiarong0907 January 16, 2026 02:44

@jiarong0907 jiarong0907 left a comment

Contributor

LGTM. One comment: you can simply name the file README, since it lives in a separate folder.

@yl231 yl231 merged commit a94b93f into main Jan 16, 2026
10 checks passed
@yl231 yl231 deleted the feature/run-py-batch-inference-compatibility branch January 16, 2026 19:24