serve_prm scoring service with vLLM module #4

@pranay5255

Summary

Build serve_prm as the scoring service for branch prefixes. The interface should be usable from both offline eval and a future live forest reranker.

Scope

  • Load a trained PRM checkpoint and expose batch prefix scoring.
  • Support a clean API contract for branch_id, prefix, reward, rank_score, and optional rationale.
  • Start with a reliable torch/transformers path. Add vLLM-backed tokenization or generation where it helps, but do not force custom reward heads into a vLLM-only design if that makes the system brittle.
  • Add a small client usable from eval scripts and future controller/runtime work.
  • Add local smoke tests against saved prefixes from extracted traces.
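
A minimal sketch of what the API contract could look like, assuming plain dataclasses for the request and response. The class names `PrefixRequest` and `PrefixScore`, the `score_batch` helper, and `toy_reward_fn` are hypothetical and not part of the issue. The definition of `rank_score` is also an assumption: here it is a softmax over the rewards in one batch, so sibling branches can be compared directly. The real definition may differ.

```python
from __future__ import annotations

import math
from dataclasses import asdict, dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class PrefixRequest:
    """One branch prefix to score (hypothetical request shape)."""
    branch_id: str
    prefix: str


@dataclass(frozen=True)
class PrefixScore:
    """Response fields named in the issue; rationale is optional."""
    branch_id: str
    prefix: str
    reward: float
    rank_score: float
    rationale: Optional[str] = None


def score_batch(
    requests: Sequence[PrefixRequest],
    reward_fn: Callable[[Sequence[str]], Sequence[float]],
) -> list[PrefixScore]:
    """Score a batch of prefixes, preserving input order."""
    if not requests:
        return []
    rewards = [float(r) for r in reward_fn([req.prefix for req in requests])]
    if len(rewards) != len(requests):
        raise ValueError("reward_fn must return one reward per prefix")
    # Assumption: rank_score is a softmax over the rewards in this batch.
    peak = max(rewards)  # subtract the max for numerical stability
    exps = [math.exp(r - peak) for r in rewards]
    total = sum(exps)
    return [
        PrefixScore(req.branch_id, req.prefix, reward, exp / total)
        for req, reward, exp in zip(requests, rewards, exps)
    ]


def toy_reward_fn(prefixes: Sequence[str]) -> list[float]:
    """Placeholder for a PRM forward pass: longer prefixes score higher."""
    return [len(p) / 100.0 for p in prefixes]
```

Passing the reward function in as a callable keeps the contract independent of the backend, so the same schema could serve both offline eval and a live reranker. `asdict(score)` gives a JSON-ready dict for each result.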

Modules to build

  • project/evmbench/evmbench/experiments/serve_prm.py
  • project/evmbench/evmbench/experiments/prm_service.py
  • project/evmbench/evmbench/experiments/prm_api.py
  • project/evmbench/evmbench/experiments/prm_client.py
  • project/evmbench/evmbench/experiments/vllm_backend.py
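
One way the service and backend modules could fit together is sketched below, assuming the service depends only on a small backend interface. `ScoringBackend`, `ConstantBackend`, and `PRMService` are hypothetical names. A torch/transformers backend with a custom reward head, or a vLLM-backed one, would each implement the same `score` method. No real model is loaded here.

```python
from typing import List, Protocol, Sequence


class ScoringBackend(Protocol):
    """Hypothetical interface every scoring backend would satisfy."""

    name: str

    def score(self, prefixes: Sequence[str]) -> List[float]:
        """Return one reward per prefix, in input order."""
        ...


class ConstantBackend:
    """Stand-in backend for wiring tests; returns a fixed reward."""

    name = "constant"

    def __init__(self, value: float = 0.5) -> None:
        self.value = value

    def score(self, prefixes: Sequence[str]) -> List[float]:
        return [self.value for _ in prefixes]


class PRMService:
    """Splits incoming prefixes into chunks and delegates to the backend."""

    def __init__(self, backend: ScoringBackend, batch_size: int = 8) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.backend = backend
        self.batch_size = batch_size

    def score_prefixes(self, prefixes: Sequence[str]) -> List[float]:
        rewards: List[float] = []
        for start in range(0, len(prefixes), self.batch_size):
            chunk = prefixes[start:start + self.batch_size]
            rewards.extend(self.backend.score(chunk))
        return rewards
```

With this split, swapping the reward-head path for a different backend would not change the service or client code, which is one possible answer to the "backend story" the acceptance criteria ask for.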

Acceptance criteria

  • Given saved prefix inputs, the service returns deterministic scores with the expected schema.
  • Batch scoring works for multiple live branches.
  • There is a documented backend story for custom reward heads vs vLLM generation.
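
The first two criteria could be checked with a local smoke test along these lines. This is a sketch under stated assumptions: saved prefixes are stored as JSONL with `branch_id` and `prefix` keys, and `stub_score` is a hash-based stand-in for the real service call. The helper names `load_prefixes`, `stub_score`, and `check_scores` are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Dict, List

# Fields the issue requires in every response; rationale is optional.
REQUIRED_KEYS = {"branch_id", "prefix", "reward", "rank_score"}

Record = Dict[str, object]


def load_prefixes(path: Path) -> List[Record]:
    """Read saved prefixes from a JSONL file (assumed format)."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def stub_score(records: List[Record]) -> List[Record]:
    """Deterministic placeholder scorer; replace with the real service call."""
    scored = []
    for record in records:
        digest = hashlib.sha256(str(record["prefix"]).encode("utf-8")).hexdigest()
        reward = int(digest[:8], 16) / 0xFFFFFFFF  # value in [0, 1]
        # In this stub, rank_score simply mirrors reward.
        scored.append({**record, "reward": reward, "rank_score": reward})
    return scored


def check_scores(
    score_fn: Callable[[List[Record]], List[Record]],
    records: List[Record],
) -> List[Record]:
    """Score the batch twice; fail on schema gaps or nondeterminism."""
    first = score_fn(records)
    second = score_fn(records)
    if first != second:
        raise AssertionError("scores differ between identical calls")
    if len(first) != len(records):
        raise AssertionError("expected one result per input prefix")
    for row in first:
        missing = REQUIRED_KEYS - row.keys()
        if missing:
            raise AssertionError(f"missing fields: {sorted(missing)}")
    return first
```

Exact equality between the two runs is a strict check. A real model running on GPU may need a small tolerance instead, depending on the kernels in use.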
