Skip to content

[reward] fix: restore timeout in math_verify via ProcessPoolExecutor#5839

Open
MaxwellJryao wants to merge 2 commits intoverl-project:mainfrom
MaxwellJryao:fix/math-verify-thread-safety
Open

[reward] fix: restore timeout in math_verify via ProcessPoolExecutor#5839
MaxwellJryao wants to merge 2 commits intoverl-project:mainfrom
MaxwellJryao:fix/math-verify-thread-safety

Conversation

@MaxwellJryao
Copy link
Copy Markdown
Contributor

What does this PR do?

Follow-up to #5635. That PR fixed the signal.alarm() crash by disabling all timeouts (parsing_timeout=None, timeout_seconds=None). However, this causes the reward worker to hang indefinitely on adversarial or complex model outputs that trigger pathological parsing in math_verify.

This PR restores timeout protection by running math_verify in a subprocess via ProcessPoolExecutor, where signal.alarm() works normally (each subprocess has its own main thread). An outer future.result(timeout=30) provides a fallback guarantee.

Checklist Before Starting

Test

Validated on Qwen2.5-3B-Instruct GRPO training with MATH500 val set on 4-node cluster:

API and Usage Example

No API changes — same compute_score(model_output, ground_truth) signature. New optional timeout parameter (default 30s).

Design & Code Changes

  1. Run parse() + verify() in a subprocess via ProcessPoolExecutorsignal.alarm() works in each subprocess's main thread
  2. Enforce outer timeout with future.result(timeout=30) as fallback
  3. Lazy-initialized process pool (4 workers) with thread-safe singleton pattern
  4. Import math_verify inside subprocess function to avoid pickling issues

Checklist Before Submitting

…timeout

math_verify uses signal.alarm() for timeout, which only works in the
main thread. In verl's RewardLoopWorker, compute_score is called from
a thread pool (asyncio run_in_executor), causing signal.alarm() to
raise ValueError. The previous fix (verl-project#5635) disabled timeout entirely
(parsing_timeout=None), but this can cause hangs on adversarial inputs.

Run verification in a subprocess via ProcessPoolExecutor where
signal.alarm() works normally, with an outer future.result(timeout=30)
as a fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the math_verify utility to execute scoring within a subprocess using ProcessPoolExecutor, which prevents issues with signal.alarm() in multi-threaded environments like Ray. The review feedback highlights several critical improvements: defining a placeholder for TimeoutException to avoid NameError when the library is missing, using the spawn multiprocessing context to prevent deadlocks, and replacing a silent exception handler with basic logging to aid debugging.

Comment on lines 20 to 22
from math_verify.errors import TimeoutException
from math_verify.grader import verify
from math_verify.parser import ExprExtractionConfig, LatexExtractionConfig, parse
except ImportError:
print("To use Math-Verify, please install it first by running `pip install math-verify`.")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

If math_verify is not installed, TimeoutException will not be defined in the global scope. This will lead to a NameError when compute_score attempts to catch it in the except block at line 58. Since this NameError is then caught by the broad except Exception at line 60, the function will silently return 0.0 without any indication of the missing dependency or the underlying error. Defining a placeholder exception ensures the code remains robust even when the package is missing.

Suggested change
from math_verify.errors import TimeoutException
from math_verify.grader import verify
from math_verify.parser import ExprExtractionConfig, LatexExtractionConfig, parse
except ImportError:
print("To use Math-Verify, please install it first by running `pip install math-verify`.")
from math_verify.errors import TimeoutException
except ImportError:
class TimeoutException(Exception):
pass
print("To use Math-Verify, please install it first by running pip install math-verify.")

if _pool is None:
with _pool_lock:
if _pool is None:
_pool = ProcessPoolExecutor(max_workers=4)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using ProcessPoolExecutor with the default start method (which is fork on Linux) can be dangerous in multi-threaded environments like Ray workers, as it may lead to deadlocks or inconsistent state in the child processes if the parent process has active threads during the fork. It is safer to use the spawn start method by providing an explicit multiprocessing context.

Suggested change
_pool = ProcessPoolExecutor(max_workers=4)
import multiprocessing
_pool = ProcessPoolExecutor(max_workers=4, mp_context=multiprocessing.get_context('spawn'))

Comment on lines 60 to 61
except Exception:
pass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The broad except Exception: pass block swallows all potential errors, including critical ones like PicklingError (if arguments cannot be sent to the subprocess) or the NameError mentioned previously. This makes debugging extremely difficult in a distributed training environment. It is recommended to at least log the exception or catch more specific error types.

Suggested change
except Exception:
pass
except Exception as e:
print(f"Error in math_verify compute_score: {e}")

- Define TimeoutException placeholder when math_verify is not installed
- Use spawn mp_context to avoid fork deadlocks in threaded environments
- Log exceptions instead of silently swallowing them
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants