Example of overthinking in tool reasoning: the model reaches a correct solution but continues reasoning and produces an incorrect final output.
Small reasoning models (SRMs) often overthink during tool use: they reach a correct tool-argument configuration, then continue reasoning and overwrite it with an incorrect final call. ThinkBrake is a training-free decoding heuristic that addresses this issue by monitoring the log-probability margin between </think> and the current top token at sentence boundaries, triggering early termination when the margin becomes small.
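The margin check described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the repository's implementation: the function names, the toy four-token vocabulary, and the token index used for </think> are hypothetical. The threshold value 0.25 is borrowed from the `--threshold` flag shown in the evaluation command, and treating that flag as this margin threshold is itself an assumption.

```python
import math

def log_softmax(logits):
    """Convert raw logits to log-probabilities in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def should_brake(logits, think_end_id, threshold):
    """Return True when </think> is within `threshold` of the top token.

    Hypothetical helper: in the setting described above it would be called
    only at sentence boundaries, with the next-token logits at that position.
    """
    logp = log_softmax(logits)
    margin = max(logp) - logp[think_end_id]  # always >= 0
    return margin <= threshold

# Toy 4-token vocabulary; index 3 plays the role of </think> (assumed id).
confident_step = [5.0, 1.0, 0.5, 0.2]  # top token clearly beats </think>
wavering_step = [2.0, 1.0, 0.5, 1.9]   # </think> nearly ties the top token

print(should_brake(confident_step, think_end_id=3, threshold=0.25))  # False
print(should_brake(wavering_step, think_end_id=3, threshold=0.25))   # True
```

In the first case the margin is 4.8, so reasoning continues; in the second it is 0.1, so the sketch would stop thinking and emit the final call.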
git clone https://github.com/holi-lab/ThinkBrake.git
cd think-brake
Using uv (recommended):
bash install.sh
uv init . -p 3.10
uv venv -p 3.10
source .venv/bin/activate
pip install -e .
pip install bfcl-eval vllm flashinfer-python
Set the output directory for experiment results:
export THINK_BRAKE_PROJECT_ROOT=/path/to/your/outputs
Run the generation script with your desired model and test categories:
python scripts/generate.py \
--model Qwen/Qwen3-4B-Thinking-2507 \
--test-category non_live live \
--temperature 0.7 \
--gpu-memory-utilization 0.95
Available test categories:
- non_live: Simple function calling tasks
- live: Live function calling tasks
- single_turn: All single-turn categories
- Individual categories: simple_python, simple_java, simple_javascript, multiple, parallel, parallel_multiple, live_simple, live_multiple, live_parallel, live_parallel_multiple
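To make the relationship between group names and individual categories concrete, here is a small sketch of how a group such as non_live might expand into individual categories. The grouping is an assumption inferred from the name prefixes (anything starting with `live_` goes to live, the rest to non_live); the scripts may define the groups differently, and `expand_categories` is a hypothetical helper, not part of the repository.

```python
INDIVIDUAL = [
    "simple_python", "simple_java", "simple_javascript", "multiple",
    "parallel", "parallel_multiple", "live_simple", "live_multiple",
    "live_parallel", "live_parallel_multiple",
]

# Assumed grouping, inferred from the "live_" prefix only.
GROUPS = {
    "live": [c for c in INDIVIDUAL if c.startswith("live_")],
    "non_live": [c for c in INDIVIDUAL if not c.startswith("live_")],
    "single_turn": list(INDIVIDUAL),
}

def expand_categories(names):
    """Expand group names into individual categories, keeping order, no duplicates."""
    expanded = []
    for name in names:
        members = GROUPS.get(name)
        if members is None:
            if name not in INDIVIDUAL:
                raise ValueError(f"unknown test category: {name}")
            members = [name]
        for member in members:
            if member not in expanded:
                expanded.append(member)
    return expanded

print(expand_categories(["non_live", "live"]))
```

Under this assumed grouping, passing `non_live live` covers the same ten categories as `single_turn`.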
Evaluate the generated predictions:
python scripts/evaluate.py \
--model Qwen/Qwen3-4B-Thinking-2507 \
--test-category non_live live \
--threshold 0.25
For convenience, you can use the provided shell scripts:
# Generate predictions
bash run_generate.sh
# Evaluate results
bash run_evaluate.sh
Edit these scripts to customize other parameters.
Currently supported models:
- Qwen3-4B-Thinking-2507: Qwen/Qwen3-4B-Thinking-2507
- Qwen3-0.6B: Qwen/Qwen3-0.6B
- Qwen3-1.7B: Qwen/Qwen3-1.7B
- Qwen3-8B: Qwen/Qwen3-8B
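To run generation across every supported model, the command line can be assembled programmatically. The sketch below only builds and prints the commands; it uses the flags shown in the generation example and nothing else. The helper name `generate_command` and its defaults are hypothetical, and actually launching the jobs (for example with `subprocess.run`) is left to the reader.

```python
import shlex

SUPPORTED_MODELS = [
    "Qwen/Qwen3-4B-Thinking-2507",
    "Qwen/Qwen3-0.6B",
    "Qwen/Qwen3-1.7B",
    "Qwen/Qwen3-8B",
]

def generate_command(model, categories=("non_live", "live"), temperature=0.7):
    """Build the argv list for scripts/generate.py (hypothetical helper)."""
    return [
        "python", "scripts/generate.py",
        "--model", model,
        "--test-category", *categories,
        "--temperature", str(temperature),
    ]

# Print one shell-ready command per supported model.
for model in SUPPORTED_MODELS:
    print(shlex.join(generate_command(model)))
```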
@article{oh2025thinkbrake,
title={ThinkBrake: Mitigating Overthinking in Tool Reasoning},
author={Minjae Oh and Sangjun Song and Seungkyu Lee and Sungmin Jo and Yohan Jo},
year={2025},
eprint={2510.00546},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2510.00546},
}