Evaluation Scripts

Supported Methods

We include comprehensive evaluation code for:

Supported Benchmarks

# GSM8K
bash dream_gsm8k_cot.sh
bash llada_gsm8k_cot.sh

# MATH
bash dream_math.sh
bash llada_math.sh

# Code Generation (HumanEval & MBPP)
bash dream_humaneval.sh
bash dream_mbpp.sh
bash llada_humaneval.sh
bash llada_mbpp.sh
bash dream-coder.sh

# Long-Context GSM8K
bash dream_long_gsm8k.sh
bash llada_long_gsm8k.sh
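
To run every benchmark for one model family in a single pass, the commands above can be wrapped in a small loop. This is a minimal sketch, not part of the repository: the helper names `scripts_for` and `run_all` are hypothetical, and it assumes it is sourced from the directory that contains the scripts. It only covers scripts that follow the `<model>_<benchmark>.sh` naming seen above, so `dream-coder.sh` still has to be launched separately.

```shell
#!/bin/sh
# Sketch only: scripts_for and run_all are illustrative helpers, not repository code.
set -eu

# Print the benchmark script names for a model prefix ("dream" or "llada").
scripts_for() {
  for bench in gsm8k_cot math humaneval mbpp long_gsm8k; do
    printf '%s_%s.sh\n' "$1" "$bench"
  done
}

# Run each script that exists in the current directory, in the order listed.
# Because of `set -e`, the first failing script aborts the whole run.
run_all() {
  for script in $(scripts_for "$1"); do
    if [ -f "$script" ]; then
      echo "==> $script"
      bash "$script"
    else
      echo "skip: $script not found" >&2
    fi
  done
}
```

After sourcing the snippet, `run_all dream` or `run_all llada` starts the sequence. Missing scripts are reported on stderr and skipped rather than treated as errors.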