README.md

Evaluation Scripts

Supported Methods

This directory contains evaluation code for the following methods:

✅ d3LLM (our method)
✅ AR models (e.g., Qwen-2.5-7B-it) - Autoregressive baselines
✅ Vanilla LLaDA - Original LLaDA model
✅ Vanilla Dream - Original Dream model
✅ Fast-dLLM - Training-free acceleration with KV cache
✅ D2F - Discrete diffusion forcing
✅ dParallel - Distilled dLLMs
✅ Fast-dLLM v2 - Block-wise diffusion

Supported Benchmarks

# GSM8K
bash dream_gsm8k_cot.sh
bash llada_gsm8k_cot.sh

# MATH
bash dream_math.sh
bash llada_math.sh

# Code Generation (HumanEval & MBPP)
bash dream_humaneval.sh
bash dream_mbpp.sh
bash llada_humaneval.sh
bash llada_mbpp.sh
bash dream-coder.sh

# Long-Context GSM8K
bash dream_long_gsm8k.sh
bash llada_long_gsm8k.sh
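
To run several of the scripts above in one go, a small wrapper is one option. The sketch below is not part of the repository: `run_evals` and the log directory name are hypothetical, and it assumes that each script can be launched with `bash` and no required arguments, as in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): run the given
# evaluation scripts one after another and keep one log file per script.
run_evals() {
  local log_dir="$1"; shift
  local script
  local failed=0
  mkdir -p "$log_dir"
  for script in "$@"; do
    if [ ! -f "$script" ]; then
      echo "skip: $script (not found)" >&2
      failed=1
      continue
    fi
    echo "running: $script"
    # Capture stdout and stderr in <log_dir>/<script name>.log
    if ! bash "$script" > "$log_dir/$(basename "$script" .sh).log" 2>&1; then
      echo "failed: $script" >&2
      failed=1
    fi
  done
  # Non-zero if any script was missing or exited with an error
  return "$failed"
}
```

For example, after sourcing the function, `run_evals logs dream_gsm8k_cot.sh llada_gsm8k_cot.sh` would run both GSM8K scripts in sequence and write `logs/dream_gsm8k_cot.log` and `logs/llada_gsm8k_cot.log`. Running the scripts sequentially rather than in parallel is a deliberate choice here, since each evaluation is likely to need the GPU to itself.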
