Repositories
Showing 10 of 29 repositories
- u-math Public
Official evaluation code for the U-MATH and μ-MATH benchmarks. These datasets are designed to test the mathematical reasoning and meta-evaluation capabilities of LLMs on university-level problems.
