Deep Learning for Geometry Problem Solving (DL4GPS)

This repository is the reading list for A Survey of Deep Learning for Geometry Problem Solving (DL4GPS). We update the list periodically. The current cutoff for included papers is April 2025.

🔵 indicates that the work is not specifically designed for geometry problems.
🔺 represents geometry tasks other than geometry problem solving.
❌ indicates no deep learning method is used.

For more details, please refer to the paper: A Survey of Deep Learning for Geometry Problem Solving.

🔔 If you have any suggestions or notice something we missed, please don't hesitate to let us know. You can email Jianzhe Ma (majianzhe@ruc.edu.cn) directly or open an issue on this repo.

Surveys
Tasks and Datasets - Fundamental Tasks
Tasks and Datasets - Core Tasks
- Geometry Theorem Proving
- Geometric Numerical Calculation
Tasks and Datasets - Composite Tasks
- Mathematical Reasoning
Tasks and Datasets - Other Geometry Tasks
Methods - Architectures
- Encoder-Decoder
- Other Architectures
Methods - Training Stage
Methods - Inference Stage
- Test-Time Scaling
- Knowledge-Augmented Inference
Related Surveys
Years
- 2014
- 2015
- 2016
- 2017
- 2018
- 2019
- 2020
- 2021
- 2022
- 2023
- 2024
- 2025
Citation

Surveys

Plane Geometry Problem Solving with Multi-modal Reasoning: A Survey, arXiv:2505.14340 [paper]
Towards Geometry Problem Solving in the Large Model Era: A Survey, arXiv:2506.02690 [paper]

Tasks and Datasets - Fundamental Tasks

Geometry Diagram Understanding

2D Geometric Shapes Dataset – For Machine Learning and Pattern Recognition, Data in Brief 2020 [paper] [2Dgeometricshapes data]
Geoclidean: Few-Shot Generalization in Euclidean Geometry, NeurIPS 2022 [paper] [Geoclidean data]
Euclid: Supercharging Multimodal LLMs With Synthetic High-Fidelity Visual Descriptions, arXiv:2412.08737 [paper] [Geoperception data]
GePBench: Evaluating Fundamental Geometric Perception for Multimodal Large Language Models, arXiv:2412.21036 [paper] [GePBench data]
Do Large Language Models Truly Understand Geometric Structures?, ICLR 2025 [paper] [GeomRel data]
Improving Multimodal LLMs Ability In Geometry Problem Solving, Reasoning, And Multistep Scoring, arXiv:2412.00846 [paper]
Advancing Multimodal LLMs: A Focus on Geometry Problem Solving Reasoning and Sequential Scoring, MMASIA 2024 [paper]
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model, ICLR 2025 [paper] [Geo170K-alignment data]
GOLD: Geometry Problem Solver With Natural Language Description, Findings of NAACL 2024 [paper]
AutoGeo: Automating Geometric Image Dataset Creation for Enhanced Geometry Understanding, IEEE Trans. Multimedia 2025 [paper] [AutoGeo-100k data]
GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder, arXiv:2502.11360 [paper] [VGPR data]
Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver, ICASSP 2025 [paper] [SynthGeo228K/formalgeo-structure774k data]
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-Training, ICLR 2025 [paper] [GeoX-alignment data]
Decomposing Complex Visual Comprehension Into Atomic Visual Skills for Vision Language Models, NeurIPS 2024 MATH-AI Workshop [paper] [AVSBench data] 🔵
VisOnlyQA: Large Vision Language Models Still Struggle With Visual Perception of Geometric Information, arXiv:2412.00947 [paper] [VisOnlyQA data] 🔵
VisNumBench: Evaluating Number Sense of Multimodal Large Language Models, arXiv:2503.14939 [paper] [VisNumBench data] 🔵
MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams, arXiv:2503.20745 [paper] [MATHGLANCE/GeoPeP data]
Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding, Findings of ACL 2025 [paper] [CogAlign-Probing/CogAlign-train data]
Retrieving Geometric Information from Images: The Case of Hand-Drawn Diagrams, KDD 2017 [paper] ❌
A Novel Geometric Information Retrieval Tool for Images of Geometric Diagrams, ICISE-IE 2020 [paper]
A Paradigm of Diagram Understanding in Problem Solving, TALE 2021 [paper] ❌
Plane Geometry Diagram Parsing, IJCAI 2022 [paper] [PGDP5K data]
Learning to Understand Plane Geometry Diagram, NeurIPS 2022 MATH-AI Workshop [paper] [PGDP5K data]
PGDP5K: A Diagram Parsing Dataset for Plane Geometry Problems, ICPR 2022 [paper] [PGDP5K data]
Usage of Stacked Long Short-Term Memory for Recognition of 3D Analytic Geometry Elements, ICAART 2022 [paper]
Solving Algebraic Problems with Geometry Diagrams Using Syntax-Semantics Diagram Understanding, Computers, Materials & Continua 2023 [paper] ❌
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them, Findings of ACL 2023 [paper] [BBH-geometricshapes data]
2D Shape Detection for Solving Geometry Word Problems, IETE J. Res. 2024 [paper] ❌
Slow Perception: Let's Perceive Geometric Figures Step-by-Step, arXiv:2412.20631 [paper] [SP-1 data]
Leveraging Two-Level Deep Learning Classifiers for 2D Shape Recognition to Automatically Solve Geometry Math Word Problems, PAA 2024 [paper] [GeoCQT data]
Tangram: A Challenging Benchmark for Geometric Element Recognizing, arXiv:2408.13854 [paper] [Tangram data]
CurveML: A Benchmark for Evaluating and Training Learning-Based Methods of Classification, Recognition, and Fitting of Plane Curves, Visual Comput 2024 [paper] [CurveML data]
ElementaryCQT: A New Dataset and Its Deep Learning Analysis for 2D Geometric Shape Recognition, SN Comput. Sci. 2025 [paper] [ElementaryCQT data]

Semantic Parsing for Geometry Problem

Semantic Parsing of Pre-University Math Problems, ACL 2017 [paper] ❌
Beyond Sentential Semantic Parsing: Tackling the Math SAT with a Cascade of Tree Transducers, EMNLP 2017 [paper] 🔵 ❌
From Textbooks to Knowledge: A Case Study in Harvesting Axiomatic Knowledge from Textbooks to Solve Geometry Problems, EMNLP 2017 [paper] ❌
Discourse in Multimedia: A Case Study in Extracting Geometry Knowledge from Textbooks, CL 2019 [paper] ❌
Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples, ACL 2018 [paper]
A Neural Semantic Parser for Math Problems Incorporating Multi-Sentence Information, TALLIP 2019 [paper] 🔵
Two-Step Memory Networks for Deep Semantic Parsing of Geometry Word Problems, SOFSEM 2020 [paper]
Semantic Parsing of Geometry Statements Using Supervised Machine Learning on Synthetic Data, NatFoM 2021 CICM Workshop [paper]
Cognitive Patterns for Semantic Presentation of Natural-Language Descriptions of Well-Formalizable Problems, RCAI 2021 [paper] ❌
Exploration of Formalization Techniques for Geometric Entities in Planar Geometry Proposition Texts, JAIP 2025 [paper]
Extracting Structured Information from the Textual Description of Geometry Word Problems, NLPIR 2023 [paper] ❌
Automatic Extraction of Structured Information from Elementary Level Geometry Questions into Logic Forms, Multimed Tools Appl 2024 [paper] [ElementaryGeometryQA data]
Evaluating Automated Geometric Problem Solving With Formal Language Generation on Large Multimodal Models, IEIR 2024 [paper]
FGeo-Parser: Autoformalization and Solution of Plane Geometric Problems, Symmetry 2025 [paper]

Geometric Relation Extraction

Diagram Understanding in Geometry Questions, AAAI 2014 [paper] ❌
Understanding Plane Geometry Problems by Integrating Relations Extracted from Text and Diagram, PSIVT 2017 [paper] [GeoC50 data] ❌
Understanding Explicit Arithmetic Word Problems and Explicit Plane Geometry Problems Using Syntax-Semantics Models, IALP 2017 [paper] ❌
Automatic Understanding and Formalization of Natural Language Geometry Problems Using Syntax-Semantics Models, IJICIC 2018 [paper] ❌
Automatic Understanding and Formalization of Plane Geometry Proving Problems in Natural Language: A Supervised Approach, IJAIT 2019 [paper] ❌
GeoRE: A Relation Extraction Dataset for Chinese Geometry Problems, NeurIPS 2021 MATHAI4ED Workshop [paper] [GeoRE data]
A Novel Geometry Problem Understanding Method based on Uniform Vectorized Syntax-Semantics Model, IEIR 2022 [paper]
Research on Geometry Problem Text Understanding Based on Bidirectional LSTM-CRF, ICDH 2022 [paper]
A Knowledge and Semantic Fusion Method for Automatic Geometry Problem Understanding, Appl. Sci. 2025 [paper]

Geometric Knowledge Prediction

Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning, ACL 2021 [paper]
E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator, CVPR 2024 [paper]
GNS: Solving Plane Geometry Problems by Neural-Symbolic Reasoning with Multi-Modal LLMs, AAAI 2025 [paper] [GNS-260K data]
GeoSense: Evaluating Identification and Application of Geometric Principles in Multimodal Reasoning, arXiv:2504.12597 [paper] [GeoSense data]

Tasks and Datasets - Core Tasks

UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression, EMNLP 2022 [paper] [UniGeo data]
FormalGeo: An Extensible Formalized Framework for Olympiad Geometric Problem Solving, arXiv:2310.18021 [paper] [formalgeo7k/formalgeo-imo data] ❌
GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation, EMNLP 2024 [paper] [GeoGPT4V-GPS data]
GeoVQA: A Comprehensive Multimodal Geometry Dataset for Secondary Education, MIPR 2024 [paper] [GeoVQA data]
Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems With Meta In-Context Learning, LGM3A 2024 [paper] [GeoMath data]
Advancing Multimodal LLMs: A Focus on Geometry Problem Solving Reasoning and Sequential Scoring, MMASIA 2024 [paper] [GPSM4K data]
Improving Multimodal LLMs Ability In Geometry Problem Solving, Reasoning, And Multistep Scoring, arXiv:2412.00846 [paper] [GPSM4K data]
Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration, arXiv:2504.12773 [paper] [GeoExpand/GeoSynth data]

Geometry Theorem Proving

A Paradigm of Diagram Understanding in Problem Solving, TALE 2021 [paper] [Proving2H data] ❌
Solving Olympiad Geometry Without Human Demonstrations, Nature 2024 [paper] [IMO-AG-30 data]
Wu’s Method Boosts Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry, NeurIPS 2024 MATH-AI Workshop [paper]
Proposing and Solving Olympiad Geometry with Guided Tree Search, arXiv:2412.10673 [paper] [MO-TG-225 data]
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2, arXiv:2502.03544 [paper] [IMO-AG-50 data]

Geometric Numerical Calculation

Solving Geometry Problems: Combining Text and Diagram Interpretation, EMNLP 2015 [paper] [GEOS data] ❌
From Textbooks to Knowledge: A Case Study in Harvesting Axiomatic Knowledge from Textbooks to Solve Geometry Problems, EMNLP 2017 [paper] [GEOS++ data] ❌
Discourse in Multimedia: A Case Study in Extracting Geometry Knowledge from Textbooks, CL 2019 [paper] [GEOS++ data] ❌
Learning to Solve Geometry Problems from Natural Language Demonstrations in Textbooks, *SEM 2017 [paper] [GEOS-OS data] ❌
Synthesis of Solutions for Shaded Area Geometry Problems, FLAIRS 2017 [paper] [GeoShader data] ❌
Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning, ACL 2021 [paper] [Geometry3K data]
GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning, Findings of ACL 2021 [paper] [GeoQA data]
Solving Solid Geometric Calculation Problems in Text, TALE 2021 [paper] [Geometry3Dcalculation data] ❌
Solving Shaded Area Problems by Constructing Equations, AIET 2021 [paper] ❌
Sequence to General Tree Knowledge-Guided Geometry Word Problem Solving, ACL-IJCNLP 2021 [paper] [GeometryQA data]
An Augmented Benchmark Dataset for Geometric Question Answering through Dual Parallel Text Encoding, COLING 2022 [paper] [GeoQA+ data]
Beyond the Imitation Game: Quantifying and Extrapolating the Capabilities of Language Models, TMLR 2022 [paper] [BIG-bench-IG data] 🔵
A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram, IJCAI 2023 [paper] [PGPS9K data]
Conic10K: A Challenging Math Problem Understanding and Reasoning Dataset, Findings of EMNLP 2023 [paper] [Conic10K data]
GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning, ICML 2024 AI4MATH Workshop [paper] [GeomVerse data]
E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator, CVPR 2024 [paper] [aug-Geo3K data]
GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving, Findings of ACL 2024 [paper] [GeoEval data]
R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models, arXiv:2410.17885 [paper] [GeoMM data]
An Enhanced Relation-Flow Algorithm for Solving Number Line Problems, IEIR 2024 [paper] [NBLP data] ❌
BBA: Bi-Modal Behavioral Alignment for Reasoning with Large Vision-Language Models, Findings of ACL 2024 [paper] [G-MATH data]
Is Your Model Really a Good Math Reasoner? Evaluating Mathematical Reasoning With Checklist, arXiv:2407.08733 [paper] [MATHCHECK-GEO data]
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model, ICLR 2025 [paper] [Geo170K-qa data]
TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving, arXiv:2504.15780 [paper] [GeoTrust data]
FGeo-Parser: Autoformalization and Solution of Plane Geometric Problems, Symmetry 2025 [paper] [FormalGeo7K-v2 data]
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL, arXiv:2503.07536 [paper] [VerMulti-Geo data]
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning, arXiv:2503.20752 [paper] [GeoMath-8K data] 🔵
GNS: Solving Plane Geometry Problems by Neural-Symbolic Reasoning with Multi-Modal LLMs, AAAI 2025 [paper] [GNS-260K data]
GeoSense: Evaluating Identification and Application of Geometric Principles in Multimodal Reasoning, arXiv:2504.12597 [paper] [GeoSense data]
Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver, ICASSP 2025 [paper] [formalgeo-reasoning238k data]

Tasks and Datasets - Composite Tasks

Mathematical Reasoning

Measuring Mathematical Problem Solving With the MATH Dataset, NeurIPS 2021 [paper] [MATH/AMPS data] 🔵
NUMGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks, ACL 2022 [paper] [NUMGLUE data] 🔵
Lila: A Unified Benchmark for Mathematical Reasoning, EMNLP 2022 [paper] [Lila data] 🔵
It Ain’t Over: A Multi-Aspect Diverse Math Word Problem Dataset, EMNLP 2023 [paper] [DMath data] 🔵
TheoremQA: A Theorem-driven Question Answering Dataset, EMNLP 2023 [paper] [TheoremQA data] 🔵
M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models, NeurIPS 2023 [paper] [M3Exam data] 🔵
OlympiadBench: A Challenging Benchmark for Promoting AGI With Olympiad-Level Bilingual Multimodal Scientific Problems, ACL 2024 [paper] [OlympiadBench data] 🔵
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts, ICLR 2024 [paper] [MathVista data] 🔵
MathVerse: Does Your Multi-Modal LLM Truly See the Diagrams in Visual Math Problems?, ECCV 2024 [paper] [MathVerse data] 🔵
Measuring Multimodal Mathematical Reasoning With MATH-Vision Dataset, NeurIPS 2024 [paper] [MATH-Vision data] 🔵
MM-MATH: Advancing Multimodal Math Evaluation With Process Evaluation and Fine-Grained Classification, Findings of EMNLP 2024 [paper] [MM-MATH data] 🔵
We-Math: Does Your Large Multimodal Model Achieve Human-Like Mathematical Reasoning?, arXiv:2407.01284 [paper] [We-Math data] 🔵
VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning, arXiv:2410.22995 [paper] [VisAidMath data] 🔵
CMM-Math: A Chinese Multimodal Math Dataset to Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models, arXiv:2409.02834 [paper] [CMM-Math data] 🔵
MathScape: Evaluating MLLMs in Multimodal Math Scenarios Through a Hierarchical Benchmark, arXiv:2408.07543 [paper] [MathScape data] 🔵
VisScience: An Extensive Benchmark for Evaluating K12 Educational Multi-Modal Scientific Reasoning, arXiv:2409.13730 [paper] [VisScience data] 🔵
Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models, ACL 2024 [paper] [ArXivQA data] 🔵
ReMI: A Dataset for Reasoning With Multiple Images, NeurIPS 2024 [paper] [ReMI data] 🔵
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models, Findings of EMNLP 2024 [paper] [MathV360K data] 🔵
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models, arXiv:2409.00147 [paper] [MultiMath-300K data] 🔵
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning, NeurIPS 2024 MATH-AI Workshop [paper] [InfiMM-WebMath-40B data] 🔵
MathGLM-Vision: Solving Mathematical Problems With Multi-Modal Large Language Model, arXiv:2409.13729 [paper] [MathVL data] 🔵
Mathematical Problem Solving in Arabic: Assessing Large Language Models, Procedia Comput. Sci. 2024 [paper] [ArMATH data] 🔵
M3CoT: A Novel Benchmark for Multi-Domain Multi-Step Multi-Modal Chain-of-Thought, ACL 2024 [paper] [M3CoT data] 🔵
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data, arXiv:2406.18321 [paper] [MathOdyssey data] 🔵
PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition, NeurIPS 2024 [paper] [PutnamBench data] 🔵
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models, Findings of ACL 2024 [paper] [ConceptMath data] 🔵
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap, arXiv:2402.19450 [paper] [MATH() data] 🔵
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark, Findings of ACL 2024 [paper] [MathBench data] 🔵
HARP: A Challenging Human-Annotated Math Reasoning Benchmark, arXiv:2412.08819 [paper] [HARP data] 🔵
M3GIA: A Cognition-Inspired Multilingual and Multimodal General Intelligence Ability Benchmark, arXiv:2406.05343 [paper] [M3GIA data] 🔵
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving, NeurIPS 2024 [paper] [DART-Math data] 🔵
MathScale: Scaling Instruction Tuning for Mathematical Reasoning, ICML 2024 [paper] [MathScaleQA data] 🔵
UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts, arXiv:2411.07240 [paper] [UTMath data] 🔵
MultiLingPoT: Enhancing Mathematical Reasoning with Multilingual Program Fine-tuning, arXiv:2412.12609 [paper] [MultiLingPoT data] 🔵
System-2 Mathematical Reasoning via Enriched Instruction Tuning, arXiv:2412.16964 [paper] [EITMath data] 🔵
AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning, arXiv:2411.11930 [paper] [AMATH-SFT data] 🔵
Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?, arXiv:2503.06252 [paper] [AMATH-SFT data] 🔵
URSA: Understanding and Verifying Chain-of-Thought Reasoning in Multimodal Mathematics, arXiv:2501.04686 [paper] [MMathCoT-1M data] 🔵
DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models, ICLR 2025 [paper] [DynaMath data] 🔵
CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models, AAAI 2025 [paper] [CoMT data] 🔵
Feynman: Knowledge-Infused Diagramming Agent for Scaling Visual Reasoning Data, OpenReview 2025 [paper] [Diagramma data] 🔵
MV-MATH: Evaluating Multimodal Math Reasoning in Multi-Visual Contexts, arXiv:2502.20808 [paper] [MV-MATH data] 🔵
CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for Foundation Models, COLING 2025 [paper] [CMMaTH data] 🔵
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning, AAAI 2025 [paper] [Math-PUMA-1M data] 🔵
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search, arXiv:2503.10582 [paper] [VisualWebInstruct data] 🔵
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine, ICLR 2025 [paper] [MAVIS-Instruct data]
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models, ICLR 2025 [paper] [Omni-MATH data] 🔵
MathConstruct: Challenging LLM Reasoning with Constructive Proofs, ICLR 2025 VerifAI Workshop [paper] [MathConstruct data] 🔵
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency, arXiv:2504.18589 [paper] [VCBench data] 🔵
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models, arXiv:2503.21380 [paper] [OlymMATH data] 🔵
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?, arXiv:2504.00509 [paper] [RoR-Bench data] 🔵
PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts, arXiv:2504.18428 [paper] [PolyMath data] 🔵
LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs, NAACL 2025 [paper] [MaTT data] 🔵
Who's the MVP? A Game-Theoretic Evaluation Benchmark for Modular Attribution in LLM Agents, arXiv:2502.00510 [paper] [CapaBench data] 🔵
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations, ICLR 2025 LLM Reason&Plan Workshop [paper] [MATH-Perturb data] 🔵
Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning, arXiv:2504.09772 [paper] [M500 data] 🔵
Key-Point-Driven Data Synthesis with Its Enhancement on Mathematical Reasoning, AAAI 2025 [paper] [KPMATH-M data] 🔵
MathFlow: Enhancing the Perceptual Flow of MLLMs for Visual Mathematical Problems, arXiv:2503.16549 [paper] [FlowVerse data] 🔵

Tasks and Datasets - Other Geometry Tasks

Geometric Diagram Generation

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-Training, ICLR 2025 [paper] [GeoX-pretrain data]
Automatic Reconstruction of Plane Geometry Figures in Documents, EITT 2015 [paper] 🔺 ❌
Solid Geometric Object Reconstruction from Single Line Drawing Image, GRAPP 2015 [paper] 🔺 ❌
Recovering Solid Geometric Object from Single Line Drawing Image, Multimed Tools Appl 2016 [paper] 🔺 ❌
An Example-based Approach to 3D Man-made Object Reconstruction from Line Drawings, Pattern Recogn 2016 [paper] 🔺 ❌
Context-aware Geometric Object Reconstruction for Mobile Education, MM 2016 [paper] 🔺 ❌
Automated Generation of Illustrations for Synthetic Geometry Proofs, ADG 2021 [paper] 🔺 ❌
Automatically Building Diagrams for Olympiad Geometry Problems, CADE 2021 [paper] [GMBL data] 🔺 ❌
A Precise Text-to-Diagram Generation Method for Elementary Geometry, ICCWAMTIP 2023 [paper] 🔺
MagicGeo: Training-Free Text-Guided Geometric Diagram Generation, arXiv:2502.13855 [paper] [MagicGeoBench data] 🔺
GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions, arXiv:2504.10146 [paper]

Geometric Construction Problem

Learning to Solve Geometric Construction Problems from Images, CICM 2021 [paper] 🔺
EuclidNet: Deep Visual Reasoning for Constructible Problems in Geometry, AIML 2023 [paper] 🔺
Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large Language Models, Findings of EMNLP 2024 [paper] [Euclidea/PyEuclidea data] 🔺

Geometric Figure Retrieval

Plane Geometry Figure Retrieval Based on Bilayer Geometric Attributed Graph Matching, ICPR 2014 [paper] 🔺 ❌
Plane Geometry Figure Retrieval with Bag of Shapes, IAPR 2014 DAS Workshop [paper] 🔺 ❌
Plane Geometry Diagram Retrieval by Using Hierarchical Searching Strategy, ICIMCS 2016 [paper] 🔺 ❌
Analysis of Stroke Intersection for Overlapping PGF Elements, IAPR 2016 DAS Workshop [paper] 🔺 ❌
Improving PGF Retrieval Effectiveness with Active Learning, ICPR 2016 [paper] 🔺 ❌
Improving Retrieval of Plane Geometry Figure with Learning to Rank, PTRL 2016 [paper] 🔺 ❌

Geometric Autoformalization

Autoformalizing Euclidean Geometry, ICML 2024 [paper] [LeanEuclid data] 🔺

Methods - Architectures

Encoder-Decoder

Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning, ACL 2021 [paper]
GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning, Findings of ACL 2021 [paper]
Sequence to General Tree Knowledge-Guided Geometry Word Problem Solving, ACL-IJCNLP 2021 [paper]
A Graph Convolutional Network Feature Learning Framework for Interpretable Geometry Problem Solving, IEIR 2022 [paper]
An Augmented Benchmark Dataset for Geometric Question Answering through Dual Parallel Text Encoding, COLING 2022 [paper]
UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression, EMNLP 2022 [paper]
Solving Geometry Problems via Feature Learning and Contrastive Learning of Multimodal Data, CMES 2023 [paper]
A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram, IJCAI 2023 [paper]
UniMath: A Foundational and Multimodal Mathematical Reasoner, EMNLP 2023 [paper] 🔵
Interpretable Geometry Problem Solving Using Improved RetinaNet and Graph Convolutional Network, Electronics 2023 [paper]
A Symbolic Characters Aware Model for Solving Geometry Problems, MM 2023 [paper]
The Geometric Neural Solution Combined with Text Diagram Parsing, IEIR 2023 [paper]
SUFFI-GPSC: Sufficient Geometry Problem Solution Checking with Symbolic Computation and Logical Reasoning, ICCWAMTIP 2023 [paper]
LANS: A Layout-Aware Neural Solver for Plane Geometry Problem, Findings of ACL 2024 [paper]
GAPS: Geometry-Aware Problem Solver, arXiv:2401.16287 [paper]
E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator, CVPR 2024 [paper]
FGeo-TP: A Language Model-Enhanced Solver for Geometry Problems, Symmetry 2024 [paper]
FGeo-DRL: Deductive Reasoning for Geometric Problems Through Deep Reinforcement Learning, Symmetry 2024 [paper]
FGeo-HyperGNet: Geometric Problem Solving Integrating Formal Symbolic System and Hypergraph Neural Network, arXiv:2402.11461 [paper]
GOLD: Geometry Problem Solver With Natural Language Description, Findings of NAACL 2024 [paper]
Learning to Solve Geometry Problems via Simulating Human Dual-Reasoning Process, IJCAI 2024 [paper]
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models, Findings of EMNLP 2024 [paper] 🔵
Fuse, Reason and Verify: Geometry Problem Solving With Parsed Clauses From Diagram, arXiv:2407.07327 [paper]
EAGLE: Elevating Geometric Reasoning Through LLM-Empowered Visual Instruction Tuning, arXiv:2408.11397 [paper]
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models, arXiv:2409.00147 [paper] 🔵
MathGLM-Vision: Solving Mathematical Problems With Multi-Modal Large Language Model, arXiv:2409.13729 [paper] 🔵
Enhancing Geometry Problem Solving With Attention Mechanism and Super-Resolution, ICBASE 2024 [paper]
Geo-Qwen: A Geometry Problem-Solving Method Based on Generative Large Language Models and Heuristic Reasoning, ICCWAMTIP 2024 [paper]
Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems With Meta In-Context Learning, LGM3A 2024 [paper]
A Geometric Neural Solving Method Based on a Diagram Text Information Fusion Analysis, Sci. Rep. 2024 [paper]
Maths: Multimodal Transformer-Based Human-Readable Solver, ICME 2024 [paper]
R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models, arXiv:2410.17885 [paper]
SANS: Spatial-Aware Neural Solver for Plane Geometry Problem, ICPR 2024 [paper]
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model, ICLR 2025 [paper]
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine, ICLR 2025 [paper]
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-Training, ICLR 2025 [paper]
Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver, ICASSP 2025 [paper]
GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder, arXiv:2502.11360 [paper]
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning, AAAI 2025 [paper] 🔵
GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models, Findings of NAACL 2025 [paper]
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search, arXiv:2503.10582 [paper] [VisualWebInstruct data] 🔵
GNS: Solving Plane Geometry Problems by Neural-Symbolic Reasoning with Multi-Modal LLMs, AAAI 2025 [paper]
Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration, arXiv:2504.12773 [paper]
RedStar: Does Scaling Long-CoT Data Unlock Better Slow-Reasoning Systems?, arXiv:2501.11284 [paper]
Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs, arXiv:2501.06430 [paper]
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL, arXiv:2503.07536 [paper]

Other Architectures

Geometry Problem Solving Based on Counter-factual Evolutionary Reasoning, CASE 2023 [paper]
GeoDRL: A Self-Learning Framework for Geometry Problem Solving using Reinforcement Learning in Deductive Reasoning, ACL 2023 [paper]
Hologram Reasoning for Solving Algebra Problems With Geometry Diagrams, arXiv:2408.10592 [paper]
Solving Olympiad Geometry Without Human Demonstrations, Nature 2024 [paper]
Wu’s Method Boosts Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry, NeurIPS 2024 MATH-AI Workshop [paper]
Proposing and Solving Olympiad Geometry with Guided Tree Search, arXiv:2412.10673 [paper]
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2, arXiv:2502.03544 [paper]
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving, NeurIPS 2024 [paper] 🔵
MultiLingPoT: Enhancing Mathematical Reasoning with Multilingual Program Fine-tuning, arXiv:2412.12609 [paper] 🔵
MathScale: Scaling Instruction Tuning for Mathematical Reasoning, ICML 2024 [paper] 🔵
System-2 Mathematical Reasoning via Enriched Instruction Tuning, arXiv:2412.16964 [paper] 🔵
Pi-GPS: Enhancing Geometry Problem Solving by Unleashing the Power of Diagrammatic Information, arXiv:2503.05543 [paper]
GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions, arXiv:2504.10146 [paper]
Offline Training of Language Model Agents with Functions as Learnable Weights, ICML 2024 [paper] 🔵
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving, NeurIPS 2024 [paper] 🔵
MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems, NeurIPS 2024 [paper] 🔵
Breaking Mental Set to Improve Reasoning through Diverse Multi-Agent Debate, ICLR 2025 [paper] 🔵

Methods - Training Stage

Pre-Training

GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning, Findings of ACL 2021 [paper]
UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression, EMNLP 2022 [paper]
A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram, IJCAI 2023 [paper]
Fuse, Reason and Verify: Geometry Problem Solving With Parsed Clauses From Diagram, arXiv:2407.07327 [paper]
A Symbolic Characters Aware Model for Solving Geometry Problems, MM 2023 [paper]
LANS: A Layout-Aware Neural Solver for Plane Geometry Problem, Findings of ACL 2024 [paper]
A Geometric Neural Solving Method Based on a Diagram Text Information Fusion Analysis, Sci. Rep. 2024 [paper]
SANS: Spatial-Aware Neural Solver for Plane Geometry Problem, ICPR 2024 [paper]
Measuring Mathematical Problem Solving With the MATH Dataset, NeurIPS 2021 [paper] 🔵
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning, NeurIPS 2024 MATH-AI Workshop [paper] 🔵
Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver, ICASSP 2025 [paper]
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-Training, ICLR 2025 [paper]
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine, ICLR 2025 [data]
GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder, arXiv:2502.11360 [paper]
Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs, arXiv:2501.06430 [paper]

Supervised Fine-Tuning

Synthetic Data Generator for Solving Korean Arithmetic Word Problem, Mathematics 2022 [paper] 🔵
GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning, ICML 2024 AI4MATH Workshop [paper]
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine, ICLR 2025 [paper]
Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration, arXiv:2504.12773 [paper]
TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving, arXiv:2504.15780 [paper]
MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams, arXiv:2503.20745 [paper] [MATHGLANCE/GeoPeP data]
Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding, Findings of ACL 2025 [paper] [CogAlign-Probing/CogAlign-train data]
VisOnlyQA: Large Vision Language Models Still Struggle With Visual Perception of Geometric Information, arXiv:2412.00947 [paper] 🔵
MathScale: Scaling Instruction Tuning for Mathematical Reasoning, ICML 2024 [paper] 🔵
Key-Point-Driven Data Synthesis with Its Enhancement on Mathematical Reasoning, AAAI 2025 [paper] 🔵
R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models, arXiv:2410.17885 [paper]
VISTA: Visual Integrated System for Tailored Automation in Math Problem Generation Using LLM, PMLR 2025 [paper]
Feynman: Knowledge-Infused Diagramming Agent for Scaling Visual Reasoning Data, OpenReview 2025 [paper] 🔵
Proposing and Solving Olympiad Geometry with Guided Tree Search, arXiv:2412.10673 [paper]
An Augmented Benchmark Dataset for Geometric Question Answering through Dual Parallel Text Encoding, COLING 2022 [paper]
A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram, IJCAI 2023 [paper]
GAPS: Geometry-Aware Problem Solver, arXiv:2401.16287 [paper]
E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator, CVPR 2024 [paper]
FormalGeo: An Extensible Formalized Framework for Olympiad Geometric Problem Solving, arXiv:2310.18021 [paper] ❌
A Deep Reinforcement Learning Agent for Geometry Online Tutoring, KAIS 2023 [paper]
SANS: Spatial-Aware Neural Solver for Plane Geometry Problem, ICPR 2024 [paper]
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving, NeurIPS 2024 [paper] 🔵
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models, Findings of EMNLP 2024 [paper] [MathV360K data] 🔵
GeoVQA: A Comprehensive Multimodal Geometry Dataset for Secondary Education, MIPR 2024 [paper]
Advancing Multimodal LLMs: A Focus on Geometry Problem Solving Reasoning and Sequential Scoring, MMASIA 2024 [paper]
Vision-Language Models Can Self-Improve Reasoning via Reflection, arXiv:2411.00855 [paper] 🔵
MultiLingPoT: Enhancing Mathematical Reasoning with Multilingual Program Fine-tuning, arXiv:2412.12609 [paper] 🔵
System-2 Mathematical Reasoning via Enriched Instruction Tuning, arXiv:2412.16964 [paper] 🔵
MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification, arXiv:2502.13383 [paper] 🔵
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model, ICLR 2025 [paper]
GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models, Findings of NAACL 2025 [paper]
M3CoT: A Novel Benchmark for Multi-Domain Multi-Step Multi-Modal Chain-of-Thought, ACL 2024 [paper] 🔵
URSA: Understanding and Verifying Chain-of-Thought Reasoning in Multimodal Mathematics, arXiv:2501.04686 [paper] 🔵
GNS: Solving Plane Geometry Problems by Neural-Symbolic Reasoning with Multi-Modal LLMs, AAAI 2025 [paper]
LLaVA-o1: Let Vision Language Models Reason Step-by-Step, arXiv:2411.10440 [paper] 🔵
AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning, arXiv:2411.11930 [paper] 🔵
RedStar: Does Scaling Long-CoT Data Unlock Better Slow-Reasoning Systems?, arXiv:2501.11284 [paper]
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM, arXiv:2501.01904 [paper] 🔵
BBA: Bi-Modal Behavioral Alignment for Reasoning with Large Vision-Language Models, Findings of ACL 2024 [paper]
GeoGPT4V: Towards Geometric Multi-Modal Large Language Models With Geometric Image Generation, EMNLP 2024 [paper]
Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver, ICASSP 2025 [paper]
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-Training, ICLR 2025 [paper]
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement, arXiv:2504.07934 [paper] 🔵
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search, arXiv:2503.10582 [paper] 🔵

Reinforcement Learning

A Deep Reinforcement Learning Agent for Geometry Online Tutoring, KAIS 2023 [paper]
GeoDRL: A Self-Learning Framework for Geometry Problem Solving using Reinforcement Learning in Deductive Reasoning, ACL 2023 [paper]
FGeo-DRL: Deductive Reasoning for Geometric Problems Through Deep Reinforcement Learning, Symmetry 2024 [paper]
Hologram Reasoning for Solving Algebra Problems With Geometry Diagrams, arXiv:2408.10592 [paper]
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL, arXiv:2503.07536 [paper]
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models, arXiv:2409.00147 [paper] 🔵
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine, ICLR 2025 [paper]
RedStar: Does Scaling Long-CoT Data Unlock Better Slow-Reasoning Systems?, arXiv:2501.11284 [paper]
Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding, Findings of ACL 2025 [paper]
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models, arXiv:2503.06749 [paper] 🔵
OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement, arXiv:2503.17352 [paper] 🔵
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning, arXiv:2503.20752 [paper] 🔵
Boosting the Generalization and Reasoning of Vision Language Models with Curriculum Reinforcement Learning, arXiv:2503.07065 [paper] 🔵
GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions, arXiv:2504.10146 [paper]
Boosting MLLM Reasoning with Text-Debiased Hint-GRPO, arXiv:2503.23905 [paper]
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models, arXiv:2504.11468 [paper] 🔵
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation, arXiv:2504.13055 [paper] 🔵
Reinforcement Learning for Reasoning in Large Language Models with One Training Example, arXiv:2504.20571 [paper] 🔵
GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning, arXiv:2504.02546 [paper]

Methods - Inference Stage

Test-Time Scaling

Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling, ICML 2024 [paper] 🔵
Hint-before-Solving Prompting: Guiding LLMs to Effectively Utilize Encoded Knowledge, arXiv:2402.14310 [paper] 🔵
Null-Shot Prompting: Rethinking Prompting Large Language Models With Hallucination, EMNLP 2024 [paper] 🔵
Cumulative Reasoning with Large Language Models, arXiv:2308.04371 [paper] 🔵
Progressive-Hint Prompting Improves Reasoning in Large Language Models, ICML 2024 AI4MATH Workshop [paper] 🔵
MathSensei: Mathematical Reasoning with a Tool-Augmented Large Language Model, ICLR 2024 ME-FoMo Workshop [paper] 🔵
Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation, NeurIPS 2024 [paper] 🔵
SBSC: Step-By-Step Coding for Improving Mathematical Olympiad Performance, ICLR 2025 [paper] 🔵
Reason-and-Execute Prompting: Enhancing Multi-Modal Large Language Models for Solving Geometry Questions, MM 2024 [paper]
Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts, EMNLP 2023 [paper] 🔵
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models, Findings of EMNLP 2023 [paper] 🔵
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving, ICLR 2024 [paper] 🔵
Evaluating Automated Geometric Problem Solving With Formal Language Generation on Large Multimodal Models, IEIR 2024 [paper]
Describe-then-Reason: Improving Multimodal Mathematical Reasoning Through Visual Comprehension Training, arXiv:2404.14604 [paper] 🔵
Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning, arXiv:2410.05928 [paper]
Multi-Step Chain-of-Thought in Geometry Problem Solving, EIECS 2024 [paper]
Proposing and Solving Olympiad Geometry with Guided Tree Search, arXiv:2412.10673 [paper]
Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration, arXiv:2504.12773 [paper]
Solving Olympiad Geometry Without Human Demonstrations, Nature 2024 [paper]
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2, arXiv:2502.03544 [paper]
GeoDRL: A Self-Learning Framework for Geometry Problem Solving using Reinforcement Learning in Deductive Reasoning, ACL 2023 [paper]
GAPS: Geometry-Aware Problem Solver, arXiv:2401.16287 [paper]
LLaVA-o1: Let Vision Language Models Reason Step-by-Step, arXiv:2411.10440 [paper] 🔵
FGeo-DRL: Deductive Reasoning for Geometric Problems Through Deep Reinforcement Learning, Symmetry 2024 [paper]
MC-NEST: Enhancing Mathematical Reasoning in Large Language Models with a Monte Carlo Nash Equilibrium Self-Refine Tree, arXiv:2411.15645 [paper] 🔵
Mulberry: Empowering MLLM With o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search, arXiv:2412.18319 [paper] 🔵
Progressive Multimodal Reasoning via Active Retrieval, arXiv:2412.14835 [paper] 🔵
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking, arXiv:2502.02339 [paper] 🔵
VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search, arXiv:2504.09130 [paper]
Deliberate Reasoning for LLMs as Structure-Aware Planning with Accurate World Model, arXiv:2410.03136 [paper] 🔵
AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning, arXiv:2411.11930 [paper] 🔵
Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?, arXiv:2503.06252 [paper] 🔵
URSA: Understanding and Verifying Chain-of-Thought Reasoning in Multimodal Mathematics, arXiv:2501.04686 [paper] 🔵
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning, arXiv:2503.10291 [paper] 🔵
VILBENCH: A Suite for Vision-Language Process Reward Modeling, arXiv:2503.20271 [paper] 🔵
PRM-BAS: Enhancing Multimodal Reasoning through PRM-guided Beam Annealing Search, arXiv:2504.10222 [paper] 🔵
Learning From Correctness Without Prompting Makes LLM Efficient Reasoner, COLM 2024 [paper] 🔵
Generative Verifiers: Reward Modeling as Next-Token Prediction, ICLR 2025 [paper] 🔵
Vision-Language Models Can Self-Improve Reasoning via Reflection, arXiv:2411.00855 [paper] 🔵
Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning, arXiv:2504.09772 [paper] 🔵

Knowledge-Augmented Inference

Give me a Hint: Can LLMs Take a Hint to Solve Math Problems?, NeurIPS 2024 MATH-AI Workshop [paper] 🔵
Skills-in-Context: Unlocking Compositionality in Large Language Models, Findings of EMNLP 2024 [paper] 🔵
Curriculum Demonstration Selection for In-Context Learning, SAC 2025 [paper] 🔵
CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models, AAAI 2025 [paper] 🔵
All in an Aggregated Image for In-Image Learning, arXiv:2402.17971 [paper] 🔵
Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems With Meta In-Context Learning, LGM3A 2024 [paper]
Advancing Multimodal LLMs: A Focus on Geometry Problem Solving Reasoning and Sequential Scoring, MMASIA 2024 [paper]
GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models, Findings of NAACL 2025 [paper]
Enhancing LLM Reasoning via Vision-Augmented Prompting, NeurIPS 2024 [paper]
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models, NeurIPS 2024 [paper] 🔵
Interactive Sketchpad: A Multimodal Tutoring System for Collaborative, Visual Problem-Solving, arXiv:2503.16434 [paper] 🔵
CogCom: A Visual Language Model with Chain-of-Manipulations Reasoning, ICLR 2025 [paper] 🔵
VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search, arXiv:2504.09130 [paper]
Learning to Plan by Updating Natural Language, Findings of EMNLP 2024 [paper] 🔵
Explicit Memory Learning with Expectation Maximization, EMNLP 2024 [paper] 🔵

Related Surveys

The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers, TPAMI 2019 [paper] 🔵 ❌
Deep Learning in Automatic Math Word Problem Solvers, AI in Learning: Designing the Future 2022 [article] 🔵
Evolution of Automated Deduction and Dynamic Constructions in Geometry, Mathematics Education in the Age of Artificial Intelligence: How Artificial Intelligence can Serve Mathematical Human Learning 2022 [article] ❌
A Survey of Deep Learning for Mathematical Reasoning, ACL 2023 [paper] 🔵
Systematic Literature Review: Application of Dynamic Geometry Software to Improve Mathematical Problem-Solving Skills, Mathline: Jurnal Matematika Dan Pendidikan Matematika 2023 [paper] ❌
A Survey of Reasoning with Foundation Models, arXiv:2312.11562 [paper] 🔵
A Survey of Reasoning with Foundation Models: Concepts, Methodologies, and Outlook, ACM Comput. Surv. 2023 [paper] 🔵
Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges, arXiv:2401.08664 [paper] 🔵
Large Language Models for Mathematical Reasoning: Progresses and Challenges, EACL 2024 [paper] 🔵
A Survey on Deep Learning for Theorem Proving, COLM 2024 [paper] 🔵
A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery, EMNLP 2024 [paper] 🔵
Towards Robust Automated Math Problem Solving: A Survey of Statistical and Deep Learning Approaches, Evol. Intell. 2024 [paper] 🔵
A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges, Findings of ACL 2025 [paper] 🔵
Decoding Math: A Review of Datasets Shaping AI-Driven Mathematical Reasoning, JIM 2025 [paper] 🔵
Visual Large Language Models for Generalized and Specialized Application, arXiv:2501.02765 [paper] 🔵
From System 1 to System 2: A Survey of Reasoning Large Language Models, arXiv:2502.17419 [paper] 🔵
Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents, arXiv:2503.24047 [paper] 🔵

Years

2014

Diagram Understanding in Geometry Questions, AAAI 2014 [paper] ❌
Synthesis of Geometry Proof Problems, AAAI 2014 [paper] ❌
Plane Geometry Figure Retrieval Based on Bilayer Geometric Attributed Graph Matching, ICPR 2014 [paper] 🔺 ❌
Plane Geometry Figure Retrieval with Bag of Shapes, IAPR 2014 DAS Workshop [paper] 🔺 ❌

2015

Solving Geometry Problems: Combining Text and Diagram Interpretation, EMNLP 2015 [paper] ❌
Automatic Reconstruction of Plane Geometry Figures in Documents, EITT 2015 [paper] 🔺 ❌
Overlapped-Triangle Analysis with Hierarchical Ranking of Dominance, ICDAR 2015 [paper] ❌
Solid Geometric Object Reconstruction from Single Line Drawing Image, GRAPP 2015 [paper] 🔺 ❌

2016

Plane Geometry Diagram Retrieval by Using Hierarchical Searching Strategy, ICIMCS 2016 [paper] 🔺 ❌
Analysis of Stroke Intersection for Overlapping PGF Elements, IAPR 2016 DAS Workshop [paper] 🔺 ❌
AnalyticalInk: An Interactive Learning Environment for Math Word Problem Solving, IUI 2016 [paper] ❌
My Computer Is an Honor Student — but How Intelligent Is It? Standardized Tests as a Measure of AI, AIMA 2016 [paper] 🔵 ❌
Improving PGF Retrieval Effectiveness with Active Learning, ICPR 2016 [paper] 🔺 ❌
Improving Retrieval of Plane Geometry Figure with Learning to Rank, PTRL 2016 [paper] 🔺 ❌
Recovering Solid Geometric Object from Single Line Drawing Image, Multimed Tools Appl 2016 [paper] 🔺 ❌
An Example-based Approach to 3D Man-made Object Reconstruction from Line Drawings, Pattern Recogn 2016 [paper] 🔺 ❌
Context-aware Geometric Object Reconstruction for Mobile Education, MM 2016 [paper] 🔺 ❌

2017

Semantic Parsing of Pre-University Math Problems, ACL 2017 [paper] ❌
Synthesis of Solutions for Shaded Area Geometry Problems, FLAIRS 2017 [paper] ❌
From Textbooks to Knowledge: A Case Study in Harvesting Axiomatic Knowledge from Textbooks to Solve Geometry Problems, EMNLP 2017 [paper] ❌
Learning to Solve Geometry Problems from Natural Language Demonstrations in Textbooks, *SEM 2017 [paper] ❌
Understanding Explicit Arithmetic Word Problems and Explicit Plane Geometry Problems Using Syntax-Semantics Models, IALP 2017 [paper] ❌
Beyond Sentential Semantic Parsing: Tackling the Math SAT with a Cascade of Tree Transducers, EMNLP 2017 [paper] 🔵 ❌
Retrieving Geometric Information from Images: The Case of Hand-Drawn Diagrams, KDD 2017 [paper] ❌
Understanding Plane Geometry Problems by Integrating Relations Extracted from Text and Diagram, PSIVT 2017 [paper] ❌
Automatic Assessment of Student Answers for Geometric Theorem Proving Questions, MERCon 2017 [paper] ❌

2018

Automatic Understanding and Formalization of Natural Language Geometry Problems Using Syntax-Semantics Models, IJICIC 2018 [paper] ❌
Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples, ACL 2018 [paper]

2019

A Sharing Framework for Solving Explicit Arithmetic Word Problems and Proving Plane Geometry Theorems, IJPRAI 2019 [paper] ❌
Discourse in Multimedia: A Case Study in Extracting Geometry Knowledge from Textbooks, CL 2019 [paper] ❌
Automatically Proving Plane Geometry Theorems Stated by Text and Diagram, IJPRAI 2019 [paper] ❌
Automatic Understanding and Formalization of Plane Geometry Proving Problems in Natural Language: A Supervised Approach, IJAIT 2019 [paper] ❌
A Neural Semantic Parser for Math Problems Incorporating Multi-Sentence Information, TALLIP 2019 [paper] 🔵
AiFu at SemEval-2019 Task 10: A Symbolic and Sub-symbolic Integrated System for SAT Math Question Answering, SemEval 2019 [paper] 🔵
Robot for Mathematics College Entrance Examination, ATCM 2019 [paper] 🔵
SemEval-2019 Task 10: Math Question Answering, SemEval 2019 [paper] 🔵
The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers, TPAMI 2019 [paper] 🔵 ❌
ProblemSolver at SemEval-2019 Task 10: Sequence-to-Sequence Learning and Expression Trees, SemEval 2019 [paper] 🔵

2020

Two-step Memory Networks for Deep Semantic Parsing of Geometry Word Problems, SOFSEM 2020 [paper]
A Novel Geometric Information Retrieval Tool for Images of Geometric Diagrams, ICISE-IE 2020 [paper]
Applied Aspects of the Integrated Problem Solving System with Natural Language Interface, Inforino 2020 [paper] ❌
Design an Intelligent Problem Solver in Geometry based on Knowledge Model of Relations, Engineering Letters 2020 [paper] ❌
Ontology-Controlled Geometric Solver, RCAI 2020 [paper] ❌
2D Geometric Shapes Dataset – For Machine Learning and Pattern Recognition, Data in Brief 2020 [paper]

2021

Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning, ACL 2021 [paper]
GeoRE: A Relation Extraction Dataset for Chinese Geometry Problems, NeurIPS 2021 MATHAI4ED Workshop [paper]
GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning, Findings of ACL 2021 [paper]
Semantic Parsing of Geometry Statements Using Supervised Machine Learning on Synthetic Data, NatFoM 2021 CICM Workshop [paper]
Learning to Solve Geometric Construction Problems from Images, CICM 2021 [paper] 🔺
Automated Generation of Illustrations for Synthetic Geometry Proofs, ADG 2021 [paper] 🔺 ❌
Solving Solid Geometric Calculation Problems in Text, TALE 2021 [paper] ❌
Automated Discovery of Geometrical Theorems in GeoGebra, ThEdu 2021 [paper] ❌
Automatically Building Diagrams for Olympiad Geometry Problems, CADE 2021 [paper] 🔺 ❌
Linguistic Processor Integration for Solving Planimetric Problems, IJCINI 2021 [paper] ❌
A Paradigm of Diagram Understanding in Problem Solving, TALE 2021 [paper] ❌
Sequence to General Tree Knowledge-Guided Geometry Word Problem Solving, ACL-IJCNLP 2021 [paper]
Proving Geometric Problem by Adding Auxiliary Lines-Based on Hypothetical Test, AIET 2021 [paper] ❌
Solving Shaded Area Problems by Constructing Equations, AIET 2021 [paper] ❌
Cognitive Patterns for Semantic Presentation of Natural-Language Descriptions of Well-Formalizable Problems, RCAI 2021 [paper] ❌
Measuring Mathematical Problem Solving With the MATH Dataset, NeurIPS 2021 [paper] 🔵

2022

A Graph Convolutional Network Feature Learning Framework for Interpretable Geometry Problem Solving, IEIR 2022 [paper]
An Augmented Benchmark Dataset for Geometric Question Answering through Dual Parallel Text Encoding, COLING 2022 [paper]
Plane Geometry Diagram Parsing, IJCAI 2022 [paper]
Learning to Understand Plane Geometry Diagram, NeurIPS 2022 MATH-AI Workshop [paper]
PGDP5K: A Diagram Parsing Dataset for Plane Geometry Problems, ICPR 2022 [paper]
UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression, EMNLP 2022 [paper]
A Novel Geometry Problem Understanding Method based on Uniform Vectorized Syntax-Semantics Model, IEIR 2022 [paper]
Geoclidean: Few-Shot Generalization in Euclidean Geometry, NeurIPS 2022 [paper]
A Method for Expanding Predicates and Rules in Automated Geometry Reasoning System, Mathematics 2022 [paper] ❌
Research on Geometry Problem Text Understanding Based on Bidirectional LSTM-CRF, ICDH 2022 [paper]
Synthetic Data Generator for Solving Korean Arithmetic Word Problem, Mathematics 2022 [paper] 🔵
Usage of Stacked Long Short-Term Memory for Recognition of 3D Analytic Geometry Elements, ICAART 2022 [paper]
Complex Modeling of Inductive and Deductive Reasoning by the Example of a Planimetric Problem Solver, IITI 2022 [paper] ❌
Natural Language Processing and Functioning Ontological Solver with Visualization in an Integrated System, IntelliSys 2022 [paper] ❌
Beyond the Imitation Game: Quantifying and Extrapolating the Capabilities of Language Models, TMLR 2022 [paper] 🔵
Deep Learning in Automatic Math Word Problem Solvers, AI in Learning: Designing the Future 2022 [article] 🔵
Evolution of Automated Deduction and Dynamic Constructions in Geometry, Mathematics Education in the Age of Artificial Intelligence: How Artificial Intelligence can Serve Mathematical Human Learning 2022 [article] ❌
Supervised Learning Use to Acquire Knowledge from 2D Analytic Geometry Problems, ACIIDS 2022 [paper] ❌
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks, ACL 2022 [paper] 🔵
Lila: A Unified Benchmark for Mathematical Reasoning, EMNLP 2022 [paper] 🔵

2023

EuclidNet: Deep Visual Reasoning for Constructible Problems in Geometry, AIML 2023 [paper] 🔺
Solving Geometry Problems via Feature Learning and Contrastive Learning of Multimodal Data, CMES 2023 [paper]
A Deep Reinforcement Learning Agent for Geometry Online Tutoring, KAIS 2023 [paper]
GeoDRL: A Self-Learning Framework for Geometry Problem Solving using Reinforcement Learning in Deductive Reasoning, ACL 2023 [paper]
Solving Algebraic Problems with Geometry Diagrams Using Syntax-Semantics Diagram Understanding, Computers, Materials & Continua 2023 [paper] ❌
A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram, IJCAI 2023 [paper]
UniMath: A Foundational and Multimodal Mathematical Reasoner, EMNLP 2023 [paper] 🔵
Geometry Problem Solving Based on Counter-factual Evolutionary Reasoning, CASE 2023 [paper]
Interpretable Geometry Problem Solving Using Improved RetinaNet and Graph Convolutional Network, Electronics 2023 [paper]
A Symbolic Characters Aware Model for Solving Geometry Problems, MM 2023 [paper]
FormalGeo: An Extensible Formalized Framework for Olympiad Geometric Problem Solving, arXiv:2310.18021 [paper] ❌
A Precise Text-to-Diagram Generation Method for Elementary Geometry, ICCWAMTIP 2023 [paper] 🔺
The Geometric Neural Solution Combined with Text Diagram Parsing, IEIR 2023 [paper]
SUFFI-GPSC: Sufficient Geometry Problem Solution Checking with Symbolic Computation and Logical Reasoning, ICCWAMTIP 2023 [paper]
Extracting Structured Information from the Textual Description of Geometry Word Problems, NLPIR 2023 [paper] ❌
Conic10K: A Challenging Math Problem Understanding and Reasoning Dataset, Findings of EMNLP 2023 [paper]
Automated Evaluation of Student Answers for Geometric Questions Based on the Theorem 'Angles on a Straight Line Add to 180°', SLAAI-ICAI 2023 [paper]
Estimating Answer Strategies using Online Handwritten Data: A Study using Geometry Problems, ICETC 2023 [paper]
It Ain’t Over: A Multi-Aspect Diverse Math Word Problem Dataset, EMNLP 2023 [paper] 🔵
Visual Amplification of Geometry Problems: A Method for Synchronized Highlighting in Text and Diagrams, IEIR 2023 [paper]
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them, Findings of ACL 2023 [paper]
M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models, NeurIPS 2023 [paper] 🔵
TheoremQA: A Theorem-driven Question Answering Dataset, EMNLP 2023 [paper] 🔵
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models, Findings of EMNLP 2023 [paper] 🔵
Cumulative Reasoning with Large Language Models, arXiv:2308.04371 [paper] 🔵
Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts, EMNLP 2023 [paper] 🔵
Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges, arXiv:2401.08664 [paper] 🔵
A Survey of Deep Learning for Mathematical Reasoning, ACL 2023 [paper] 🔵
Systematic Literature Review: Application of Dynamic Geometry Software to Improve Mathematical Problem-Solving Skills, Mathline: Jurnal Matematika Dan Pendidikan Matematika 2023 [paper] ❌
A Survey of Reasoning with Foundation Models, arXiv:2312.11562 [paper] 🔵
A Survey of Reasoning with Foundation Models: Concepts, Methodologies, and Outlook, ACM Comput. Surv. 2023 [paper] 🔵
Geometry Problem Solving Based on Deep Learning, CSMIS 2023 [paper]

2024

LANS: A Layout-Aware Neural Solver for Plane Geometry Problem, Findings of ACL 2024 [paper]
GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning, ICML 2024 AI4MATH Workshop [paper]
Solving Olympiad Geometry Without Human Demonstrations, Nature 2024 [paper]
GAPS: Geometry-Aware Problem Solver, arXiv:2401.16287 [paper]
E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator, CVPR 2024 [paper]
Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large Language Models, Findings of EMNLP 2024 [paper] 🔺
FGeo-TP: A Language Model-Enhanced Solver for Geometry Problems, Symmetry 2024 [paper]
FGeo-DRL: Deductive Reasoning for Geometric Problems Through Deep Reinforcement Learning, Symmetry 2024 [paper]
FGeo-SSS: A Search-Based Symbolic Solver for Human-Like Automated Geometric Reasoning, Symmetry 2024 [paper] ❌
FGeo-HyperGNet: Geometric Problem Solving Integrating Formal Symbolic System and Hypergraph Neural Network, arXiv:2402.11461 [paper]
OlympiadBench: A Challenging Benchmark for Promoting AGI With Olympiad-Level Bilingual Multimodal Scientific Problems, ACL 2024 [paper] 🔵
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts, ICLR 2024 [paper] 🔵
MathVerse: Does Your Multi-Modal LLM Truly See the Diagrams in Visual Math Problems?, ECCV 2024 [paper] 🔵
Measuring Multimodal Mathematical Reasoning With MATH-Vision Dataset, NeurIPS 2024 [paper] 🔵
MM-MATH: Advancing Multimodal Math Evaluation With Process Evaluation and Fine-Grained Classification, Findings of EMNLP 2024 [paper] 🔵
Wu’s Method Boosts Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry, NeurIPS 2024 MATH-AI Workshop [paper]
GOLD: Geometry Problem Solver With Natural Language Description, Findings of NAACL 2024 [paper]
Learning to Solve Geometry Problems via Simulating Human Dual-Reasoning Process, IJCAI 2024 [paper]
Autoformalizing Euclidean Geometry, ICML 2024 [paper] 🔺
GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving, Findings of ACL 2024 [paper]
GeoGPT4V: Towards Geometric Multi-Modal Large Language Models With Geometric Image Generation, EMNLP 2024 [paper]
Figuring Figures: An Assessment of Large Language Models on Different Modalities of Math Word Problems, ICMLT 2024 [paper]
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models, Findings of EMNLP 2024 [paper] 🔵
We-Math: Does Your Large Multimodal Model Achieve Human-Like Mathematical Reasoning?, arXiv:2407.01284 [paper] 🔵
Fuse, Reason and Verify: Geometry Problem Solving With Parsed Clauses From Diagram, arXiv:2407.07327 [paper]
Is Your Model Really a Good Math Reasoner? Evaluating Mathematical Reasoning With Checklist, arXiv:2407.08733 [paper]
Hologram Reasoning for Solving Algebra Problems With Geometry Diagrams, arXiv:2408.10592 [paper]
EAGLE: Elevating Geometric Reasoning Through LLM-Empowered Visual Instruction Tuning, arXiv:2408.11397 [paper]
Tangram: A Challenging Benchmark for Geometric Element Recognizing, arXiv:2408.13854 [paper]
Leveraging Two-Level Deep Learning Classifiers for 2D Shape Recognition to Automatically Solve Geometry Math Word Problems, PAA 2024 [paper]
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models, arXiv:2409.00147 [paper] 🔵
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning, NeurIPS 2024 MATH-AI Workshop [paper] 🔵
MathGLM-Vision: Solving Mathematical Problems With Multi-Modal Large Language Model, arXiv:2409.13729 [paper] 🔵
Enhancing Geometry Problem Solving With Attention Mechanism and Super-Resolution, ICBASE 2024 [paper]
Automated Generation of Geometry Proof Problems Based on Point Geometry Identity, Journal of Automated Reasoning 2024 [paper] ❌
Geo-Qwen: A Geometry Problem-Solving Method Based on Generative Large Language Models and Heuristic Reasoning, ICCWAMTIP 2024 [paper]
Formal Representation and Solution of Plane Geometric Problems, NeurIPS 2024 MATH-AI Workshop [paper]
GeoVQA: A Comprehensive Multimodal Geometry Dataset for Secondary Education, MIPR 2024 [paper]
Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems With Meta In-Context Learning, LGM3A 2024 [paper]
Reason-and-Execute Prompting: Enhancing Multi-Modal Large Language Models for Solving Geometry Questions, MM 2024 [paper]
R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models, arXiv:2410.17885 [paper]
LLaVA-o1: Let Vision Language Models Reason Step-by-Step, arXiv:2411.10440 [paper] 🔵
Vision-Language Models Can Self-Improve Reasoning via Reflection, arXiv:2411.00855 [paper] 🔵
A Geometric Neural Solving Method Based on a Diagram Text Information Fusion Analysis, Sci. Rep. 2024 [paper]
Slow Perception: Let's Perceive Geometric Figures Step-by-Step, arXiv:2412.20631 [paper]
Maths: Multimodal Transformer-Based Human-Readable Solver, ICME 2024 [paper]
Automatic Extraction of Structured Information from Elementary Level Geometry Questions into Logic Forms, Multimed Tools Appl 2024 [paper]
Improving Multimodal LLMs Ability in Geometry Problem Solving, Reasoning, and Multistep Scoring, arXiv:2412.00846 [paper]
Advancing Multimodal LLMs: A Focus on Geometry Problem Solving Reasoning and Sequential Scoring, MMASIA 2024 [paper]
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models, NeurIPS 2024 [paper] 🔵
2D Shape Detection for Solving Geometry Word Problems, IETE J. Res. 2024 [paper] ❌
VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning, arXiv:2410.22995 [paper] 🔵
Euclid: Supercharging Multimodal LLMs With Synthetic High-Fidelity Visual Descriptions, arXiv:2412.08737 [paper]
AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning, arXiv:2411.11930 [paper] 🔵
Mulberry: Empowering MLLM With o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search, arXiv:2412.18319 [paper] 🔵
What is the True Performance of Large Multimodal Models in Visual Context-Based Mathematical Reasoning? An Analysis of Multiple Datasets and Future Research Directions, ICTC 2024 [paper]
GePBench: Evaluating Fundamental Geometric Perception for Multimodal Large Language Models, arXiv:2412.21036 [paper]
DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students’ Hand-Drawn Math Images, NeurIPS 2024 MATH-AI Workshop [paper] 🔵
An Enhanced Relation-Flow Algorithm for Solving Number Line Problems, IEIR 2024 [paper] ❌
Describe-then-Reason: Improving Multimodal Mathematical Reasoning Through Visual Comprehension Training, arXiv:2404.14604 [paper] 🔵
Enhancing LLM Reasoning via Vision-Augmented Prompting, NeurIPS 2024 [paper]
Evaluating Automated Geometric Problem Solving With Formal Language Generation on Large Multimodal Models, IEIR 2024 [paper]
Multi-Step Chain-of-Thought in Geometry Problem Solving, EIECS 2024 [paper]
VisOnlyQA: Large Vision Language Models Still Struggle With Visual Perception of Geometric Information, arXiv:2412.00947 [paper] 🔵
CMM-Math: A Chinese Multimodal Math Dataset to Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models, arXiv:2409.02834 [paper] 🔵
All in an Aggregated Image for In-Image Learning, arXiv:2402.17971 [paper] 🔵
CurveML: A Benchmark for Evaluating and Training Learning-Based Methods of Classification, Recognition, and Fitting of Plane Curves, Visual Comput 2024 [paper]
Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models, ACL 2024 [paper] 🔵
MathScape: Evaluating MLLMs in Multimodal Math Scenarios Through a Hierarchical Benchmark, arXiv:2408.07543 [paper] 🔵
VisScience: An Extensive Benchmark for Evaluating K12 Educational Multi-Modal Scientific Reasoning, arXiv:2409.13730 [paper] 🔵
Progressive Multimodal Reasoning via Active Retrieval, arXiv:2412.14835 [paper] 🔵
Decomposing Complex Visual Comprehension Into Atomic Visual Skills for Vision Language Models, NeurIPS 2024 MATH-AI Workshop [paper] 🔵
ReMI: A Dataset for Reasoning With Multiple Images, NeurIPS 2024 [paper] 🔵
M3GIA: A Cognition-Inspired Multilingual and Multimodal General Intelligence Ability Benchmark, arXiv:2406.05343 [paper] 🔵
Mathematical Problem Solving in Arabic: Assessing Large Language Models, Procedia Comput. Sci. 2024 [paper] 🔵
M3CoT: A Novel Benchmark for Multi-Domain Multi-Step Multi-Modal Chain-of-Thought, ACL 2024 [paper] 🔵
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data, arXiv:2406.18321 [paper] 🔵
PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition, NeurIPS 2024 [paper] 🔵
From Blind Solvers to Logical Thinkers: Benchmarking LLMs’ Logical Integrity on Faulty Mathematical Problems, arXiv:2410.18921 [paper] 🔵
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models, Findings of ACL 2024 [paper] 🔵
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap, arXiv:2402.19450 [paper] 🔵
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark, Findings of ACL 2024 [paper] 🔵
HARP: A Challenging Human-Annotated Math Reasoning Benchmark, arXiv:2412.08819 [paper] 🔵
Progressive-Hint Prompting Improves Reasoning in Large Language Models, ICML 2024 AI4MATH Workshop [paper] 🔵
Learning to Plan by Updating Natural Language, Findings of EMNLP 2024 [paper] 🔵
Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling, ICML 2024 [paper] 🔵
Skills-in-Context: Unlocking Compositionality in Large Language Models, Findings of EMNLP 2024 [paper] 🔵
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving, ICLR 2024 [paper] 🔵
Can Generative AI Solve Geometry Problems? Strengths and Weaknesses of LLMs for Geometric Reasoning in Spanish, IJIMAI 2024 [paper]
Explicit Memory Learning with Expectation Maximization, EMNLP 2024 [paper] 🔵
MathSensei: Mathematical Reasoning with a Tool-Augmented Large Language Model, ICLR 2024 ME-FoMo Workshop [paper] 🔵
Null-Shot Prompting: Rethinking Prompting Large Language Models With Hallucination, EMNLP 2024 [paper] 🔵
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving, NeurIPS 2024 [paper] 🔵
BBA: Bi-Modal Behavioral Alignment for Reasoning with Large Vision-Language Models, Findings of ACL 2024 [paper]
Hint-before-Solving Prompting: Guiding LLMs to Effectively Utilize Encoded Knowledge, arXiv:2402.14310 [paper] 🔵
MathScale: Scaling Instruction Tuning for Mathematical Reasoning, ICML 2024 [paper] 🔵
Learning From Correctness Without Prompting Makes LLM Efficient Reasoner, COLM 2024 [paper] 🔵
MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems, NeurIPS 2024 [paper] 🔵
Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation, NeurIPS 2024 [paper] 🔵
Deliberate Reasoning for LLMs as Structure-Aware Planning with Accurate World Model, arXiv:2410.03136 [paper] 🔵
Give me a Hint: Can LLMs Take a Hint to Solve Math Problems?, NeurIPS 2024 MATH-AI Workshop [paper] 🔵
MC-NEST: Enhancing Mathematical Reasoning in Large Language Models with a Monte Carlo Nash Equilibrium Self-Refine Tree, arXiv:2411.15645 [paper] 🔵
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving, NeurIPS 2024 [paper] 🔵
Offline Training of Language Model Agents with Functions as Learnable Weights, ICML 2024 [paper] 🔵
UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts, arXiv:2411.07240 [paper] 🔵
MultiLingPoT: Enhancing Mathematical Reasoning with Multilingual Program Fine-tuning, arXiv:2412.12609 [paper] 🔵
System-2 Mathematical Reasoning via Enriched Instruction Tuning, arXiv:2412.16964 [paper] 🔵
Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning, arXiv:2410.05928 [paper]
Data for Mathematical Copilots: Better Ways of Presenting Proofs for Machine Learning, arXiv:2412.15184 [paper] 🔵
Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges, arXiv:2401.08664 [paper] 🔵
Large Language Models for Mathematical Reasoning: Progresses and Challenges, EACL 2024 [paper] 🔵
A Survey on Deep Learning for Theorem Proving, COLM 2024 [paper] 🔵
A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery, EMNLP 2024 [paper] 🔵
Towards Robust Automated Math Problem Solving: A Survey of Statistical and Deep Learning Approaches, Evol. Intell. 2024 [paper] 🔵
Proposing and Solving Olympiad Geometry with Guided Tree Search, arXiv:2412.10673 [paper]
SANS: Spatial-Aware Neural Solver for Plane Geometry Problem, ICPR 2024 [paper]

2025

AutoGeo: Automating Geometric Image Dataset Creation for Enhanced Geometry Understanding, IEEE Trans. Multimedia 2025 [paper]
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model, ICLR 2025 [paper]
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine, ICLR 2025 [paper]
DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models, ICLR 2025 [paper] 🔵
Do Large Language Models Truly Understand Geometric Structures?, ICLR 2025 [paper]
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-Training, ICLR 2025 [paper]
Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver, ICASSP 2025 [paper]
A Knowledge and Semantic Fusion Method for Automatic Geometry Problem Understanding, Appl. Sci. 2025 [paper]
ElementaryCQT: A New Dataset and Its Deep Learning Analysis for 2D Geometric Shape Recognition, SN Comput. Sci. 2025 [paper]
Exploration of Formalization Techniques for Geometric Entities in Planar Geometry Proposition Texts, JAIP 2025 [paper]
FGeo-Parser: Autoformalization and Solution of Plane Geometric Problems, Symmetry 2025 [paper]
GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder, arXiv:2502.11360 [paper]
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning, AAAI 2025 [paper] 🔵
Multimodal Large Language Models for High School Mathematical Reasoning: Impact of Input Modality and Artifacts, Authorea Preprints 2025 [paper] 🔵
URSA: Understanding and Verifying Chain-of-Thought Reasoning in Multimodal Mathematics, arXiv:2501.04686 [paper] 🔵
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models, arXiv:2503.06749 [paper] 🔵
VISTA: Visual Integrated System for Tailored Automation in Math Problem Generation Using LLM, PMLR 2025 [paper]
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL, arXiv:2503.07536 [paper]
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning, arXiv:2503.10291 [paper] 🔵
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search, arXiv:2503.10582 [paper] 🔵
VisNumBench: Evaluating Number Sense of Multimodal Large Language Models, arXiv:2503.14939 [paper] 🔵
Interactive Sketchpad: A Multimodal Tutoring System for Collaborative, Visual Problem-Solving, arXiv:2503.16434 [paper] 🔵
OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement, arXiv:2503.17352 [paper] 🔵
MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection, arXiv:2503.18132 [paper] 🔵
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning, arXiv:2503.20752 [paper] 🔵
VILBENCH: A Suite for Vision-Language Process Reward Modeling, arXiv:2503.20271 [paper] 🔵
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement, arXiv:2504.07934 [paper] 🔵
VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search, arXiv:2504.09130 [paper]
GNS: Solving Plane Geometry Problems by Neural-Symbolic Reasoning with Multi-Modal LLMs, AAAI 2025 [paper]
Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration, arXiv:2504.12773 [paper]
GeoSense: Evaluating Identification and Application of Geometric Principles in Multimodal Reasoning, arXiv:2504.12597 [paper]
CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models, AAAI 2025 [paper] 🔵
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2, arXiv:2502.03544 [paper]
Feynman: Knowledge-Infused Diagramming Agent for Scaling Visual Reasoning Data, OpenReview 2025 [paper] 🔵
GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions, arXiv:2504.10146 [paper]
MagicGeo: Training-Free Text-Guided Geometric Diagram Generation, arXiv:2502.13855 [paper] 🔺
MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams, arXiv:2503.20745 [paper]
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM, arXiv:2501.01904 [paper] 🔵
Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs, arXiv:2501.06430 [paper]
MV-MATH: Evaluating Multimodal Math Reasoning in Multi-Visual Contexts, arXiv:2502.20808 [paper] 🔵
Pi-GPS: Enhancing Geometry Problem Solving by Unleashing the Power of Diagrammatic Information, arXiv:2503.05543 [paper]
MathFlow: Enhancing the Perceptual Flow of MLLMs for Visual Mathematical Problems, arXiv:2503.16549 [paper] 🔵
RedStar: Does Scaling Long-CoT Data Unlock Better Slow-Reasoning Systems?, arXiv:2501.11284 [paper]
Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?, arXiv:2503.06252 [paper] 🔵
GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models, Findings of NAACL 2025 [paper]
DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students’ Hand-Drawn Math Images, NAACL 2025 [paper] 🔵
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency, arXiv:2504.18589 [paper] 🔵
CogCom: A Visual Language Model with Chain-of-Manipulations Reasoning, ICLR 2025 [paper] 🔵
CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for Foundation Models, COLING 2025 [paper] 🔵
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking, arXiv:2502.02339 [paper] 🔵
Forgotten Polygons: Multimodal Large Language Models are Shape-Blind, arXiv:2502.15969 [paper]
Boosting the Generalization and Reasoning of Vision Language Models with Curriculum Reinforcement Learning, arXiv:2503.07065 [paper] 🔵
Boosting MLLM Reasoning with Text-Debiased Hint-GRPO, arXiv:2503.23905 [paper]
PRM-BAS: Enhancing Multimodal Reasoning through PRM-guided Beam Annealing Search, arXiv:2504.10222 [paper] 🔵
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models, arXiv:2504.11468 [paper] 🔵
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation, arXiv:2504.13055 [paper] 🔵
TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving, arXiv:2504.15780 [paper]
On The Potential of Using Generative Artificial Intelligence for Geometry Educational Activities, HAL 2025 [paper]
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models, ICLR 2025 [paper] 🔵
MathConstruct: Challenging LLM Reasoning with Constructive Proofs, ICLR 2025 VerifAI Workshop [paper] 🔵
TMATH: A Dataset for Evaluating Large Language Models in Generating Educational Hints for Math Word Problems, COLING 2025 [paper] 🔵
MathClean: A Benchmark for Synthetic Mathematical Data Cleaning, arXiv:2502.19508 [paper] 🔵
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models, arXiv:2503.21380 [paper] 🔵
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?, arXiv:2504.00509 [paper] 🔵
PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts, arXiv:2504.18428 [paper] 🔵
LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs, NAACL 2025 [paper] 🔵
Who's the MVP? A Game-Theoretic Evaluation Benchmark for Modular Attribution in LLM Agents, arXiv:2502.00510 [paper] 🔵
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations, ICLR 2025 LLM Reason&Plan Workshop [paper] 🔵
Generative Verifiers: Reward Modeling as Next-Token Prediction, ICLR 2025 [paper] 🔵
Breaking Mental Set to Improve Reasoning through Diverse Multi-Agent Debate, ICLR 2025 [paper] 🔵
SBSC: Step-By-Step Coding for Improving Mathematical Olympiad Performance, ICLR 2025 [paper] 🔵
Curriculum Demonstration Selection for In-Context Learning, SAC 2025 [paper] 🔵
GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning, arXiv:2504.02546 [paper]
Reinforcement Learning for Reasoning in Large Language Models with One Training Example, arXiv:2504.20571 [paper] 🔵
Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding, Findings of ACL 2025 [paper]
MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification, arXiv:2502.13383 [paper] 🔵
Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning, arXiv:2504.09772 [paper] 🔵
Climbing the Ladder of Reasoning: What LLMs Can-and Still Can't-Solve after SFT?, arXiv:2504.11741 [paper] 🔵
Key-Point-Driven Data Synthesis with Its Enhancement on Mathematical Reasoning, AAAI 2025 [paper] 🔵
A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges, Findings of ACL 2025 [paper] 🔵
Decoding Math: A Review of Datasets Shaping AI-Driven Mathematical Reasoning, JIM 2025 [paper] 🔵
Visual Large Language Models for Generalized and Specialized Application, arXiv:2501.02765 [paper] 🔵
From System 1 to System 2: A Survey of Reasoning Large Language Models, arXiv:2502.17419 [paper] 🔵
Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents, arXiv:2503.24047 [paper] 🔵
Plane Geometry Problem Solving with Multi-modal Reasoning: A Survey, arXiv:2505.14340 [paper]
Towards Geometry Problem Solving in the Large Model Era: A Survey, arXiv:2506.02690 [paper]

Citation

If you find this repository useful, please consider citing our survey paper:

@article{ma2025survey,
  title={A Survey of Deep Learning for Geometry Problem Solving},
  author={Ma, Jianzhe and Wang, Wenxuan and Jin, Qin},
  journal={arXiv preprint arXiv:2507.11936},
  year={2025}
}
