Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 7 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
<a href="https://arxiv.org/abs/2407.11214"><img src="https://img.shields.io/badge/arXiv-2407.11214-b31b1b.svg"></a>
</p>

PutnamBench is a benchmark for evaluation of theorem-proving algorithms on competition mathematics problems sourced from the William Lowell Putnam Mathematical Competition years 1962 - 2024. Our formalizations currently support three formal languages : Lean 4 $\land$ Isabelle $\land$ Coq. PutnamBench comprises of 1712 manually-crafted formalizations, aggregated over all languages.
PutnamBench is a benchmark for evaluation of theorem-proving algorithms on competition mathematics problems sourced from the William Lowell Putnam Mathematical Competition years 1962 - 2024. Our formalizations currently support three formal languages : Lean 4 $\land$ Isabelle $\land$ Coq. PutnamBench comprises of 1724 manually-crafted formalizations, aggregated over all languages.

PutnamBench aims to support research in automated mathematical reasoning by providing a multilingual benchmark for evaluating theorem-proving algorithms. It is released under permissive licenses (Apache 2.0 for Lean 4 and Isabelle, MIT for Coq). The [informal statements](informal/README.md) are also available with permission from the MAA.

Expand All @@ -19,7 +19,7 @@ We are hosting a [**leaderboard**](https://trishullab.github.io/PutnamBench/lead
## Statistics
| Language | Count |
| ------------- | -------------- |
| Lean 4 | 660 |
| Lean 4 | 672 |
| Isabelle | 640 |
| Coq | 412 |

Expand All @@ -28,11 +28,11 @@ We also report the number of problems in a certain category. Note that some prob
| Category | Total Quantity |
| ---------------- | -------------- |
| Algebra | 253 |
| Analysis | 226 |
| Number Theory | 108 |
| Geometry | 69 |
| Linear Algebra | 51 |
| Combinatorics | 29 |
| Analysis | 229 |
| Number Theory | 113 |
| Geometry | 71 |
| Linear Algebra | 53 |
| Combinatorics | 33 |
| Abstract Algebra | 28 |
| Probability | 10 |
| Set Theory | 8 |
Expand Down
2 changes: 1 addition & 1 deletion docs/index.html
Original file line number Diff line number Diff line change
Expand Up @@ -176,7 +176,7 @@ <h2 class="title is-3">Abstract</h2>
We present PutnamBench, a new multilingual benchmark for evaluating the ability of neural theorem-provers to solve competition mathematics problems.
</p>
<p>
PutnamBench consists of 1712 hand-constructed formalizations of problems sourced from the William Lowell Putnam Mathematical Competition, the premier undergraduate-level mathematics competition in North America.
PutnamBench consists of 1724 hand-constructed formalizations of problems sourced from the William Lowell Putnam Mathematical Competition, the premier undergraduate-level mathematics competition in North America.
There are 660 problems formalized in Lean 4, 640 formalized in Isabelle, and 412 formalized in Coq.
</p>
<p>
Expand Down
100 changes: 100 additions & 0 deletions informal/putnam.json
Original file line number Diff line number Diff line change
Expand Up @@ -5424,5 +5424,105 @@
"tags": [
"analysis"
]
},
{
"problem_name": "putnam_2025_a1",
"informal_statement": "Let $m_0$ and $n_0$ be distinct positive integers. For every positive integer $k$,\ndefine $m_k$ and $n_k$ to be the relatively prime positive integers such that\n$$\\frac{m_k}{n_k} = \\frac{2m_{k-1} + 1}{2n_{k-1} + 1}.$$\nProve that $2m_k + 1$ and $2n_k + 1$ are relatively prime for all but finitely many\npositive integers $k$.",
"informal_solution": "None.",
"tags": [
"number_theory"
]
},
{
"problem_name": "putnam_2025_a2",
"informal_statement": "Find the largest real number $a$ and the smallest real number $b$ such that\n$$ax(\\pi - x) \\le \\sin x \\le bx(\\pi - x)$$\nfor all $x$ in the interval $[0, \\pi]$.",
"informal_solution": "Show that the optimal constants are $a = \\tfrac{1}{\\pi}$ and $b = \\tfrac{4}{\\pi^2}$.",
"tags": [
"analysis"
]
},
{
"problem_name": "putnam_2025_a3",
"informal_statement": "Alice and Bob play a game with a string of $n$ digits, each of which is restricted\nto be 0, 1, or 2. Initially all the digits are 0. A legal move is to add or subtract 1\nfrom one digit to create a new string that has not appeared before. A player with no\nlegal move loses, and the other player wins. Alice goes first, and the players alternate\nmoves. For each $n \\ge 1$, determine which player has a strategy that guarantees winning.",
"informal_solution": "Show that Bob has a winning strategy for all $n \\ge 1$.",
"tags": [
"combinatorics"
]
},
{
"problem_name": "putnam_2025_a4",
"informal_statement": "Find the minimal value of $k$ such that there exist $k$-by-$k$ real matrices\n$A_1, \\ldots, A_{2025}$ with the property that $A_i A_j = A_j A_i$ if and only if\n$|i - j| \\in \\{0, 1, 2024\\}$.",
"informal_solution": "Show that the minimal value is $3$.",
"tags": [
"linear_algebra"
]
},
{
"problem_name": "putnam_2025_a5",
"informal_statement": "Let $n$ be an integer with $n \\ge 2$. For a sequence $s = (s_1, \\ldots, s_{n-1})$ where each\n$s_i = \\pm 1$, let $f(s)$ be the number of permutations $(a_1, \\ldots, a_n)$ of $\\{1, 2, \\ldots, n\\}$\nsuch that $s_i(a_{i+1} - a_i) > 0$ for all $i$. For each $n$, determine the sequences $s$ for which\n$f(s)$ is maximal.",
"informal_solution": "Show that $f(s)$ is maximized exactly when the signs alternate, that is, when $s_i = (-1)^{i+1}$ for all $i$ or $s_i = (-1)^i$ for all $i$.",
"tags": [
"combinatorics"
]
},
{
"problem_name": "putnam_2025_a6",
"informal_statement": "Let $b_0 = 0$ and, for $n \\ge 0$, define $b_{n+1} = 2b_n^2 + b_n + 1$.\nFor each $k \\ge 1$, show that $b_{2^{k+1}} - 2b_{2^k}$ is divisible by $2^{2k+2}$\nbut not by $2^{2k+3}$.",
"informal_solution": "None.",
"tags": [
"number_theory"
]
},
{
"problem_name": "putnam_2025_b1",
"informal_statement": "Suppose that each point in the plane is colored either red or green, subject to the following\ncondition: For every three noncollinear points $A$, $B$, $C$ of the same color, the center of the\ncircle passing through $A$, $B$, $C$ is also this color. Prove that all points of the plane are the\nsame color.",
"informal_solution": "Show that all points of the plane must have the same color.",
"tags": [
"geometry",
"combinatorics"
]
},
{
"problem_name": "putnam_2025_b2",
"informal_statement": "Let $f: [0,1] \\to [0, \\infty)$ be strictly increasing and continuous.\nLet $R$ be the region bounded by $x = 0$, $x = 1$, $y = 0$, and $y = f(x)$.\nLet $x_1$ be the $x$-coordinate of the centroid of $R$.\nLet $x_2$ be the $x$-coordinate of the centroid of the solid generated by rotating $R$ about the $x$-axis.\nProve that $x_1 < x_2$.",
"informal_solution": "None.",
"tags": [
"analysis",
"geometry"
]
},
{
"problem_name": "putnam_2025_b3",
"informal_statement": "Suppose $S$ is a nonempty set of positive integers with the property that if $n$ is in $S$,\nthen every positive divisor of $2025^n - 15^n$ is in $S$. Must $S$ contain all positive integers?",
"informal_solution": "Yes, $S$ must contain all positive integers.",
"tags": [
"number_theory"
]
},
{
"problem_name": "putnam_2025_b4",
"informal_statement": "For $n \\geq 2$, let $A = [a_{i,j}]_{i,j=1}^n$ be an $n$-by-$n$ matrix of nonnegative integers such that:\n(a) $a_{i,j} = 0$ when $i + j \\leq n$;\n(b) $a_{i+1,j} \\in \\{a_{i,j}, a_{i,j} + 1\\}$ when $1 \\leq i \\leq n-1$ and $1 \\leq j \\leq n$; and\n(c) $a_{i,j+1} \\in \\{a_{i,j}, a_{i,j} + 1\\}$ when $1 \\leq i \\leq n$ and $1 \\leq j \\leq n-1$.\n\nLet $S$ be the sum of the entries of $A$, and let $N$ be the number of nonzero entries of $A$.\nProve that $S \\leq \\frac{(n+2)N}{3}$.",
"informal_solution": "None.",
"tags": [
"linear_algebra",
"combinatorics"
]
},
{
"problem_name": "putnam_2025_b5",
"informal_statement": "Let $p$ be a prime number greater than 3. For each $k \\in \\{1, \\ldots, p-1\\}$,\nlet $I(k) \\in \\{1, 2, \\ldots, p-1\\}$ be such that $k \\cdot I(k) \\equiv 1 \\pmod{p}$.\nProve that the number of integers $k \\in \\{1, \\ldots, p-2\\}$ such that\n$I(k+1) < I(k)$ is greater than $p/4 - 1$.",
"informal_solution": "None.",
"tags": [
"number_theory"
]
},
{
"problem_name": "putnam_2025_b6",
"informal_statement": "Let $\\mathbb{N} = \\{1, 2, 3, \\ldots\\}$. Find the largest real constant $r$ such that\nthere exists a function $g: \\mathbb{N} \\to \\mathbb{N}$ such that\n$$g(n+1) - g(n) \\geq (g(g(n)))^r$$\nfor all $n \\in \\mathbb{N}$.",
"informal_solution": "Show that the largest possible value of $r$ is $1/4$.",
"tags": [
"analysis",
"number_theory"
]
}
]
19 changes: 19 additions & 0 deletions lean4/src/putnam_2025_a1.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
import Mathlib

/--
Let $m_0$ and $n_0$ be distinct positive integers. For every positive integer $k$,
define $m_k$ and $n_k$ to be the relatively prime positive integers such that
$$\frac{m_k}{n_k} = \frac{2m_{k-1} + 1}{2n_{k-1} + 1}.$$
Prove that $2m_k + 1$ and $2n_k + 1$ are relatively prime for all but finitely many
positive integers $k$.
-/
theorem putnam_2025_a1
(m n : ℕ → ℕ)
(hm : ∀ k : ℕ, 0 < m k)
(hn : ∀ k : ℕ, 0 < n k)
(h_distinct : m 0 ≠ n 0)
(h_coprime : ∀ k : ℕ, 0 < k → Nat.Coprime (m k) (n k))
(h_recurrence : ∀ k : ℕ,
(m (k + 1) : ℚ) / (n (k + 1) : ℚ) = (2 * (m k : ℚ) + 1) / (2 * (n k : ℚ) + 1)) :
{k : ℕ | ¬Nat.Coprime (2 * m k + 1) (2 * n k + 1)}.Finite :=
sorry
17 changes: 17 additions & 0 deletions lean4/src/putnam_2025_a2.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
import Mathlib

open Real

noncomputable abbrev putnam_2025_a2_solution : ℝ × ℝ := sorry
-- (1 / π, 4 / π ^ 2)

/--
Find the largest real number $a$ and the smallest real number $b$ such that
$$ax(\pi - x) \le \sin x \le bx(\pi - x)$$
for all $x$ in the interval $[0, \pi]$.
-/
theorem putnam_2025_a2 (a b : ℝ) :
((a, b) = putnam_2025_a2_solution) ↔
(IsGreatest {a' : ℝ | ∀ x ∈ Set.Icc 0 π, a' * x * (π - x) ≤ sin x} a ∧
IsLeast {b' : ℝ | ∀ x ∈ Set.Icc 0 π, sin x ≤ b' * x * (π - x)} b) :=
sorry
38 changes: 38 additions & 0 deletions lean4/src/putnam_2025_a3.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
import Mathlib

open Finset Function

abbrev putnam_2025_a3_solution : ℕ → Prop := sorry
-- fun _ => False

def GameString (n : ℕ) := Fin n → Fin 3

def initialState (n : ℕ) : GameString n := fun _ => 0

def isValidMove {n : ℕ} (s1 s2 : GameString n) : Prop :=
(∃! i : Fin n, s1 i ≠ s2 i) ∧
∀ i : Fin n, s1 i ≠ s2 i →
((s1 i).val + 1 = (s2 i).val ∨ (s2 i).val + 1 = (s1 i).val)

def IsValidGamePlay {n : ℕ} (play : List (GameString n)) : Prop :=
play.Chain isValidMove (initialState n) ∧
(initialState n :: play).Nodup

inductive HasWinningStrategy (n : ℕ) : List (GameString n) → Prop where
| win (play : List (GameString n)) (s : GameString n) :
IsValidGamePlay (play ++ [s]) →
(∀ s', IsValidGamePlay (play ++ [s, s']) → HasWinningStrategy n (play ++ [s, s'])) →
HasWinningStrategy n play

def AliceHasWinningStrategy (n : ℕ) : Prop := HasWinningStrategy n []

/--
Alice and Bob play a game with a string of $n$ digits, each of which is restricted
to be 0, 1, or 2. Initially all the digits are 0. A legal move is to add or subtract 1
from one digit to create a new string that has not appeared before. A player with no
legal move loses, and the other player wins. Alice goes first, and the players alternate
moves. For each $n \ge 1$, determine which player has a strategy that guarantees winning.
-/
theorem putnam_2025_a3 (n : ℕ) (hn : 1 ≤ n) :
putnam_2025_a3_solution n ↔ AliceHasWinningStrategy n := by
sorry
16 changes: 16 additions & 0 deletions lean4/src/putnam_2025_a4.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
import Mathlib

abbrev putnam_2025_a4_solution : ℕ := sorry
-- 3

/--
Find the minimal value of $k$ such that there exist $k$-by-$k$ real matrices
$A_1, \ldots, A_{2025}$ with the property that $A_i A_j = A_j A_i$ if and only if
$|i - j| \in \{0, 1, 2024\}$.
-/
theorem putnam_2025_a4 :
IsLeast {k : ℕ | ∃ A : Fin 2025 → Matrix (Fin k) (Fin k) ℝ,
∀ i j : Fin 2025, i ≤ j →
(A i * A j = A j * A i ↔ j.val - i.val ∈ ({0, 1, 2024} : Set ℕ))}
putnam_2025_a4_solution :=
sorry
23 changes: 23 additions & 0 deletions lean4/src/putnam_2025_a5.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
import Mathlib

open Finset

abbrev putnam_2025_a5_solution : (n : ℕ) → Set (Fin n → ℤˣ) := sorry
-- fun n => {s | (∀ i : Fin n, s i = (-1) ^ (i.val + 1)) ∨ (∀ i : Fin n, s i = (-1) ^ i.val)}

def f (n : ℕ) (s : Fin n → ℤˣ) : ℕ :=
Finset.card {σ : Equiv.Perm (Fin (n + 1)) |
∀ i : Fin n, 0 < (s i : ℤ) * ((σ i.succ : ℤ) - (σ i.castSucc : ℤ))}

/--
Let $n$ be an integer with $n \ge 2$. For a sequence $s = (s_1, \ldots, s_{n-1})$ where each
$s_i = \pm 1$, let $f(s)$ be the number of permutations $(a_1, \ldots, a_n)$ of $\{1, 2, \ldots, n\}$
such that $s_i(a_{i+1} - a_i) > 0$ for all $i$. For each $n$, determine the sequences $s$ for which
$f(s)$ is maximal.
-/
theorem putnam_2025_a5
(n : ℕ)
(hn : 1 ≤ n)
(s : Fin n → ℤˣ) :
(∀ t : Fin n → ℤˣ, f n t ≤ f n s) ↔ s ∈ putnam_2025_a5_solution n := by
sorry
15 changes: 15 additions & 0 deletions lean4/src/putnam_2025_a6.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
import Mathlib

def b : ℕ → ℤ
| 0 => 0
| n + 1 => 2 * (b n) ^ 2 + b n + 1

/--
Let $b_0 = 0$ and, for $n \ge 0$, define $b_{n+1} = 2b_n^2 + b_n + 1$.
For each $k \ge 1$, show that $b_{2^{k+1}} - 2b_{2^k}$ is divisible by $2^{2k+2}$
but not by $2^{2k+3}$.
-/
theorem putnam_2025_a6 (k : ℕ) (hk : 1 ≤ k) :
(2 ^ (2 * k + 2) : ℤ) ∣ (b (2 ^ (k + 1)) - 2 * b (2 ^ k)) ∧
¬((2 ^ (2 * k + 3) : ℤ) ∣ (b (2 ^ (k + 1)) - 2 * b (2 ^ k))) := by
sorry
17 changes: 17 additions & 0 deletions lean4/src/putnam_2025_b1.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
import Mathlib

open Affine EuclideanGeometry

/--
Suppose that each point in the plane is colored either red or green, subject to the following
condition: For every three noncollinear points $A$, $B$, $C$ of the same color, the center of the
circle passing through $A$, $B$, $C$ is also this color. Prove that all points of the plane are the
same color.
-/
theorem putnam_2025_b1
(color : EuclideanSpace ℝ (Fin 2) → Bool)
(h : ∀ (s : Simplex ℝ (EuclideanSpace ℝ (Fin 2)) 2),
(∀ i j : Fin 3, color (s.points i) = color (s.points j)) →
color s.circumcenter = color (s.points 0)) :
∃ c : Bool, ∀ P : EuclideanSpace ℝ (Fin 2), color P = c := by
sorry
19 changes: 19 additions & 0 deletions lean4/src/putnam_2025_b2.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
import Mathlib

open Set Real MeasureTheory Interval

/--
Let $f: [0,1] \to [0, \infty)$ be strictly increasing and continuous.
Let $R$ be the region bounded by $x = 0$, $x = 1$, $y = 0$, and $y = f(x)$.
Let $x_1$ be the $x$-coordinate of the centroid of $R$.
Let $x_2$ be the $x$-coordinate of the centroid of the solid generated by rotating $R$ about the $x$-axis.
Prove that $x_1 < x_2$.
-/
theorem putnam_2025_b2
(f : ℝ → ℝ)
(hf_cont : ContinuousOn f (Icc 0 1))
(hf_mono : StrictMonoOn f (Icc 0 1))
(hf_nonneg : ∀ x ∈ Icc (0 : ℝ) 1, 0 ≤ f x) :
(∫ x in (0:ℝ)..1, x * f x) / (∫ x in (0:ℝ)..1, f x) <
(∫ x in (0:ℝ)..1, x * (f x) ^ 2) / (∫ x in (0:ℝ)..1, (f x) ^ 2) := by
sorry
19 changes: 19 additions & 0 deletions lean4/src/putnam_2025_b3.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
import Mathlib

open Finset

abbrev putnam_2025_b3_solution : Prop := sorry
-- True

/--
Suppose $S$ is a nonempty set of positive integers with the property that if $n$ is in $S$,
then every positive divisor of $2025^n - 15^n$ is in $S$. Must $S$ contain all positive integers?
-/
theorem putnam_2025_b3 :
putnam_2025_b3_solution ↔
∀ S : Set ℕ,
S.Nonempty →
(∀ n ∈ S, 0 < n) →
(∀ n ∈ S, ∀ d : ℕ, 0 < d → d ∣ (2025 ^ n - 15 ^ n) → d ∈ S) →
S = {n : ℕ | 0 < n} := by
sorry
29 changes: 29 additions & 0 deletions lean4/src/putnam_2025_b4.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
import Mathlib

open BigOperators Finset Matrix

/--
For $n \geq 2$, let $A = [a_{i,j}]_{i,j=1}^n$ be an $n$-by-$n$ matrix of nonnegative integers such that:
(a) $a_{i,j} = 0$ when $i + j \leq n$;
(b) $a_{i+1,j} \in \{a_{i,j}, a_{i,j} + 1\}$ when $1 \leq i \leq n-1$ and $1 \leq j \leq n$; and
(c) $a_{i,j+1} \in \{a_{i,j}, a_{i,j} + 1\}$ when $1 \leq i \leq n$ and $1 \leq j \leq n-1$.

Let $S$ be the sum of the entries of $A$, and let $N$ be the number of nonzero entries of $A$.
Prove that $S \leq \frac{(n+2)N}{3}$.
-/
theorem putnam_2025_b4
(m : ℕ)
(A : Matrix (Fin (m + 2)) (Fin (m + 2)) ℕ)
(ha : ∀ (i j : Fin (m + 2)), (i : ℕ) + (j : ℕ) ≤ m → A i j = 0)
(hb : ∀ (i : Fin (m + 1)) (j : Fin (m + 2)),
A (Fin.succ i) j = A (Fin.castSucc i) j ∨
A (Fin.succ i) j = A (Fin.castSucc i) j + 1)
(hc : ∀ (i : Fin (m + 2)) (j : Fin (m + 1)),
A i (Fin.succ j) = A i (Fin.castSucc j) ∨
A i (Fin.succ j) = A i (Fin.castSucc j) + 1)
(S : ℕ)
(hS : S = ∑ i : Fin (m + 2), ∑ j : Fin (m + 2), A i j)
(N : ℕ)
(hN : N = #{p : Fin (m + 2) × Fin (m + 2) | A p.1 p.2 ≠ 0}) :
3 * S ≤ (m + 4) * N := by
sorry
21 changes: 21 additions & 0 deletions lean4/src/putnam_2025_b5.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
import Mathlib

open Finset BigOperators

def modInv (p : ℕ) (k : ℕ) : ℕ := ZMod.val ((k : ZMod p)⁻¹)

def descentCount (p : ℕ) : ℕ :=
#{k ∈ Finset.Icc 1 (p - 2) | modInv p (k + 1) < modInv p k}

/--
Let $p$ be a prime number greater than 3. For each $k \in \{1, \ldots, p-1\}$,
let $I(k) \in \{1, 2, \ldots, p-1\}$ be such that $k \cdot I(k) \equiv 1 \pmod{p}$.
Prove that the number of integers $k \in \{1, \ldots, p-2\}$ such that
$I(k+1) < I(k)$ is greater than $p/4 - 1$.
-/
theorem putnam_2025_b5
(p : ℕ)
(hp_prime : p.Prime)
(hp_gt : 3 < p) :
(p : ℚ) / 4 - 1 < descentCount p := by
sorry
Loading