Problem Guidelines
Creating high-quality benchmark problems for mathematical AI evaluation
Quick Start
Effective benchmark problems require PhD-level difficulty, genuine mathematical insight, and a clear proof-based main question. Optional auto-gradable subquestions can be useful when they naturally measure partial progress.
Open problems and conjectures are welcome. If the problem is open, say so in the solution field, add useful context or references, and tag it with "open problem".
Required Characteristics
- At least PhD-level difficulty: Suitable for research papers, advanced seminars, or comparable expert-level work
- Requires genuine insight: Not solvable by routine application of known algorithms
- Clear proof-based main question: Answer should be a complete mathematical argument, not just a number
What to Avoid
- Problems solvable by pattern matching or lucky guessing
- Standard textbook exercises (even from graduate texts)
- Purely computational problems that Mathematica/SageMath can solve directly
Brainstorming Tips
Example Problems
Example 1: Stable Graphs
Main question: Find a closed formula for the number $N(g)$ of stable graphs of genus $g$ with no legs and precisely $3$ edges, for all $g \geq 2$.
Optional subquestions:
- What is $N(3)$?
- What is $N(8)$?
- What is $N(1000)$?
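Numeric subquestions like these are auto-gradable partly because small cases can be cross-checked by machine. The following is a minimal brute-force sketch, not a reference solution. It assumes the usual definition of a stable graph: a connected graph (loops and multiple edges allowed) with a genus $g_v \geq 0$ at each vertex, stability condition $2g_v - 2 + n_v > 0$ where $n_v$ is the valence (a loop counts twice), and total genus $g = h^1 + \sum_v g_v$. All function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations_with_replacement, permutations


def is_connected(v, edges):
    """Return True if the multigraph on vertices 0..v-1 is connected."""
    reached = {0}
    changed = True
    while changed:
        changed = False
        for a, b in edges:
            if (a in reached) != (b in reached):
                reached.update((a, b))
                changed = True
    return len(reached) == v


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def canonical_form(v, edges, genera):
    """Smallest relabelling of (genera, edges) over all vertex permutations."""
    best = None
    for perm in permutations(range(v)):
        new_genera = [0] * v
        for i in range(v):
            new_genera[perm[i]] = genera[i]
        new_edges = sorted(tuple(sorted((perm[a], perm[b]))) for a, b in edges)
        key = (tuple(new_genera), tuple(new_edges))
        if best is None or key < best:
            best = key
    return best


def count_stable_graphs_3_edges(g):
    """Count isomorphism classes of stable graphs of genus g, no legs, 3 edges."""
    num_edges = 3
    seen = set()
    # A connected graph with 3 edges has between 1 and 4 vertices.
    for v in range(1, num_edges + 2):
        # g = h^1 + sum of vertex genera, with h^1 = E - V + 1.
        genus_sum = g - (num_edges - v + 1)
        if genus_sum < 0:
            continue
        possible_edges = list(combinations_with_replacement(range(v), 2))
        for edges in combinations_with_replacement(possible_edges, num_edges):
            if not is_connected(v, edges):
                continue
            valence = [0] * v
            for a, b in edges:
                valence[a] += 1
                valence[b] += 1  # a loop (a == b) contributes 2
            for genera in compositions(genus_sum, v):
                if all(2 * genera[i] - 2 + valence[i] > 0 for i in range(v)):
                    seen.add(canonical_form(v, edges, genera))
    return len(seen)


# g = 2 gives 2: the theta graph and the dumbbell graph.
print(count_stable_graphs_3_edges(2))
```

The enumeration cost grows quickly with $g$, so a check of this kind only suits small cases such as $N(3)$. A value like $N(1000)$ is out of reach for brute force and is only accessible through the closed formula, which is what makes it a useful test of the main question.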
Example 2: Permutation Representations
Main question: Let $G$ be a finite group. Is the functor $\mathrm{Perm}: G\text{-sets} \to \mathrm{Rep}_{\mathbb{C}}(G)$ sending $X$ to its permutation representation fully faithful? Prove or provide a counterexample.
Optional subquestions:
- Is the statement true for all finite groups?
- Is the statement true for all finite cyclic groups?
- Is the statement true for all finite abelian groups?
Example Problem Structures
Intersection Theory
Optional subquestions: What is the degree of this class? Does it vanish in $A^2(X)$?
Optional subquestions: What is this number for specific parameter values?
Classification Problems
Optional subquestions: How many classes are there? Which have additional property $P$?
Optional subquestions: What is $\dim H^0(M)$? Is $H^n(M) = 0$ for $n > d$?