Problem Guidelines

Creating high-quality benchmark problems for mathematical AI evaluation

Quick Start

Effective benchmark problems require PhD-level difficulty, genuine mathematical insight, and a clear proof-based main question. Optional auto-gradable subquestions can be useful when they naturally measure partial progress.

Open problems and conjectures are welcome. If the problem is open, say so in the solution field, add useful context or references, and tag it with "open problem".

Required Characteristics

  • At least PhD-level difficulty: Suitable for research papers, advanced seminars, or comparable expert-level work
  • Requires genuine insight: Not solvable by routine application of known algorithms
  • Clear proof-based main question: Answer should be a complete mathematical argument, not just a number

What to Avoid

  • Problems solvable by pattern matching or lucky guessing
  • Standard textbook exercises (even from graduate texts)
  • Purely computational problems that Mathematica/SageMath can solve directly

Brainstorming Tips

  • A tricky calculation from your recent work that required a clever insight
  • An "obvious" statement that actually needs a non-trivial proof
  • A self-contained lemma that came up in a research project
  • An open problem from your field or a conjecture you made in an article

Example Problems

Example 1: Stable Graphs

Main question: Find a closed formula for the number $N(g)$ of stable graphs of genus $g$ with no legs and precisely $3$ edges, for all $g \geq 2$.

Optional subquestions:

  • What is $N(3)$?
  • What is $N(8)$?
  • What is $N(1000)$?

Example 2: Permutation Representations

Main question: Let $G$ be a finite group. Is the functor $\mathrm{Perm}: G\text{-sets} \to \mathrm{Rep}_{\mathbb{C}}(G)$ sending $X$ to its permutation representation fully faithful? Prove or provide a counterexample.

Optional subquestions:

  • Is the statement true for all finite groups?
  • Is the statement true for all finite cyclic groups?
  • Is the statement true for all finite abelian groups?
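
The optional subquestions in both examples have short answers (an integer or yes/no), which is what makes them auto-gradable. These guidelines do not say how auto-grading is implemented, so the following is only a minimal sketch under assumptions: answers arrive as strings, and the helper names `normalize_answer` and `grade` are hypothetical, not part of any real grading system.

```python
from typing import Optional, Union

Answer = Union[int, bool]


def normalize_answer(raw: str) -> Optional[Answer]:
    """Map a raw answer string to an int or a bool, or None if unparseable.

    Hypothetical helper: the accepted spellings below are assumptions.
    """
    text = raw.strip().lower().rstrip(".")
    if text in {"yes", "true"}:
        return True
    if text in {"no", "false"}:
        return False
    try:
        # Allow thousands separators such as "1,000" or "1 000".
        return int(text.replace(",", "").replace(" ", ""))
    except ValueError:
        return None


def grade(submitted: str, reference: str) -> bool:
    """Return True when the submitted answer matches the reference exactly."""
    expected = normalize_answer(reference)
    if expected is None:
        raise ValueError(f"reference answer is not auto-gradable: {reference!r}")
    got = normalize_answer(submitted)
    # Compare types as well as values, since in Python True == 1.
    return got is not None and type(got) is type(expected) and got == expected
```

Exact matching like this suits answers such as $N(3)$ or a yes/no verdict; the proof-based main question still needs expert review.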

Example Problem Structures

Intersection Theory

Main: Let $X$ be [variety]. Compute the class of [specific cycle] in the Chow ring $A^*(X)$.
Optional subquestions: What is the degree of this class? Does it vanish in $A^2(X)$?

Main: For the moduli space $M$ of [objects], compute a closed formula for the intersection number $\int_M \alpha_1 \cup \alpha_2 \cup \cdots \cup \alpha_n$.
Optional subquestions: What is this number for specific parameter values?

Classification Problems

Main: Classify all [objects] with [property]. Give explicit representatives for each isomorphism class.
Optional subquestions: How many classes are there? Which have additional property $P$?

Main: What is the rank of the cohomology group $H^k(M)$ for [variety/moduli space $M$]?
Optional subquestions: What is $\dim H^0(M)$? Is $H^n(M) = 0$ for $n > d$?