Problem Guidelines
Creating high-quality benchmark problems for mathematical AI evaluation
Quick Start
Effective benchmark problems require PhD-level difficulty, genuine mathematical insight, and 2-3 auto-gradable subquestions. A good starting point is a recent calculation from your own research that required a clever insight or a non-obvious proof technique.
Required Characteristics
- PhD-level difficulty: Suitable for qualifying exams, research papers, or advanced seminars
- Requires genuine insight: Not solvable by routine application of known algorithms
- Clear proof-based main question: The answer should be a complete mathematical argument, not just a number
- 2-3 unique-answer subquestions: Enable automated evaluation (e.g., "Is the statement true for n=5?", "What is the rank of this group?")
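To illustrate why unique answers matter for automated evaluation, here is a minimal sketch of how a subquestion might be graded. The function names `normalize` and `grade` are hypothetical and are not part of any actual grading pipeline; the sketch assumes answers are submitted as short strings (a yes/no word or an exact rational number).

```python
from fractions import Fraction


def normalize(answer: str) -> str:
    """Remove all whitespace and lowercase, so ' Yes' and 'yes' compare equal."""
    return "".join(answer.split()).lower()


def grade(submitted: str, expected: str) -> bool:
    """Hypothetical grader: exact comparison of a submitted answer to the key.

    Numeric answers are compared as exact rationals, so '0.5' matches '1/2'.
    Anything that does not parse as a number is compared as a normalized string.
    """
    s, e = normalize(submitted), normalize(expected)
    try:
        return Fraction(s) == Fraction(e)
    except (ValueError, ZeroDivisionError):
        return s == e


if __name__ == "__main__":
    print(grade(" Yes", "yes"))
    print(grade("0.5", "1/2"))
    print(grade("3", "4"))
```

A subquestion such as "What is the rank of this group?" fits this scheme because it has exactly one correct value; an open-ended subquestion such as "Describe the group" does not.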
What to Avoid
- Problems solvable by pattern matching or lucky guessing
- Standard textbook exercises (even from graduate texts)
- Purely computational problems that Mathematica/SageMath can solve directly
- Problems without clear subquestions for automated evaluation
Problem Templates
Intersection Theory
Subquestions: What is the degree of this class? Does it vanish in $A^2(X)$?
Subquestions: What is this number for specific parameter values?
Classification Problems
Subquestions: How many classes are there? Which have additional property $P$?
Subquestions: What is $\dim H^0(M)$? Is $H^n(M) = 0$ for $n > d$?
Example Problems
Example 1: Stable Graphs
Main question: Find a closed formula for the number $N(g)$ of stable graphs of genus $g$ with no legs and precisely $3$ edges, for all $g \geq 2$.
Subquestions:
- What is $N(3)$?
- What is $N(8)$?
- What is $N(1000)$?
Example 2: Permutation Representations
Main question: Let $G$ be a finite group. Is the functor $\mathrm{Perm}: G\text{-sets} \to \mathrm{Rep}_{\mathbb{C}}(G)$ sending $X$ to its permutation representation fully faithful? Prove or provide a counterexample.
Subquestions:
- Is $\mathrm{Perm}$ fully faithful for all finite groups $G$?
- Is $\mathrm{Perm}$ fully faithful for all finite cyclic groups $G$?
- Is $\mathrm{Perm}$ fully faithful for all finite abelian groups $G$?
Brainstorming Tips
Ready to Contribute?
Start creating your problem using our editor with LaTeX support and AI testing.