ARC-AGI Community Leaderboard

ARC-AGI has gained significant popularity over the past two years, and we've been overwhelmed by the number of researchers and builders who want to showcase their work to the community. The ARC-AGI Community Leaderboard provides a landing spot for these submissions, where the community can review, discuss, and verify results together.

Community Leaderboard submissions must be general purpose and reproducible. Scores are self-reported on the ARC-AGI-1 and ARC-AGI-2 semi-private sets and the ARC-AGI-3 public set. ARC Prize will not independently verify submissions except in extraordinary cases, and we reserve the right to determine what qualifies. For more on how we approach testing, see our testing policy.

To submit your work, head to the ARC-AGI Community Leaderboard repo on GitHub.

| Name | Description | Authors | Benchmark | Score | Cost | Date |
| --- | --- | --- | --- | --- | --- | --- |
| Read-Grep-Bash Agent | A coding agent that uses search and Python scripting over game logs. | Alexis Fox, Junlin Wang, Paul Rosu, Bhuwan Dhingra | ARC-AGI-3 | 82.4%* | $179 | 2026-03-13 |
| Evolutionary Test-Time Compute with Natural Language Instructions | Evolves natural language instructions instead of code. | Jeremy Berman | ARC-AGI-2 | 29.4% | $3,648 | 2025-09-16 |
| Efficient Evolutionary Program Synthesis | Evolves a growing library of Python programs with an LLM. | Eric Pang | ARC-AGI-2 | 26.0% | $476 | 2025-09-01 |
| Tiny Recursive Model (TRM) | 7M parameter recursive model with think-act refinement loops. | Alexia Jolicoeur-Martineau | ARC-AGI-2 | 7.8%* | $252 | 2025-07-01 |
| Hierarchical Reasoning Model (HRM) | Brain-inspired 27M parameter model with iterative refinement. | Sapient Intelligence | ARC-AGI-2 | 2.0% | $201 | 2025-06-08 |
| Evolutionary Test-time Compute | Genetic algorithm over LLM-generated Python transforms. | Jeremy Berman | ARC-AGI-1 | 53.6% | $2,900 | 2024-12-18 |
| Ryan Greenblatt | LLM generates and refines thousands of candidate programs per task. | Ryan Greenblatt | ARC-AGI-1 | 43.0% | $40,000 | 2024-06-17 |

* Denotes a score on the public set.