ARC Prize 2025 Top Score 3rd Place MindsAI

Source: ARC Prize · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics, Emerging Technologies & Innovation) · Depth: Advanced, extended

Summary

Jack Cole, a third-place winner of the ARC Prize 2025, discusses his unusual background in clinical psychology and AI research, including an app business with over 30 million downloads. He details the winning solution his team (MindsAI and Tufa Labs) built for the ARC competition, which centers on "Test Time Training" (TTT) and "Augment Inference, Reverse Augmentation, and Vote" (ARV). TTT uses a task's training examples as test items to unlock model performance. ARV applies geometric and color augmentations to test items, reverses them on the predictions, and then votes across those predictions, yielding up to a 1,000% gain. Cole also introduces novel augmentations such as mixup and combination, as well as tokenizer dropout during inference. He argues that rapid experimental iteration matters more for progress than strict adherence to the scientific method, and he highlights the community's role in fostering innovation in AI.
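
One plausible reading of "using training examples as test items" is a leave-one-out construction: each demonstration pair of a task takes a turn as a held-out pseudo-test item, with the remaining pairs as context. The sketch below shows only that data-preparation step. The function name `build_ttt_examples` and the tuple layout are hypothetical, and the fine-tuning step itself is omitted; this illustrates the idea and is not the team's actual pipeline.

```python
from typing import List, Tuple

Grid = List[List[int]]
Pair = Tuple[Grid, Grid]  # (input grid, output grid)
# (context pairs, held-out input, held-out target): hypothetical layout
TTTExample = Tuple[List[Pair], Grid, Grid]


def build_ttt_examples(demos: List[Pair]) -> List[TTTExample]:
    """Turn each demonstration pair into a pseudo-test item.

    For every pair, the remaining pairs become the context and the
    held-out pair supplies the query input and the target output.
    """
    examples: List[TTTExample] = []
    for i, (held_input, held_output) in enumerate(demos):
        context = demos[:i] + demos[i + 1:]
        examples.append((context, held_input, held_output))
    return examples


# Toy task with three demonstration pairs (the rule here is "add 1 to the cell").
demos: List[Pair] = [
    ([[0]], [[1]]),
    ([[1]], [[2]]),
    ([[2]], [[3]]),
]
ttt_examples = build_ttt_examples(demos)
```

Under this assumption, a model would be briefly fine-tuned on `ttt_examples` at test time before predicting the real test item, so the update uses only information already present in the task.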

Key takeaway

AI engineers and research scientists working on model generalization should consider integrating test-time adaptation techniques such as Test Time Training (TTT) and Augment Inference, Reverse Augmentation, and Vote (ARV). Models can achieve significant performance gains by updating at test time and by processing inputs from multiple augmented perspectives, especially in constrained environments or on tasks that require fluid intelligence. Novel augmentations and tokenizer dropout during inference are further ways to push model capabilities beyond static pre-training.

Key insights

Integrating psychology with AI led to novel methods for evaluating and enhancing large language models' generalization.

Method

The ARV method applies geometric and color augmentations to test items, reverses them, and then uses majority voting across predictions to cancel out model biases, significantly boosting performance.
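
The augment, reverse, and vote steps can be sketched as follows. This is a minimal illustration under stated assumptions: grids are lists of integer color codes 0 to 9, the model is any callable mapping a grid to a grid, and every name here (`augment`, `reverse`, `arv_predict`, `toy_model`) is hypothetical rather than taken from the MindsAI code. The toy model is a stand-in with a deliberate bias so that the effect of voting is visible.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Grid = List[List[int]]
# (clockwise quarter-turns, horizontal flip?, color mapping): hypothetical encoding
Augmentation = Tuple[int, bool, Dict[int, int]]


def rotate(grid: Grid, quarter_turns: int) -> Grid:
    """Rotate clockwise by the given number of quarter-turns."""
    out = [list(row) for row in grid]
    for _ in range(quarter_turns % 4):
        out = [list(row) for row in zip(*out[::-1])]
    return out


def flip(grid: Grid) -> Grid:
    """Mirror the grid left to right."""
    return [row[::-1] for row in grid]


def recolor(grid: Grid, mapping: Dict[int, int]) -> Grid:
    return [[mapping[c] for c in row] for row in grid]


def augment(grid: Grid, aug: Augmentation) -> Grid:
    turns, do_flip, colors = aug
    out = rotate(grid, turns)
    if do_flip:
        out = flip(out)
    return recolor(out, colors)


def reverse(grid: Grid, aug: Augmentation) -> Grid:
    """Undo `augment` by applying the inverse steps in the opposite order."""
    turns, do_flip, colors = aug
    out = recolor(grid, {v: k for k, v in colors.items()})
    if do_flip:
        out = flip(out)
    return rotate(out, -turns)


def arv_predict(model: Callable[[Grid], Grid], test_input: Grid,
                augs: List[Augmentation]) -> Grid:
    """Predict on each augmented view, map back, and take the majority."""
    votes: Counter = Counter()
    for aug in augs:
        restored = reverse(model(augment(test_input, aug)), aug)
        votes[tuple(map(tuple, restored))] += 1
    winner = votes.most_common(1)[0][0]
    return [list(row) for row in winner]


IDENTITY = {c: c for c in range(10)}
SWAP_1_2 = {**IDENTITY, 1: 2, 2: 1}


def toy_model(grid: Grid) -> Grid:
    """Stand-in for an identity task; wrong whenever the top-left cell is 1."""
    out = [list(row) for row in grid]
    if out[0][0] == 1:
        out[0][0] = 0
    return out


test_input = [[1, 2], [3, 4]]
views = [(0, False, IDENTITY), (1, False, IDENTITY), (2, False, IDENTITY),
         (3, True, IDENTITY), (0, False, SWAP_1_2)]
prediction = arv_predict(toy_model, test_input, views)
```

In this toy run the unaugmented view triggers the bias and yields a wrong grid, while the four transformed views avoid it, so the majority vote recovers the correct output. A real system would typically also augment the task's demonstration pairs consistently with the test input; that part is left out here for brevity.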

Best for: AI Scientist, Research Scientist, AI Engineer, AI Researcher, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by ARC Prize.