ARC Prize 2025 Top Score 3rd Place MindsAI

2025-12-05 · Source: ARC Prize · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Emerging Technologies & Innovation · Depth: Advanced, extended

Summary

Jack Cole, a third-place winner of the ARC Prize 2025, discusses his background in clinical psychology and AI research, including an app business with over 30 million downloads. He details the winning ARC competition solution from his team (MindsAI and Tufa Labs), which centers on "Test Time Training" (TTT) and "Augment Inference, Reverse Augmentation, and Vote" (ARV). TTT treats a task's training examples as test items and updates the model on them at test time to unlock performance. ARV applies geometric and color augmentations to test items, runs inference on each augmented version, reverses the augmentation on each prediction, and then votes on the results, yielding up to a 1,000% gain. Cole also introduces novel augmentations such as mixup and combination, along with tokenizer dropout during inference. He argues that rapid experimental iteration matters more for progress than strict adherence to the scientific method, and he highlights the community's role in fostering innovation in AI.
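
One way to read "using training examples as test items" is a leave-one-out construction: each demonstration pair of a task takes a turn as the held-out query while the remaining pairs serve as context. The sketch below shows only that data-preparation step and is an assumption about the general idea, not the team's actual pipeline. The function name and the `context`, `query`, and `target` keys are hypothetical.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]
Pair = Tuple[Grid, Grid]  # (input grid, output grid)


def leave_one_out_items(train_pairs: List[Pair]) -> List[Dict]:
    """Turn each demonstration pair into a pseudo test item.

    Assumption: for item i, pair i is the held-out query/target and
    every other pair is kept as context for the model to condition on.
    """
    items = []
    for i, (query, target) in enumerate(train_pairs):
        context = train_pairs[:i] + train_pairs[i + 1:]
        items.append({"context": context, "query": query, "target": target})
    return items


# Three toy demonstration pairs for a single task.
pairs: List[Pair] = [
    ([[1]], [[2]]),
    ([[3]], [[4]]),
    ([[5]], [[6]]),
]
items = leave_one_out_items(pairs)
```

Each resulting item could then feed a few gradient steps at test time. The fine-tuning loop itself is omitted because the source does not describe it.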

Key takeaway

AI engineers and research scientists working on model generalization should consider integrating dynamic test-time adaptation techniques such as Test Time Training (TTT) and Augment Inference, Reverse Augmentation, and Vote (ARV). Models can achieve significant performance gains when they are updated at test time and process inputs from multiple augmented perspectives, especially in constrained environments or on tasks requiring fluid intelligence. Novel augmentations and tokenizer dropout during inference are further ways to push model capabilities beyond static pre-training.

Key insights

Integrating psychology with AI led to novel methods for evaluating and enhancing large language models' generalization.

Principles

Rapid iteration accelerates AI development.
Community involvement is crucial for innovation.
Deep learning models benefit from dynamic updating.

Method

The ARV method applies geometric and color augmentations to test items, runs inference on each augmented version, reverses the augmentation on each prediction, and then uses majority voting across the predictions to cancel out model biases, significantly boosting performance.
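
A minimal sketch of the augment, infer, reverse, vote loop follows, assuming grids are lists of lists of integer colors and each augmentation comes with an exact inverse. The `toy_model` stands in for a real predictor and is hypothetical: it solves a copy task correctly except for one position-and-color-specific bias, which is the kind of error voting can cancel. None of the names come from the original solution.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Grid = List[List[int]]
Transform = Callable[[Grid], Grid]


def identity(g: Grid) -> Grid:
    return [list(row) for row in g]


def rot90(g: Grid) -> Grid:
    """Rotate 90 degrees clockwise."""
    return [list(row) for row in zip(*g[::-1])]


def rot270(g: Grid) -> Grid:
    """Rotate 90 degrees counter-clockwise (inverse of rot90)."""
    return [list(row) for row in zip(*g)][::-1]


def hflip(g: Grid) -> Grid:
    """Mirror left-right (its own inverse)."""
    return [row[::-1] for row in g]


def recolor(mapping: Dict[int, int]) -> Transform:
    """Build a color-permutation transform from a mapping."""
    return lambda g: [[mapping.get(c, c) for c in row] for row in g]


SWAP_1_2 = {1: 2, 2: 1}
# Each entry is (forward augmentation, inverse augmentation).
AUGMENTATIONS: List[Tuple[Transform, Transform]] = [
    (identity, identity),
    (rot90, rot270),
    (hflip, hflip),
    (recolor(SWAP_1_2), recolor({v: k for k, v in SWAP_1_2.items()})),
]


def arv_predict(model: Transform, grid: Grid,
                augmentations: List[Tuple[Transform, Transform]]) -> Grid:
    """Augment the input, predict, undo the augmentation, then majority-vote."""
    votes: Counter = Counter()
    for forward, inverse in augmentations:
        prediction = inverse(model(forward(grid)))
        votes[tuple(map(tuple, prediction))] += 1  # hashable key for counting
    winner, _ = votes.most_common(1)[0]
    return [list(row) for row in winner]


def toy_model(g: Grid) -> Grid:
    """Hypothetical biased model: copies the grid, but errs when g[0][0] == 1."""
    out = [list(row) for row in g]
    if out[0][0] == 1:
        out[0][0] = 9
    return out


test_grid = [[1, 2], [3, 4]]
single_pass = toy_model(test_grid)
voted = arv_predict(toy_model, test_grid, AUGMENTATIONS)
```

In this toy setup the single unaugmented pass triggers the bias, while three of the four augmented views avoid it, so the vote recovers the correct grid. This only works when the underlying task is consistent under the chosen augmentations.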

In practice

Apply tokenizer dropout during inference for varied tokenization.
Use on-the-fly data augmentation to keep datasets dynamic.
Reverse sequences in pre-training for greater model dynamism.
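
The first item above, tokenizer dropout, can be illustrated with a toy merge-based tokenizer that randomly skips merges, so the same string is split differently from one call to the next. This is a simplified sketch in the spirit of BPE-dropout, not the tokenizer used in the solution. The merge table, function name, and `p_drop` parameter are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def tokenize_with_dropout(text: str,
                          merges: Sequence[Tuple[str, str]],
                          p_drop: float,
                          rng: random.Random) -> List[str]:
    """Apply merges in priority order, skipping each candidate with prob. p_drop.

    Starts from single characters. p_drop=0.0 gives the deterministic
    segmentation; p_drop=1.0 leaves the text as characters.
    """
    tokens = list(text)
    for left, right in merges:
        merged: List[str] = []
        i = 0
        while i < len(tokens):
            is_candidate = (i + 1 < len(tokens)
                            and tokens[i] == left
                            and tokens[i + 1] == right)
            if is_candidate and rng.random() >= p_drop:
                merged.append(left + right)  # keep this merge
                i += 2
            else:
                merged.append(tokens[i])  # merge dropped or not applicable
                i += 1
        tokens = merged
    return tokens


# Hypothetical merge table, highest priority first.
MERGES = [("a", "b"), ("ab", "c")]
rng = random.Random(0)

deterministic = tokenize_with_dropout("abcab", MERGES, 0.0, rng)
all_dropped = tokenize_with_dropout("abcab", MERGES, 1.0, rng)
varied = [tokenize_with_dropout("abcab", MERGES, 0.5, rng) for _ in range(5)]
```

Every segmentation still concatenates back to the original string. At inference time, the predictions obtained from several such tokenizations could be pooled in the same way as other augmented views.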

Topics

ARC Prize
Test Time Training
Augmentation Techniques
Tokenizer Dropout
LLM Limitations

Best for: AI Scientist, Research Scientist, AI Engineer, AI Researcher, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by ARC Prize.