ARC Prize 2025 Top Score 3rd Place MindsAI
Summary
Jack Cole, a third-place winner of the ARC Prize 2025, discusses his unique background in clinical psychology and AI research, including an app business with over 30 million downloads. He details his team's (MindsAI and Tufa Labs) winning solution for the ARC competition, which centers on "Test Time Training" (TTT) and "Augment Inference, Reverse Augmentation, and Vote" (ARV). TTT involves using training examples as test items to unlock model performance, while ARV applies geometric and color augmentations to test items, reverses those augmentations on the resulting predictions, and then votes across the predictions, yielding up to a 1,000% gain. Cole also introduces novel augmentations like mixup and combination, and tokenizer dropout during inference. He emphasizes the importance of rapid experimental iteration over strict scientific method for progress and highlights the community's role in fostering innovation in AI.
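One way to read "using training examples as test items" is a leave-one-out construction: each demonstration pair of a task takes a turn as a pseudo-test item, with the remaining pairs as context, and the model is fine-tuned on those items at test time. The sketch below shows only that data construction. The function name, the dictionary keys, and the leave-one-out reading itself are assumptions for illustration, not the team's confirmed implementation.

```python
def leave_one_out_items(demos):
    """Turn a task's demonstration pairs into pseudo-test items.

    `demos` is a list of (input_grid, output_grid) pairs. Each pair is
    held out once as the item to predict; the others form the context.
    (Hypothetical helper; the exact TTT recipe is not given in the summary.)
    """
    items = []
    for i, (held_in, held_out) in enumerate(demos):
        context = demos[:i] + demos[i + 1:]  # all pairs except the i-th
        items.append({"context": context, "input": held_in, "target": held_out})
    return items


demos = [("in_a", "out_a"), ("in_b", "out_b"), ("in_c", "out_c")]
for item in leave_one_out_items(demos):
    print(item["input"], "->", item["target"], "| context size:", len(item["context"]))
```

Each resulting item could then serve as one fine-tuning example before the real test input is predicted.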
Key takeaway
For AI Engineers and Research Scientists working on model generalization, consider integrating dynamic test-time adaptation techniques like Test Time Training (TTT) and Augment Inference, Reverse Augmentation, and Vote (ARV). Your models can achieve significant performance gains by dynamically updating and processing inputs from multiple augmented perspectives, especially when dealing with constrained environments or tasks requiring fluid intelligence. Explore novel augmentations and tokenizer dropout during inference to push model capabilities beyond static pre-training.
Key insights
Integrating psychology with AI led to novel methods for evaluating and enhancing large language models' generalization.
Principles
- Rapid iteration accelerates AI development.
- Community involvement is crucial for innovation.
- Deep learning models benefit from dynamic updating.
Method
The ARV method applies geometric and color augmentations to test items, runs inference on each augmented version, reverses the augmentation on each prediction, and then uses majority voting across the restored predictions to cancel out model biases, significantly boosting performance.
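The augment, infer, reverse, vote loop can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the team's code: the helper names, the particular set of augmentations, and `toy_model` are all hypothetical. The toy model performs an identity task and makes a mistake only when the top-left cell it sees has color 3, standing in for a view-dependent bias. In a real ARC setting the same augmentation would presumably also be applied to the task's demonstration pairs so the model sees a consistent task.

```python
from collections import Counter


def identity(g):
    return [row[:] for row in g]


def rot90(g):  # rotate clockwise
    return [list(r) for r in zip(*g[::-1])]


def rot270(g):  # rotate counter-clockwise (inverse of rot90)
    return [list(r) for r in zip(*g)][::-1]


def hflip(g):  # mirror left-right (its own inverse)
    return [row[::-1] for row in g]


def swap_colors(a, b):  # color augmentation (its own inverse)
    table = {a: b, b: a}
    return lambda g: [[table.get(c, c) for c in row] for row in g]


swap34 = swap_colors(3, 4)

# Each entry pairs a forward augmentation with the function that undoes it.
AUGMENTATIONS = [
    (identity, identity),
    (rot90, rot270),
    (rot270, rot90),
    (hflip, hflip),
    (swap34, swap34),
]


def arv_predict(model, grid, augmentations=AUGMENTATIONS):
    votes = Counter()
    for forward, inverse in augmentations:
        prediction = model(forward(grid))       # augment, then infer
        restored = inverse(prediction)          # reverse the augmentation
        votes[tuple(map(tuple, restored))] += 1  # hashable key for voting
    winner, _ = votes.most_common(1)[0]
    return [list(row) for row in winner]


def toy_model(g):
    # Hypothetical biased model: copies its input, but corrupts the
    # top-left cell whenever that cell has color 3.
    out = [row[:] for row in g]
    if out[0][0] == 3:
        out[0][0] = 0
    return out


grid = [[3, 1], [2, 1]]
print(toy_model(grid))               # [[0, 1], [2, 1]]  (single view is wrong)
print(arv_predict(toy_model, grid))  # [[3, 1], [2, 1]]  (vote recovers the grid)
```

Only the unaugmented view triggers the bias here, so four of the five restored predictions agree and the vote outnumbers the error.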
In practice
- Apply tokenizer dropout during inference for varied tokenization.
- Use on-the-fly data augmentation to keep datasets dynamic.
- Reverse sequences in pre-training for greater model dynamism.
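The summary does not define tokenizer dropout, so the following is one possible reading, in the spirit of subword-regularization schemes such as BPE-dropout: a multi-character token match is randomly rejected with some probability, so the same input can be segmented differently on each call. The tokenizer, vocabulary, and `dropout` parameter below are hypothetical and kept deliberately simple (greedy longest match with a single-character fallback).

```python
import random


def tokenize(text, vocab, dropout=0.0, rng=None):
    """Greedy longest-match tokenizer with optional dropout.

    With probability `dropout`, a matching multi-character token is
    skipped and a shorter match (or a single character) is used instead.
    """
    rng = rng or random.Random()
    max_len = max(len(t) for t in vocab)
    tokens, i = [], 0
    while i < len(text):
        piece = text[i]  # single-character fallback, never dropped
        for n in range(min(max_len, len(text) - i), 1, -1):
            candidate = text[i:i + n]
            if candidate in vocab and rng.random() >= dropout:
                piece = candidate
                break
        tokens.append(piece)
        i += len(piece)
    return tokens


vocab = {"12", "123", "34"}
print(tokenize("12345", vocab))               # ['123', '4', '5']
print(tokenize("12345", vocab, dropout=1.0))  # ['1', '2', '3', '4', '5']
print(tokenize("12345", vocab, dropout=0.5, rng=random.Random(0)))
```

Whatever the dropout rate, joining the tokens reproduces the original text; only the segmentation the model sees changes.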
Topics
- ARC Prize
- Test Time Training
- Augmentation Techniques
- Tokenizer Dropout
- LLM Limitations
Best for: AI Scientist, Research Scientist, AI Engineer, AI Researcher, Machine Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by ARC Prize.