Rethinking Pre-Training for Agentic AI with Aakanksha Chowdhery - #759

Source: The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Software Development & Engineering) · Depth: Expert, extended

Summary

Aakanksha Chowdhery, a member of technical staff at Reflection and former lead for Google's PaLM and early Gemini pre-training efforts, argues that achieving true agentic AI requires a fundamental rethinking of pre-training. She contends that current post-training techniques are insufficient for developing the multi-step reasoning and planning capabilities that agents require. Chowdhery highlights the limitations of next-token prediction for complex workflows and emphasizes the need to evolve attention mechanisms, loss objectives, and training data. Key areas for evolution include improving long-form reasoning over extended contexts, learning from "trajectory" training data that captures multi-step problem-solving, and developing loss functions that teach models to learn from failures and adapt to new tools. She stresses that scaling remains crucial for discovering emergent agentic capabilities such as error recovery and dynamic tool learning, and that new benchmarks are needed to measure these advanced forms of intelligence.

Key takeaway

If you are an AI Scientist or Research Scientist developing advanced agentic systems, prioritize foundational changes in pre-training rather than relying solely on post-training techniques. Concentrate your efforts on evolving attention mechanisms, loss objectives, and training data to foster long-form reasoning, planning, and dynamic tool learning, since these capabilities emerge at scale and are critical for next-generation AI agents. Consider contributing to new benchmarks that accurately measure these complex agentic behaviors.

Key insights

Agentic AI requires fundamental changes to pre-training, not just post-training, to enable multi-step reasoning and dynamic tool use.

Method

Rethink pre-training by evolving attention mechanisms for long-form reasoning, designing loss objectives that teach multi-step planning and tool use, and incorporating high-quality, diverse "trajectory" training data.

Best for: AI Scientist, Research Scientist, AI Engineer, Machine Learning Engineer, AI Researcher

Editorial summary, takeaway, and curation by AIssential. Original article published by The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence).