How Google DeepMind is researching the next Frontier of AI for Gemini — Raia Hadsell, VP of Research
Summary
Raia Hadsell, VP of Research at Google DeepMind, discussed three key areas of Frontier AI development beyond traditional language models. She introduced Gemini Embedded 2, an omnimodal embedding model derived from Gemini that maps text, video, audio, and PDFs to a single vector for fast retrieval and comparison, with support for up to 8K tokens, 128 seconds of video, and 80 seconds of audio. Hadsell also detailed DeepMind's advances in weather prediction, including GraphCast, which predicts global atmospheric conditions 15 days out with higher accuracy than physics-based models, and GenCast, a probabilistic model that is more accurate on 97% of roughly 1,300 evaluation targets and generates 15-day forecasts in eight minutes on a single chip. The latest model, FGN, directly predicts cyclones. Finally, she presented World Models, specifically Genie 3, which generates diverse, interactive, high-quality 3D environments with memory and real-time prompting, allowing users to alter virtual worlds dynamically.
Key takeaway
For AI Scientists and Machine Learning Engineers developing multimodal systems, consider integrating Gemini Embedded 2 to get a single, unified representation across text, audio, and video, which simplifies retrieval and comparison tasks. Your teams should also investigate DeepMind's weather prediction models, particularly GenCast and FGN, which are reported to be more accurate and more efficient than traditional physics-based simulations and could improve forecasting capabilities. Furthermore, explore World Models such as Genie 3 for creating dynamic, interactive environments, which could open new approaches to gaming, simulation, and education.
Key insights
DeepMind is advancing Frontier AI through omnimodal embeddings, superior weather prediction, and interactive world generation.
Principles
- Focus on "root nodes" for deep, impactful problem-solving.
- Embedding models are critical companions to generative AI.
- Probabilistic models enhance operational weather prediction.
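To make the last principle concrete, here is a minimal sketch of what a probabilistic forecast offers over a single deterministic one: given several ensemble members for the same place and time, you can read off the probability of an event and a measure of uncertainty. The wind values, the 17 m/s threshold, and the function names are illustrative assumptions, not GenCast output or DeepMind code.

```python
from statistics import mean, pstdev


def exceedance_probability(members, threshold):
    """Fraction of ensemble members whose value exceeds the threshold."""
    if not members:
        raise ValueError("ensemble must contain at least one member")
    return sum(1 for m in members if m > threshold) / len(members)


def summarize(members):
    """Return the ensemble mean (central estimate) and spread (uncertainty)."""
    return mean(members), pstdev(members)


# Hypothetical wind speeds (m/s) for one location and lead time,
# one value per ensemble member. These numbers are made up.
wind_members = [12.0, 15.5, 9.8, 18.2, 14.1, 21.0, 11.3, 16.7]

p_strong_wind = exceedance_probability(wind_members, threshold=17.0)
center, spread = summarize(wind_members)

print(f"P(wind > 17 m/s) = {p_strong_wind:.3f}")
print(f"mean = {center:.2f} m/s, spread = {spread:.2f} m/s")
```

A deterministic model would report only one of these values. The ensemble lets an operator act on "how likely is the damaging case" rather than on a single best guess.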
Method
DeepMind's approach involves identifying fundamental, unsolved problems, developing AI models like spherical graph neural networks for complex data, and iterating on models to improve accuracy, efficiency, and real-time interaction.
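As a rough illustration of the graph neural network idea mentioned above, the sketch below runs rounds of neighborhood aggregation (message passing) on a tiny ring graph. It is a toy under stated assumptions: real models use learned update functions on a mesh covering the globe, whereas this example uses fixed averaging, scalar node values, and a hypothetical four-node graph.

```python
def message_passing_step(features, neighbors, self_weight=0.5):
    """Blend each node's value with the mean of its neighbors' values.

    Fixed averaging stands in for the learned message and update
    functions of a trained graph neural network.
    """
    updated = {}
    for node, value in features.items():
        nbrs = neighbors.get(node, [])
        if not nbrs:
            # Isolated node: nothing to aggregate, keep its value.
            updated[node] = value
            continue
        nbr_mean = sum(features[n] for n in nbrs) / len(nbrs)
        updated[node] = self_weight * value + (1 - self_weight) * nbr_mean
    return updated


# Hypothetical graph: four nodes in a ring, each linked to two neighbors.
neighbors = {
    "a": ["b", "d"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "a"],
}

# A single disturbance at node "a" spreads outward over successive steps.
initial = {"a": 10.0, "b": 0.0, "c": 0.0, "d": 0.0}
after_one = message_passing_step(initial, neighbors)

state = initial
for _ in range(3):
    state = message_passing_step(state, neighbors)

print(after_one)
print(state)
```

Each step moves information one hop further, which is why stacking several rounds lets a node's prediction depend on conditions far away on the graph.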
In practice
- Use Gemini Embedded 2 for unified multimodal retrieval.
- Apply GenCast for highly accurate, efficient weather forecasts.
- Explore Genie 3 for dynamic, interactive 3D environment creation.
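The first item above, unified multimodal retrieval, comes down to comparing vectors once every item lives in the same embedding space. The sketch below shows only that comparison step. The three-dimensional vectors, file names, and helper functions are made-up placeholders; in a real system the vectors would come from the embedding model, whose API is not shown or assumed here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the k (item_id, score) pairs most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), item_id)
        for item_id, vec in index.items()
    ]
    scored.sort(reverse=True)
    return [(item_id, score) for score, item_id in scored[:k]]


# Hypothetical index: items of different modalities, all represented
# as vectors in one shared space. The values are invented.
index = {
    "report.pdf": [0.9, 0.1, 0.0],
    "storm_clip.mp4": [0.1, 0.8, 0.3],
    "briefing.wav": [0.2, 0.7, 0.4],
    "notes.txt": [0.8, 0.2, 0.1],
}

# Hypothetical embedding of a text query.
query = [0.1, 0.9, 0.3]

results = top_k(query, index, k=2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

Because the video, the audio file, and the documents share one space, a single similarity function ranks them together with no per-modality logic. At larger scale the linear scan would typically be replaced by an approximate nearest-neighbor index.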
Topics
- Frontier AI
- Gemini Embedded 2
- Omnimodal Embedding Models
- Weather Prediction AI
- Graph Neural Networks
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by AI Engineer.