The inaugural Redwood Research podcast

2026-01-04 · Source: Redwood Research blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Cybersecurity & Data Privacy · Depth: Expert, extended

Summary

Redwood Research has launched its inaugural podcast, a conversation between Buck and Ryan covering AI safety, the future of AI, and the organization's history. They share their P(doom) assessments, with Ryan estimating a 35% chance of misaligned AI takeover and a 50% chance of catastrophic outcomes, including authoritarian power grabs. They also describe the challenges of video editing and how they used AI tools like Deepgram and Claude Code to automate transcript generation and video cutting for four hours of footage. The conversation further explores the philosophical implications of multiverse theories, the nature of AI alignment research, and the evolution of Redwood Research's strategic focus from interpretability to control, touching on projects such as the adversarial robustness paper, the MLAB bootcamp, and the alignment faking paper. They conclude by reflecting on the organization's growth, past mistakes, and future uncertainties.

Key takeaway

Research scientists developing AI safety strategies should recognize that the field's rapid evolution calls for a shift away from indefinitely scalable solutions aimed at superhuman AI and toward more immediate, adaptable control mechanisms for AIs that are powerful but not wildly superhuman. Prioritize understanding and mitigating risks in scenarios with limited political will and resources, and focus on pragmatic interventions that can be iterated quickly and are robust to AI scheming, rather than relying solely on complex, long-term alignment theories.

Key insights

AI safety requires adaptable strategies, robust control mechanisms, and a clear understanding of AI capabilities and failure modes.

Principles

Prioritize simple, iterative research over complex, long-term projects.
Distinguish between preventing misalignment and ensuring control despite misalignment.
Acknowledge the "cursed and complicated" nature of AI model internals.

Method

Automate video editing by using AI to transcribe the audio, identify speakers, and generate FFmpeg commands for shot cutting, so that long recordings can be produced efficiently without manual editing.
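
The shot-cutting step might be sketched roughly as follows. This is an illustration only, not the pipeline used for the podcast: the `Segment` shape, the per-speaker camera files (`buck_cam.mp4`, `ryan_cam.mp4`), the `max_gap` threshold, and the output naming are all hypothetical. It assumes a diarized transcript has already been reduced to (speaker, start, end) segments, merges consecutive segments from the same speaker, and builds one FFmpeg command per shot using the standard `-ss`, `-i`, and `-t` options.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One diarized stretch of speech (hypothetical shape, times in seconds)."""
    speaker: str
    start: float
    end: float


def merge_segments(segments, max_gap=0.5):
    """Merge consecutive segments by the same speaker separated by short pauses."""
    merged = []
    for seg in segments:
        prev = merged[-1] if merged else None
        if prev and prev.speaker == seg.speaker and seg.start - prev.end <= max_gap:
            # Extend the previous shot instead of starting a new one.
            merged[-1] = Segment(prev.speaker, prev.start, seg.end)
        else:
            merged.append(seg)
    return merged


def ffmpeg_cut_commands(segments, camera_for_speaker, out_prefix="shot"):
    """Build one FFmpeg argument list per shot; nothing is executed here."""
    commands = []
    for i, seg in enumerate(segments):
        duration = seg.end - seg.start
        commands.append([
            "ffmpeg", "-y",
            "-ss", f"{seg.start:.3f}",             # seek in the input to the shot start
            "-i", camera_for_speaker[seg.speaker],  # camera file assumed for this speaker
            "-t", f"{duration:.3f}",               # keep only the shot's duration
            "-c:v", "libx264", "-c:a", "aac",       # re-encode video and audio
            f"{out_prefix}_{i:04d}.mp4",
        ])
    return commands


if __name__ == "__main__":
    transcript = [
        Segment("Buck", 0.0, 2.0),
        Segment("Buck", 2.2, 5.0),
        Segment("Ryan", 5.0, 9.0),
    ]
    cameras = {"Buck": "buck_cam.mp4", "Ryan": "ryan_cam.mp4"}
    for cmd in ffmpeg_cut_commands(merge_segments(transcript), cameras):
        print(" ".join(cmd))
```

Each argument list could be passed to `subprocess.run`, and the resulting clips joined afterward. Keeping command generation separate from execution makes the plan easy to inspect before any video is touched.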

In practice

Use AI for transcription and speaker identification to streamline content production.
Develop internal tools for rapid prototyping and iteration in research.
Focus on modular safety interventions that integrate narrowly into existing systems.

Topics

Redwood Research
AI Safety Strategy
AI Control
Mechanistic Interpretability
AI Risk Assessment

Best for: Research Scientist, AI Researcher, AI Scientist, AI Student

Editorial summary, takeaway, and curation by AIssential. Original article published by Redwood Research blog.