#358 How AI Agents Will Work While You Sleep | Ruslan Salakhutdinov, Professor at Carnegie Mellon

2026-05-04 · Source: DataFramed · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Emerging Technologies & Innovation · Depth: Expert, extended

Summary

Ruslan Salakhutdinov, UPMC Professor of Computer Science at Carnegie Mellon University and a former AI research executive at Apple and Meta, discusses the current state and future of AI agents. He highlights significant advances in coding and computer-use agents, which can now handle tasks that take hours to execute. A key challenge remains the "credit assignment problem" in long-horizon tasks: when feedback arrives only at the end of a long run, it is hard to tell which intermediate steps deserve the credit, and verifiable intermediate rewards are difficult to define. The discussion also covers the emergence of multi-agent systems for improved reasoning and orchestration, the need for robust safety mechanisms and guardrails to prevent destructive actions, and lessons from self-driving cars about how hard it is to reach near-100% reliability. Salakhutdinov emphasizes that full autonomy for critical tasks is still distant, but agents are increasingly useful for routine, non-critical automation.

Key takeaway

AI engineers developing agentic systems should prioritize applications with verifiable outcomes, such as coding or routine tasks where a 90% success rate is acceptable. Implement layered safeguards, including deterministic security controls and human-in-the-loop workflows, especially for high-stakes tasks. Invest in robust uncertainty estimation and clear user interfaces for agent feedback, so that users can calibrate their trust to the agents' current reliability limits.
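The layered safeguards described above can be sketched as a small gate in front of the agent's tool calls. This is a minimal illustration, not a method described in the episode: the action names, the `confidence` field, the `gate` function, and the 0.9 threshold are all hypothetical, chosen only to show a deterministic control sitting ahead of a human approval step.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical action names, used only for illustration.
BLOCKED = {"delete_database", "force_push"}
NEEDS_APPROVAL = {"send_email", "deploy"}


@dataclass
class Action:
    name: str
    confidence: float  # assumed: the agent's own uncertainty estimate in [0, 1]


def gate(action: Action,
         approve: Callable[[Action], bool],
         min_confidence: float = 0.9) -> str:
    """Return 'blocked', 'executed', or 'rejected' for a proposed action."""
    # Layer 1: deterministic control. Neither the model nor a reviewer can override it.
    if action.name in BLOCKED:
        return "blocked"
    # Layer 2: human-in-the-loop for consequential or low-confidence actions.
    if action.name in NEEDS_APPROVAL or action.confidence < min_confidence:
        return "executed" if approve(action) else "rejected"
    # Layer 3: routine, high-confidence actions run unattended.
    return "executed"
```

In this sketch `approve` stands in for whatever review interface a team uses. The design point is that the block list is checked first, so a destructive action is refused even when the agent is highly confident.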

Key insights

AI agents excel at coding and routine tasks, but achieving reliable autonomy for complex, critical applications remains challenging.

Principles

Verifiable rewards accelerate agent development.
Multi-agent systems enhance complex task execution.
Human-in-the-loop is crucial for non-verifiable tasks.

Method

Training agents for long-horizon tasks requires defining intermediate or partial rewards, often using rubric-based judges, to address the credit assignment problem and improve robustness.
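One way to picture rubric-based partial rewards is the sketch below. It is an assumption-laden toy, not the training setup discussed in the episode: the rubric format, `step_reward`, and `discounted_returns` are hypothetical names, and the rubric checks are plain Python predicates standing in for a learned or LLM-based judge. The second function shows the standard discounted-return calculation, which is one common way to spread a late reward back over earlier steps.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical rubric: criterion name -> (weight, check applied to one step's output).
Rubric = Dict[str, Tuple[float, Callable[[str], bool]]]


def step_reward(output: str, rubric: Rubric) -> float:
    """Weighted fraction of rubric criteria that a single step satisfies."""
    total = sum(weight for weight, _ in rubric.values())
    earned = sum(weight for weight, check in rubric.values() if check(output))
    return earned / total


def discounted_returns(rewards: List[float], gamma: float = 0.9) -> List[float]:
    """Credit each step with its own reward plus discounted future rewards."""
    returns: List[float] = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    return returns[::-1]
```

With rewards `[0.0, 0.0, 1.0]` and `gamma=0.5`, the returns are `[0.25, 0.5, 1.0]`: the early steps receive some credit for the final success, but less than the step that produced it. Adding non-zero rubric scores at intermediate steps gives the learner a denser signal than the final outcome alone.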

In practice

Automate coding tasks with verifiable unit tests.
Implement multi-agent systems for parallel sub-task execution.
Use human verification for critical agent outputs.
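The three practices above can be combined in one small sketch: sub-tasks run in parallel, each result is checked by a deterministic verifier (playing the role of a unit test), and anything that fails the check is queued for human review. The function name `run_subtasks` and the dictionary-based interface are hypothetical, and the sub-tasks here are plain callables rather than real agents.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def run_subtasks(
    subtasks: Dict[str, Callable[[], int]],
    verifiers: Dict[str, Callable[[int], bool]],
) -> Tuple[Dict[str, int], List[str]]:
    """Run sub-tasks in parallel; return verified results and names needing review."""
    # Each sub-task stands in for one worker agent. Exceptions are not handled here.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn) for name, fn in subtasks.items()}
        results = {name: future.result() for name, future in futures.items()}
    # Accept a result only if its verifier (the "unit test") passes.
    accepted = {name: value for name, value in results.items() if verifiers[name](value)}
    # Everything else goes to a human instead of being silently used.
    needs_review = sorted(name for name in results if name not in accepted)
    return accepted, needs_review
```

A real orchestrator would also need timeouts, retries, and error handling for sub-tasks that raise. The sketch only shows the routing decision between verified output and human review.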

Topics

AI Agents
Long Horizon Tasks
Credit Assignment Problem
Multi-Agent Systems
Agent Safety & Guardrails

Best for: AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by DataFramed.