DiRecT: Safe Diffusion-Based Planning via Receding-Horizon Denoising
Summary
DiRecT (Diffusion-based planning via Receding-horizon denoising with Terminal constraints) is a training-free algorithm for improving safety in diffusion-based planning and control models. Existing methods enforce feasibility on the noisy intermediate denoising samples, which overconstrains them. DiRecT instead enforces constraints only on the final clean trajectory, which avoids distorting the learned diffusion dynamics. Inspired by model predictive control, it uses a principled receding-horizon surrogate for constrained stochastic optimal control (SOC) that separates stochastic denoising from constraint satisfaction and progressively steers samples toward feasible final trajectories. DiRecT is flexible: it can integrate off-the-shelf optimizers, priors over environment dynamics, and additional soft rewards. In experiments on safe planning benchmarks, the authors report that DiRecT substantially improves deployment safety and task performance over current diffusion-based planning baselines.
Key takeaway
For machine learning engineers developing diffusion models for safety-critical planning, DiRecT is a way to improve deployment safety and task performance without retraining. Consider integrating it to enforce constraints only on final trajectories, which avoids the overconstraint and quality degradation that come from enforcing feasibility on intermediate samples. Because denoising and constraint satisfaction are cleanly separated, the constraint-handling component can be chosen independently of the learned model.
Key insights
DiRecT improves the safety of diffusion-based planners by enforcing constraints only on the final clean trajectory rather than overconstraining noisy intermediate samples.
Principles
- Enforce constraints solely on the final clean sample.
- Separate stochastic denoising from constraint satisfaction.
- Utilize receding-horizon surrogates for intractable SOC.
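One way to make these principles concrete is to write the terminal-constrained problem down. The formulation below is a generic sketch and not taken from the original article: the drift b, noise scale sigma, control u, feasible set C, and clean-sample estimate \hat{x}_0 are all assumed notation, with reverse time running from pure noise at tau = 0 to the clean trajectory at tau = T.

```latex
% Assumed generic form of terminal-constrained SOC for a denoising process.
\begin{aligned}
\min_{u}\quad & \mathbb{E}\left[\int_{0}^{T} \tfrac{1}{2}\,\lVert u_\tau \rVert^{2}\, d\tau\right] \\
\text{s.t.}\quad & dX_\tau = \bigl(b(X_\tau,\tau) + \sigma_\tau u_\tau\bigr)\, d\tau + \sigma_\tau\, dW_\tau, \\
& X_T \in \mathcal{C}.
\end{aligned}

% Assumed receding-horizon surrogate at denoising step k:
% re-solve a deterministic terminal problem from the current sample,
% apply one denoising step toward its solution, then repeat.
x_k^{\star} = \arg\min_{x \in \mathcal{C}} \tfrac{1}{2}\,\lVert x - \hat{x}_0(X_k, k) \rVert^{2}
```

The constraint appears only on the terminal state X_T, so no feasibility requirement is placed on the noisy intermediate states.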
Method
DiRecT is training-free. It replaces the intractable constrained stochastic optimal control (SOC) problem with a receding-horizon surrogate that separates stochastic denoising from constraint satisfaction. Samples are steered progressively toward feasible final trajectories, and no feasibility requirement is imposed on the noisy intermediate samples.
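The idea can be illustrated with a toy sampler. This is a minimal sketch under stated assumptions, not the authors' algorithm: `toy_denoiser` is a hypothetical stand-in for a learned model, the constraint is a simple box, and the update rule (re-noising around a projected clean estimate) is chosen only because it is easy to follow. At every step the constraint is applied to the predicted clean trajectory, while the noisy iterate itself is never clipped.

```python
import random

def toy_denoiser(x_t):
    """Hypothetical stand-in for a learned denoiser: returns a clean-trajectory estimate."""
    n = len(x_t)
    nominal = [2.0 * i / (n - 1) for i in range(n)]  # nominal path that exits the box [0, 1]
    return [0.5 * (x + m) for x, m in zip(x_t, nominal)]

def project_box(x, lo, hi):
    """Simplest possible constraint solver: Euclidean projection onto a box."""
    return [min(max(v, lo), hi) for v in x]

def sample_terminal_constrained(steps=50, horizon=8, lo=0.0, hi=1.0, seed=0):
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(horizon)]  # start from pure noise
    for k in range(steps, 0, -1):
        sigma_next = (k - 1) / steps          # noise level after this step (0 at the end)
        x0_hat = toy_denoiser(x)              # predicted clean trajectory
        target = project_box(x0_hat, lo, hi)  # constrain the clean estimate only
        # Re-noise around the feasible target; the noisy iterate x is left unconstrained.
        x = [t + sigma_next * rng.gauss(0.0, 1.0) for t in target]
    return x

final = sample_terminal_constrained()
```

Because the noise level reaches zero on the last step, the returned trajectory coincides with the last projected estimate and therefore lies inside the box, even though intermediate iterates are free to leave it.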
In practice
- Integrate off-the-shelf or domain-specific optimizers.
- Incorporate priors over environment dynamics.
- Optimize additional soft rewards.
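The options above suggest that the constraint solver is a pluggable component. The sketch below shows one possible interface, and every name in it is hypothetical: `solve_terminal` takes a clean-trajectory estimate, a user-supplied projection (standing in for an off-the-shelf or domain-specific optimizer), and the gradient of a soft reward, and runs projected gradient ascent. The box constraint and goal-reaching reward are illustrative choices, not details from the original article.

```python
def project_box(x, lo, hi):
    """Projection onto box constraints; swap in any domain-specific solver."""
    return [min(max(v, lo), hi) for v in x]

def goal_reward_grad(x, goal):
    """Gradient of the hypothetical soft reward r(x) = -(x[-1] - goal)**2."""
    g = [0.0] * len(x)
    g[-1] = -2.0 * (x[-1] - goal)
    return g

def solve_terminal(x0_hat, project, reward_grad, weight=5.0, lr=0.1, iters=200):
    """Projected gradient ascent on -0.5 * ||x - x0_hat||^2 + weight * r(x)."""
    x = project(x0_hat)
    for _ in range(iters):
        rg = reward_grad(x)
        # Ascent direction: pull toward the model's estimate plus the weighted reward gradient.
        step = [(h - xi) + weight * gi for h, xi, gi in zip(x0_hat, x, rg)]
        x = project([xi + lr * si for xi, si in zip(x, step)])
    return x

x0_hat = [-0.5, 0.2, 0.4, 1.5]  # clean estimate that violates the box [0, 1]
plan = solve_terminal(
    x0_hat,
    project=lambda x: project_box(x, 0.0, 1.0),
    reward_grad=lambda x: goal_reward_grad(x, goal=0.9),
)
```

Keeping the projection and the reward gradient as plain callables means either can be replaced without touching the rest of the loop, for example by a projection that also encodes a known dynamics model.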
Topics
- Diffusion Models
- Safe Planning
- Receding-Horizon Control
- Stochastic Optimal Control
- Constraint Satisfaction
- Machine Learning Safety
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Robotics Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.