LadderMan: Learning Humanoid Perceptive Ladder Climbing
Summary
LadderMan is a unified system enabling Unitree G1 humanoid robots to robustly climb diverse ladders and perform on-ladder manipulation. It utilizes a scalable two-stage learning pipeline: hybrid motion tracking learns multiple climbing experts from a single reference motion, which are then distilled into a unified depth-based visuomotor policy via hybrid imitation and reinforcement learning. To facilitate zero-shot real-world deployment, LadderMan employs the Fast-FoundationStereo vision foundation model to bridge the sim-to-real gap in depth perception. A separate dual-agent manipulation policy allows stable teleoperated tasks, such as adjusting paintings or replacing light bulbs, while maintaining balance. Experiments demonstrate robust climbing across varying ladder geometries and materials, achieving human-comparable speeds of approximately 3.4 seconds per rung.
Key takeaway
For robotics engineers developing humanoid systems for industrial or maintenance tasks, LadderMan offers a robust framework for complex multi-contact locomotion. You can achieve reliable ladder climbing and stable on-ladder manipulation, even with diverse ladder geometries, by adopting its two-stage learning and sim-to-real perception bridging techniques. Consider its dual-agent approach for integrating manipulation without compromising balance.
Key insights
LadderMan enables robust humanoid ladder climbing and manipulation through perceptive, sim-to-real learning.
Principles
- Hybrid motion tracking learns diverse expert policies from a single reference motion.
- Vision foundation models effectively bridge sim-to-real depth perception gaps.
- Dual-agent learning decouples lower-body stabilization from upper-body manipulation.
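The decoupling in the last principle can be pictured as two policies that each own a disjoint set of joints and whose outputs are merged into one whole-body command. The sketch below is only an illustration of that idea: the joint index split, the policy signature, and the function name `compose_action` are hypothetical and not taken from the paper.

```python
from typing import Callable, List, Sequence

# Hypothetical joint index split; the real Unitree G1 joint layout is not
# given in this summary.
LOWER_BODY_JOINTS = [0, 1, 2, 3, 4, 5]
UPPER_BODY_JOINTS = [6, 7, 8, 9]

# A policy maps an observation vector to target values for its own joints.
Policy = Callable[[Sequence[float]], List[float]]


def compose_action(
    obs: Sequence[float], lower_policy: Policy, upper_policy: Policy
) -> List[float]:
    """Merge the outputs of two agents into one whole-body action vector."""
    lower = lower_policy(obs)  # stabilization agent (legs)
    upper = upper_policy(obs)  # manipulation agent (arms)
    if len(lower) != len(LOWER_BODY_JOINTS) or len(upper) != len(UPPER_BODY_JOINTS):
        raise ValueError("policy output size does not match its joint group")

    action = [0.0] * (len(LOWER_BODY_JOINTS) + len(UPPER_BODY_JOINTS))
    # Each agent writes only to the joints it owns, so neither can
    # overwrite the other's commands.
    for idx, value in zip(LOWER_BODY_JOINTS, lower):
        action[idx] = value
    for idx, value in zip(UPPER_BODY_JOINTS, upper):
        action[idx] = value
    return action
```

Keeping the joint sets disjoint is the design choice that matters here: the stabilizing agent can react to disturbances caused by the arms without ever issuing arm commands itself.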
Method
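A two-stage pipeline first learns expert climbing policies via hybrid motion tracking, then distills them into a single depth-based visuomotor policy using hybrid imitation and reinforcement learning. A vision foundation model (Fast-FoundationStereo) supplies the depth estimates used to narrow the sim-to-real gap in perception.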
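One common way to combine imitation and reinforcement learning during distillation is a weighted sum of an imitation term (student actions versus expert actions) and an RL term. The sketch below assumes that form; the mean-squared-error imitation term, the `imitation_weight` parameter, and the function name `hybrid_loss` are assumptions for illustration, and the summary does not state the paper's actual objective.

```python
from typing import Sequence


def hybrid_loss(
    student_actions: Sequence[float],
    expert_actions: Sequence[float],
    rl_loss: float,
    imitation_weight: float = 0.5,
) -> float:
    """Blend an imitation term with an RL term (assumed weighted-sum form)."""
    if len(student_actions) != len(expert_actions) or not student_actions:
        raise ValueError("action vectors must be non-empty and equal in length")
    if not 0.0 <= imitation_weight <= 1.0:
        raise ValueError("imitation_weight must lie in [0, 1]")

    # Imitation term: mean squared error between student and expert actions.
    mse = sum(
        (s - e) ** 2 for s, e in zip(student_actions, expert_actions)
    ) / len(student_actions)

    # Convex combination: weight 1.0 is pure imitation, 0.0 is pure RL.
    return imitation_weight * mse + (1.0 - imitation_weight) * rl_loss
```

Under this assumed form, a schedule that lowers `imitation_weight` over training would shift the student from copying the experts toward optimizing reward directly.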
In practice
- Deploy zero-shot sim-to-real climbing on humanoids like the Unitree G1.
- Perform stable on-ladder manipulation via teleoperation.
- Use rung-focused masking to improve depth perception robustness.
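The summary does not describe how rung-focused masking is implemented. One plausible reading is that depth pixels belonging to rungs are kept and everything else is replaced by a constant, so the policy sees the same simplified input in simulation and on hardware. The sketch below illustrates that reading only; the function name, the binary-mask input, and the `fill_value` parameter are hypothetical.

```python
from typing import List, Sequence


def mask_depth_to_rungs(
    depth: Sequence[Sequence[float]],
    rung_mask: Sequence[Sequence[bool]],
    fill_value: float = 0.0,
) -> List[List[float]]:
    """Keep depth values on rung pixels; replace all others with fill_value."""
    if len(depth) != len(rung_mask):
        raise ValueError("depth and mask must have the same number of rows")

    masked = []
    for depth_row, mask_row in zip(depth, rung_mask):
        if len(depth_row) != len(mask_row):
            raise ValueError("depth and mask rows must have the same length")
        # Pixels outside the rung mask carry no information after masking.
        masked.append(
            [d if keep else fill_value for d, keep in zip(depth_row, mask_row)]
        )
    return masked
```

For example, `mask_depth_to_rungs([[1.0, 2.0], [3.0, 4.0]], [[True, False], [False, True]])` keeps only the two diagonal depth values and fills the rest with `0.0`.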
Topics
- Humanoid Robotics
- Ladder Climbing
- Reinforcement Learning
- Sim-to-Real Transfer
- Vision Foundation Models
- Loco-Manipulation
Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Robotics Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.