Training a Unitree G1 to Walk w/ Reinforcement Learning

2025-12-19 · Source: sentdex · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Software Development & Engineering · Depth: Advanced, extended

Summary

The author trained a Unitree G1 humanoid robot to walk using reinforcement learning, overcoming significant sim-to-real transfer challenges. The project used MJLab for simulation and focused on achieving "sim-to-sim" fidelity as a precursor to "sim-to-real" deployment. Key modifications included implementing an explicit PD controller and training without linear velocity observations, because the real G1 lacks this sensor. The trained policy, a neural network with fewer than 200,000 parameters, demonstrated stable locomotion on various terrains, including slopes and deep leaves, despite the robot's arm repeatedly detaching. The author also announced joining Lucky Robots, a company developing a new simulator aimed at scalable physics and scene creation for complex robotic tasks such as household chores. Current simulators struggle with such tasks because of memory limitations and the complexity of scene generation.
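To give a sense of scale for "fewer than 200,000 parameters", the sketch below counts the weights and biases of a small fully connected policy. The layer sizes are hypothetical and chosen only for illustration; the summary does not state the actual architecture.

```python
def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected network."""
    return sum(
        n_in * n_out + n_out  # weight matrix plus bias vector per layer
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical sizes (assumption, not from the source):
# 96 observation values, three hidden layers, 29 joint targets.
sizes = [96, 256, 256, 128, 29]
total = mlp_param_count(sizes)
print(total, total < 200_000)
```

Under these assumed sizes the network stays well below the 200,000 figure, which suggests how compact a locomotion policy of this kind can be.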

Key takeaway

AI engineers developing embodied AI should focus on closing the sim-to-real gap by making the simulation environment reflect real-world physics and sensor limitations as closely as possible. Prioritize "sim-to-sim" validation to debug policy behavior before deploying to physical hardware, and check whether any quantity in the observation space (such as linear velocity) cannot be measured on the real robot. This approach reduces unexpected behavior on hardware and speeds up deployment.
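One way to picture the observation-space point is a builder that simply leaves base linear velocity out of the policy input. This is a minimal sketch using NumPy; the dictionary keys, the set of observation terms, and the 29-joint count are assumptions for illustration, not the article's actual observation layout.

```python
import numpy as np


def build_observation(state, include_lin_vel=False):
    """Concatenate observation terms into one flat policy input vector."""
    parts = []
    if include_lin_vel:
        # Available in simulation, but assumed unavailable on the real robot.
        parts.append(state["base_lin_vel"])
    parts += [
        state["base_ang_vel"],       # from the IMU gyroscope
        state["projected_gravity"],  # gravity direction in the base frame
        state["command"],            # commanded velocity (vx, vy, yaw rate)
        state["joint_pos"],
        state["joint_vel"],
        state["last_action"],
    ]
    return np.concatenate(parts).astype(np.float32)


n_joints = 29  # hypothetical joint count
state = {
    "base_lin_vel": np.zeros(3),
    "base_ang_vel": np.zeros(3),
    "projected_gravity": np.array([0.0, 0.0, -1.0]),
    "command": np.array([0.5, 0.0, 0.0]),
    "joint_pos": np.zeros(n_joints),
    "joint_vel": np.zeros(n_joints),
    "last_action": np.zeros(n_joints),
}
obs = build_observation(state)
print(obs.shape)
```

Training with `include_lin_vel=False` from the start means the policy never depends on a signal it will not receive after deployment.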

Key insights

Achieving sim-to-real transfer for robot locomotion requires meticulous sim-to-sim fidelity and addressing hardware-specific observation gaps.

Principles

Sim-to-sim fidelity is crucial before sim-to-real.
Actuator technology improvements lower sim-to-real barriers.
Neural networks struggle with out-of-distribution data.

Method

Train a general-purpose, steerable gait using velocity-based reinforcement learning in MJLab, employing an explicit PD controller and excluding unobservable real-world sensor data like linear velocity from observations.
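A rough sketch of what an explicit PD controller computes: the torque is calculated in user code from the position error and joint velocity, then clipped to a motor limit, rather than being left to the simulator's built-in actuator model. The gains, limits, and function name here are illustrative assumptions, not values or APIs from the article or MJLab.

```python
import numpy as np


def pd_torque(q_target, q, qd, kp, kd, tau_limit):
    """Explicit PD law: tau = kp * (q_target - q) - kd * qd, clipped to limits."""
    tau = kp * (q_target - q) - kd * qd
    return np.clip(tau, -tau_limit, tau_limit)


# Two hypothetical joints with made-up gains and torque limits.
q_target = np.array([0.1, -0.2])   # target joint positions from the policy (rad)
q = np.array([0.0, 0.0])           # measured joint positions (rad)
qd = np.array([1.0, 0.0])          # measured joint velocities (rad/s)
kp = np.array([100.0, 100.0])
kd = np.array([2.0, 2.0])
tau_limit = np.array([5.0, 50.0])  # per-joint torque limits (N*m)

tau = pd_torque(q_target, q, qd, kp, kd, tau_limit)
print(tau)
```

Computing the torque explicitly makes it easier to mirror the same law, gains, and saturation that the real motor drivers apply.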

In practice

Use MJLab for robust physics simulation.
Implement explicit PD controllers for sim-to-real.
Consider freezing robot arms to reduce jitter.
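As an illustration of the arm-freezing suggestion, the policy's joint targets can be overridden with a fixed default pose for the arm joints before they reach the controller. This is a sketch under assumptions: the joint indices and default pose below are hypothetical, and the article may have implemented freezing differently.

```python
import numpy as np


def freeze_joints(q_target, default_pose, frozen_idx):
    """Return joint targets with the frozen joints held at their default pose."""
    out = np.array(q_target, dtype=float, copy=True)
    out[frozen_idx] = default_pose[frozen_idx]
    return out


# Hypothetical 6-joint example: indices 4 and 5 stand in for arm joints.
policy_targets = np.array([0.3, -0.1, 0.2, 0.0, 0.9, -0.7])
default_pose = np.array([0.0, 0.0, 0.0, 0.0, 0.2, 0.2])
arm_idx = np.array([4, 5])

frozen = freeze_joints(policy_targets, default_pose, arm_idx)
print(frozen)
```

The leg targets pass through unchanged, so only the arms stop reacting to small fluctuations in the policy output.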

Topics

Reinforcement Learning
Sim-to-Real Transfer
Humanoid Robotics
Robotics Simulation
Unitree G1

Best for: Robotics Engineer, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by sentdex.