LUNA: Learning Universal 3D Human Animation Beyond Skinning

2026-06-30 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computer Vision & Pattern Recognition · Depth: Expert, quick

Summary

LUNA is a universal neural animation model that creates photorealistic, animatable 3D human avatars directly from 2D controls such as images, keypoints, and sketches, without relying on Linear Blend Skinning (LBS) or parametric body models. Dropping these components removes the explicit body-fitting step, which often introduces artifacts and limits expressivity. At its core, LUNA uses a transformer-based motion regressor that disentangles global rigid motion from fine-grained local dynamics, capturing both coherent movement and subtle non-rigid effects. Training relies on hybrid supervision: structural priors are distilled from an LBS teacher, and the loss is designed so the model can learn from both limited fitted data and large unlabeled in-the-wild videos. LUNA achieves competitive visual fidelity, produces realistic human motion, and demonstrates zero-shot cross-identity generalization across diverse driving modalities. It is presented as the first end-to-end 3D animatable model supporting implicit 2D driving.
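
For background on what LUNA avoids, the sketch below shows textbook Linear Blend Skinning for a single vertex, where the posed position is the weighted sum of per-bone rigid transforms. It is general background, not code from the LUNA paper; the helper names (rot_z, apply_rigid, lbs_vertex) and the two-bone setup are illustrative assumptions. The example reproduces the well-known "candy-wrapper" collapse that occurs when two equally weighted bones are twisted 180 degrees apart.

```python
import math

def rot_z(theta):
    """3x3 rotation matrix about the z axis, as nested lists."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid(R, t, v):
    """Return R @ v + t for a 3-vector v."""
    return [sum(R[i][j] * v[j] for j in range(3)) + t[i] for i in range(3)]

def lbs_vertex(v, weights, transforms):
    """Textbook LBS for one vertex: v' = sum_k w_k * (R_k v + t_k)."""
    out = [0.0, 0.0, 0.0]
    for w, (R, t) in zip(weights, transforms):
        p = apply_rigid(R, t, v)
        out = [o + w * pi for o, pi in zip(out, p)]
    return out

# Hypothetical setup: one bone at rest, one twisted 180 degrees about z.
bones = [
    (rot_z(0.0), [0.0, 0.0, 0.0]),
    (rot_z(math.pi), [0.0, 0.0, 0.0]),
]
rest_vertex = [1.0, 0.0, 0.0]
blended = lbs_vertex(rest_vertex, [0.5, 0.5], bones)
# The vertex sits at distance 1 from the twist axis in the rest pose,
# but the linear blend pulls it onto the axis (up to rounding error).
```

Because the blend is linear in the transformed positions rather than in the rotations, the averaged vertex loses volume around the joint. This is one commonly cited reason to look beyond plain skinning.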

Key takeaway

For computer vision engineers building photorealistic 3D human avatars, LUNA removes the constraints and artifacts of Linear Blend Skinning. Consider LBS-free neural animation models like LUNA when you need greater expressivity and zero-shot generalization from diverse 2D inputs. Bypassing explicit body fitting also simplifies the pipeline and enables more realistic and versatile character animation.

Key insights

LUNA enables LBS-free 3D human animation from diverse 2D inputs using neural Gaussian deformations and a transformer-based motion regressor.

Principles

Disentangle global rigid from local dynamics.
Use hybrid supervision for 2D-to-3D lifting.
Bypass explicit body fitting for expressivity.
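
The first principle above can be illustrated with a minimal sketch, assuming the deformation of each Gaussian center is factored into one shared rigid transform plus a small per-point offset. The function name deform_centers, the order of composition (offset first, then rigid transform), and all numeric values are assumptions made for illustration; the summary does not specify how LUNA parameterizes or composes the two components.

```python
import math

def rot_z(theta):
    """3x3 rotation matrix about the z axis, as nested lists."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def deform_centers(centers, R, t, local_offsets):
    """Add a per-point local offset, then apply one shared rigid transform.

    centers, local_offsets: lists of 3-vectors of equal length.
    R, t: global rotation (3x3) and translation (3-vector).
    The composition order is an assumption of this sketch.
    """
    deformed = []
    for c, d in zip(centers, local_offsets):
        p = [ci + di for ci, di in zip(c, d)]  # local, non-rigid part
        q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
        deformed.append(q)  # global, rigid part
    return deformed

# Hypothetical canonical centers and predicted motion.
centers = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
offsets = [[0.0, 0.0, 0.0], [0.0, 0.1, 0.0]]  # only the second point moves locally
posed = deform_centers(centers, rot_z(math.pi / 2), [0.0, 0.0, 1.0], offsets)
```

Keeping the two parts separate means the shared transform carries coherent whole-body movement while the offsets are free to stay small and model subtle non-rigid effects.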

Method

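LUNA maps 2D controls to 3D Gaussian deformations via a transformer-based motion regressor. Training uses hybrid supervision: structural priors distilled from an LBS teacher, plus a loss that allows learning from both limited fitted data and large unlabeled in-the-wild videos.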
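
One way to picture hybrid supervision is a loss with two terms: a distillation term that is applied only to samples with a fitted teacher output, and an image-space term that is applied to every sample. The sketch below is a toy version under that assumption. The dictionary keys, the use of mean squared error, and the weights w_distill and w_image are hypothetical, and the actual LUNA loss is not described in this summary.

```python
def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def hybrid_loss(samples, w_distill=1.0, w_image=1.0):
    """Toy hybrid loss over a mixed batch.

    Each sample is a dict with (hypothetical) keys:
      'pred_3d'    predicted 3D deformation, flattened
      'teacher_3d' LBS-teacher target, or None for unlabeled video frames
      'pred_2d'    prediction projected to image space
      'target_2d'  observed 2D target
    Returns (total, distill_term, image_term).
    """
    distill_terms = [
        mse(s["pred_3d"], s["teacher_3d"])
        for s in samples
        if s["teacher_3d"] is not None  # skip unlabeled samples
    ]
    image_terms = [mse(s["pred_2d"], s["target_2d"]) for s in samples]
    distill = sum(distill_terms) / len(distill_terms) if distill_terms else 0.0
    image = sum(image_terms) / len(image_terms)
    return w_distill * distill + w_image * image, distill, image

# One fitted sample and one unlabeled in-the-wild sample (made-up numbers).
batch = [
    {"pred_3d": [0.0, 1.0], "teacher_3d": [0.0, 0.0],
     "pred_2d": [1.0], "target_2d": [1.0]},
    {"pred_3d": [0.3, 0.2], "teacher_3d": None,
     "pred_2d": [2.0], "target_2d": [0.0]},
]
total, distill, image = hybrid_loss(batch)
```

With this structure, unlabeled frames still contribute gradient through the image-space term, which is the property that lets a large unlabeled video corpus supplement a small fitted dataset.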

In practice

Generate avatars from images, keypoints, or sketches.
Animate unseen characters zero-shot.
Create realistic human motion without LBS artifacts.

Topics

3D Human Animation
Neural Avatars
Gaussian Splatting
Linear Blend Skinning
Transformer Models
2D-to-3D Lifting
Computer Vision

Best for: Research Scientist, AI Scientist, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.