One-Shot Novel View and Pose Human Image Synthesis via 3D Prior Guided Diffusion Model

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Computer Vision & Pattern Recognition) · Depth: Expert, quick

Summary

The paper proposes a conditional denoising diffusion model for one-shot novel view and pose human image synthesis. It targets limitations of existing 2D pose transfer and generalizable human NeRF methods by dividing the synthesis problem into a sequence of conditional denoising steps. Two 3D human priors, a 3D normal map and a color prompt, serve as the geometry and color conditions. They let the model generate humans in complex, arbitrary poses and accurately recover occluded or invisible body parts. For novel persons, a self-reconstruction-based customized refinement enhances fine details. Experiments on public datasets are reported to show significant performance improvements and better generalization than previous methods. The authors state that the code will be publicly available at https://github.com/Yankeegsj/3DPGDM.

Key takeaway

For computer vision engineers building human image synthesis systems, this 3D prior-guided diffusion model is an option for one-shot novel view and pose generation that addresses limitations of 2D pose transfer and generalizable NeRFs. Examine its architecture, particularly how the 3D normal map and color prompt are integrated as conditions, to see how it synthesizes complex poses and occluded parts with high quality and generalizes across diverse subjects.

Key insights

A 3D prior-guided conditional diffusion model enables high-quality one-shot novel view and pose human image synthesis.

Principles

Method

A conditional denoising diffusion model synthesizes novel human views and poses through iterative denoising, guided by a 3D normal map (the geometry condition) and a color prompt (the color condition). A self-reconstruction-based refinement then enhances fine details for new subjects.
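The iterative, condition-guided denoising described above can be sketched as a generic DDPM-style ancestral sampling loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear noise schedule, the function names, the `toy_denoiser` stand-in for a trained network, and the choice to represent both the normal map and the color prompt as image-shaped arrays are all hypothetical. The summary does not specify the sampler, the schedule, or how the conditions are fed to the network.

```python
import numpy as np


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumption; the source does not state one)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_denoiser(x_t, t, normal_map, color_prompt):
    """Hypothetical stand-in for a trained noise-prediction network.

    A real model would be a neural network conditioned on both priors.
    This toy ignores t and simply treats the average of the two
    conditions as the clean image, so the loop has something to call.
    """
    pseudo_clean = 0.5 * (normal_map + color_prompt)
    return x_t - pseudo_clean


def sample(denoiser, normal_map, color_prompt, num_steps=50, seed=0):
    """Standard DDPM ancestral sampling, conditioned on two priors."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)

    # Start from pure Gaussian noise with the same shape as the target image.
    x = rng.standard_normal(normal_map.shape)

    for t in range(num_steps - 1, -1, -1):
        # Predict the noise, given the geometry and color conditions.
        eps_hat = denoiser(x, t, normal_map, color_prompt)

        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_hat) / np.sqrt(alphas[t])

        # Add fresh noise at every step except the last one.
        if t > 0:
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x = mean
    return x


if __name__ == "__main__":
    # Placeholder conditions: an 8x8 RGB normal map and color prompt.
    normal_map = np.zeros((8, 8, 3))
    color_prompt = np.ones((8, 8, 3))
    image = sample(toy_denoiser, normal_map, color_prompt, num_steps=10)
    print(image.shape)
```

Passing the conditions to the denoiser at every step is what makes each denoising step conditional. The self-reconstruction refinement mentioned above is not shown, since the summary gives no detail on how it is carried out.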

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.