FCBV-Net: Category-Level Robotic Garment Smoothing via Feature-Conditioned Bimanual Value Prediction

2026-04-16 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

FCBV-Net (Feature-Conditioned Bimanual Value Network) is a deep learning architecture aimed at category-level generalization in robotic garment smoothing. It operates on 3D point clouds to cope with the high dimensionality of garment configurations and with variation within a garment category. The network conditions bimanual action value prediction on dense geometric features from a pre-trained, frozen backbone, which is intended to make it robust to intra-category variation; trainable downstream components then learn the task-specific policy on top of these fixed features. In simulated GarmentLab experiments on the CLOTH3D dataset, FCBV-Net showed only an 11.5% efficiency drop (Steps80) on unseen garments, against a 96.2% drop for a 2D image-based baseline. It also reached 89% final coverage, against 83% for a 3D correspondence-based baseline.
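To make the idea of an efficiency drop concrete, here is a minimal sketch of how such a figure could be computed. It assumes that Steps80 counts the steps needed to reach 80% coverage (so lower is better) and that the drop is the relative increase in that count from seen to unseen garments; the summary states neither. The helper name `relative_drop` and the sample numbers are hypothetical and are not the paper's measurements.

```python
def relative_drop(seen_steps: float, unseen_steps: float) -> float:
    """Relative increase in step count from seen to unseen garments, in percent.

    Assumed definition for illustration only; the paper may define it differently.
    """
    if seen_steps <= 0:
        raise ValueError("seen_steps must be positive")
    return 100.0 * (unseen_steps - seen_steps) / seen_steps


# Hypothetical numbers: 8 steps on seen garments, 10 on unseen ones.
print(relative_drop(8.0, 10.0))  # 25.0
```

Under this reading, a smaller percentage means the policy loses less efficiency when it meets garments it was not trained on.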

Key takeaway

For research scientists developing robotic manipulation systems for deformable objects, FCBV-Net's decoupling of geometric feature learning from policy learning is a strategy worth considering. Pre-training dense 3D geometric features and then freezing them may improve category-level generalization, particularly when items within a category vary widely. In the reported garment-smoothing experiments, this design lost far less efficiency on unseen garments than the 2D image-based baseline did, which suggests less overfitting to the training garments.

Key insights

Decoupling geometric understanding from action value learning enhances robotic garment manipulation generalization.

Principles

Frozen pre-trained features improve generalization.
3D point clouds offer robust geometric input.
Decoupling geometric and policy learning is effective.

Method

FCBV-Net uses a frozen PointNet++ backbone to extract dense geometric features. Trainable modules (ValueDecoderPN++, Primitive Selection Head, and Final Bimanual Value Head) then predict bimanual action values and select the best primitive.
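The data flow described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's implementation: the per-point features are hard-coded stand-ins for backbone output, the primitive names, weights, and the additive pair score are all hypothetical, and the real heads are learned networks.

```python
from itertools import combinations

# Hypothetical per-point features, standing in for the frozen backbone's output.
features = {
    0: [0.9, 0.1],
    1: [0.2, 0.8],
    2: [0.5, 0.5],
    3: [0.7, 0.6],
}

# Hypothetical "trained" weights; real heads would be learned networks.
PRIMITIVE_WEIGHTS = {"primitive_a": [1.0, 0.0], "primitive_b": [0.0, 1.0]}
PAIR_WEIGHTS = [0.5, 1.0]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def select_primitive(feats):
    """Score each primitive on the mean-pooled feature and return the best one."""
    n = len(feats)
    pooled = [sum(f[i] for f in feats.values()) / n for i in range(2)]
    return max(PRIMITIVE_WEIGHTS, key=lambda name: dot(PRIMITIVE_WEIGHTS[name], pooled))


def best_pair(feats):
    """Score every unordered pair of points and return the highest-valued pair."""
    def pair_value(pair):
        i, j = pair
        summed = [a + b for a, b in zip(feats[i], feats[j])]
        return dot(PAIR_WEIGHTS, summed)  # placeholder additive score
    return max(combinations(sorted(feats), 2), key=pair_value)


primitive = select_primitive(features)
left, right = best_pair(features)
print(primitive, (left, right))  # primitive_a (1, 3)
```

The point of the sketch is the ordering: features are computed once, and both the primitive choice and the two-handed action value are read off those same features.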

In practice

Utilize 3D point clouds for deformable object state.
Pre-train geometric features and freeze them.
Combine human annotations with self-supervised data.
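The "pre-train, then freeze" recommendation can be illustrated with a minimal sketch in plain Python. It assumes a fixed linear projection as a stand-in for pre-trained geometric features and a linear value head trained by stochastic gradient descent; all names, weights, and data here are hypothetical, and a real system would use a deep learning framework and a learned backbone.

```python
# Stand-in for pre-trained backbone weights: a fixed 3D -> 2D projection.
# Stored as tuples so training code cannot modify it in place.
BACKBONE = ((1.0, 0.0, 0.5), (0.0, 1.0, -0.5))


def frozen_features(point):
    """Project one 3D point with the frozen weights (never updated below)."""
    return [sum(w * x for w, x in zip(row, point)) for row in BACKBONE]


def predict(head, point):
    """Value estimate: a linear head applied to the frozen features."""
    return sum(h * f for h, f in zip(head, frozen_features(point)))


def train_head(points, targets, lr=0.05, epochs=200):
    """Fit the value head by SGD on squared error; only `head` changes."""
    head = [0.0, 0.0]
    for _ in range(epochs):
        for point, target in zip(points, targets):
            feats = frozen_features(point)
            error = sum(h * f for h, f in zip(head, feats)) - target
            head = [h - lr * error * f for h, f in zip(head, feats)]
    return head


# Synthetic data: the eight corners of a cube, labelled by a known head.
points = [[x, y, z] for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-1.0, 1.0)]
true_head = [0.7, -0.3]
targets = [predict(true_head, p) for p in points]

backbone_before = BACKBONE
head = train_head(points, targets)
```

After training, `head` recovers `true_head` to within numerical tolerance while `BACKBONE` is unchanged, which is the division of labour the list above describes: geometry is fixed, and only the task-specific part learns.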

Topics

FCBV-Net
Robotic Garment Smoothing
Category-Level Generalization
Bimanual Manipulation
3D Point Clouds

Best for: Research Scientist, AI Scientist, Robotics Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.