Train, Test, Re-evaluate: Schedule-Sensitive Evaluation of Generative Data for Hand Detection

Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computer Vision · Depth: Advanced, quick

Summary

A study investigates whether generative inpainting can produce synthetic hand data to augment real datasets, addressing the scarcity of diverse hand appearances (e.g., gloves, tattoos) in public datasets used for occupational safety. The researchers trained YOLOv8n hand detectors under six training-and-scheduling regimes, each with three random seeds, and evaluated mAP@0.5 and mAP@0.5:0.95 on both a standard real test set and a real-gloves-only split. A two-stage schedule, which trains first on combined real and synthetic data and then fine-tunes on real-only data at a lower learning rate, significantly increased mAP@0.5 and reduced the out-of-distribution gap for gloved hands relative to a real-only baseline. A three-stage schedule achieved the highest mAP@0.5:0.95, indicating that it best preserved box tightness. The findings show that the utility of synthetic data for safety-critical hand detection depends strongly on the training procedure: multi-stage schedules yielded substantial real-deployment benefits from inpainted accessory data.
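The evaluation described above reports each regime over three seeds and compares the standard test set against the gloves-only split. A minimal sketch of that bookkeeping follows. The function names are hypothetical, and defining the out-of-distribution gap as the per-seed difference "standard mAP minus gloves-only mAP" is an assumption made here for illustration; the summary does not state the study's exact formula.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def summarize(scores: Sequence[float]) -> Dict[str, float]:
    """Mean and sample standard deviation of one metric across seeds."""
    return {
        "mean": mean(scores),
        # stdev needs at least two values; report 0.0 for a single seed.
        "std": stdev(scores) if len(scores) > 1 else 0.0,
    }


def ood_gap(standard: Sequence[float], gloves: Sequence[float]) -> Dict[str, float]:
    """Summarize the per-seed gap between the two test splits.

    ASSUMPTION: gap = mAP on the standard real test set minus mAP on the
    real-gloves-only split, paired by seed. A smaller gap means the detector
    loses less accuracy on gloved hands.
    """
    if len(standard) != len(gloves):
        raise ValueError("need one score per seed on both splits")
    return summarize([s - g for s, g in zip(standard, gloves)])
```

Pairing scores by seed before averaging keeps seed-to-seed variation from being mistaken for a difference between splits.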

Key takeaway

Computer vision engineers building hand detection systems for safety-critical environments should consider integrating generatively inpainted synthetic data into multi-stage training pipelines. In the study, multi-stage schedules improved robustness to real-world variations such as gloves and raised detection accuracy (mAP@0.5 and mAP@0.5:0.95). A two-stage process is a reasonable starting point: train first on combined real and synthetic data, then fine-tune on real-only data at a lower learning rate to narrow the out-of-distribution gap.

Key insights

Generative inpainting with multi-stage training improves hand detection performance and robustness to appearance variations.

Method

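YOLOv8n hand detectors were trained under six training-and-scheduling regimes, each with three random seeds, including a real-only baseline. The multi-stage regimes first train on combined real and generatively inpainted synthetic data, then fine-tune on real-only data at a lower learning rate.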
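The two-stage schedule can be sketched as follows, assuming the Ultralytics `YOLO` training interface. The dataset file names, epoch counts, learning rates, and the `Stage`, `two_stage_schedule`, and `run_schedule` names are all hypothetical placeholders, not values or code from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str
    data_yaml: str  # hypothetical dataset config path
    epochs: int     # assumed value, not taken from the study
    lr0: float      # initial learning rate for this stage


def two_stage_schedule(base_lr: float = 0.01, lr_drop: float = 0.1) -> List[Stage]:
    """Stage 1: real + synthetic data. Stage 2: real-only at a lower learning rate."""
    return [
        Stage("mixed", "real_plus_synth.yaml", epochs=50, lr0=base_lr),
        Stage("real_finetune", "real_only.yaml", epochs=20, lr0=base_lr * lr_drop),
    ]


def run_schedule(stages: List[Stage], seed: int, weights: str = "yolov8n.pt") -> str:
    """Train the stages in order, starting each one from the previous best weights."""
    # Imported lazily so the schedule can be built and inspected without the library.
    from ultralytics import YOLO

    for stage in stages:
        model = YOLO(weights)
        model.train(
            data=stage.data_yaml,
            epochs=stage.epochs,
            lr0=stage.lr0,
            # Assumption: with the default optimizer="auto", Ultralytics may pick
            # its own learning rate, so the optimizer is fixed explicitly here.
            optimizer="SGD",
            seed=seed,
            name=f"{stage.name}_seed{seed}",
        )
        # Assumption: the trainer exposes the best checkpoint path as `best`.
        weights = str(model.trainer.best)
    return weights
```

Calling `run_schedule(two_stage_schedule(), seed=s)` once per seed would mirror the three-seed protocol. A three-stage variant would append one more `Stage` to the list, though its contents are not specified in this summary.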

Best for: Research Scientist, Machine Learning Engineer, Computer Vision Engineer, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.