Small Model, Big Results: How a 5MB On-Device NSFW Detector Outperforms Cloud APIs and Passes a…
Summary
Punge, a 5.1 MB on-device NSFW image detector built on a custom-trained YOLO nano model (YOLO26n), outperforms significantly larger cloud-based and open-source models in accuracy, efficiency, and demographic fairness. In suggestive content classification, Punge achieves a 1.6% false positive rate (FPR) at a 0.70 threshold, compared with 16.2% for Google Cloud Vision SafeSearch, a gap of 14.6 percentage points. Against Falcons.ai's Vision Transformer, Punge shows a 3% misclassification rate versus 38%. A fairness audit reproducing the Leu, Nakashima & Garcia (FAccT 2024) methodology found Punge's gender false positive disparity ratio at 1.23x, near the low end of the 1.0x to 6.4x range reported for models in the original study, and its skin tone ratio at 0.89x, close to parity (1.0x). This performance is attributed to the architectural choice of detecting anatomical shapes rather than classifying whole images, which also enables on-device processing without server uploads.
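The fairness figures above are ratios of false positive rates between demographic groups. A minimal sketch of that arithmetic follows, assuming the ratio is one group's FPR divided by the other's; the summary does not say which group is the numerator, and the function names and counts below are hypothetical illustrations, not data from the audit.

```python
def false_positive_rate(false_positives: int, benign_total: int) -> float:
    """Share of benign images that were wrongly flagged."""
    if benign_total <= 0:
        raise ValueError("benign_total must be positive")
    return false_positives / benign_total


def disparity_ratio(fpr_group_a: float, fpr_group_b: float) -> float:
    """Ratio of two groups' false positive rates; 1.0 means parity.

    Assumption: group A is the numerator. The audit's own convention
    is not stated in this summary.
    """
    if fpr_group_b == 0:
        raise ValueError("fpr_group_b must be non-zero")
    return fpr_group_a / fpr_group_b


# Made-up counts, for illustration only.
fpr_a = false_positive_rate(false_positives=8, benign_total=400)
fpr_b = false_positive_rate(false_positives=10, benign_total=400)
ratio = disparity_ratio(fpr_a, fpr_b)

# The percentage-point gap quoted in the summary is a plain difference.
gap_points = round(16.2 - 1.6, 1)
```

A ratio below 1.0 means the first group is flagged less often than the second; values far above or below 1.0 indicate a larger disparity.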
Key takeaway
AI engineers building content moderation systems, especially for mobile or privacy-sensitive applications, should evaluate object detection architectures such as YOLO for NSFW detection. In the reported results, this approach is more accurate on suggestive content and shows much less demographic bias than whole-image classifiers, while enabling efficient on-device processing and user control over sensitivity thresholds. Prioritize localized anatomical detection to avoid the biases associated with global image representations.
Key insights
Small, on-device object detection models can surpass large cloud APIs in NSFW accuracy and fairness by focusing on anatomical shapes.
Principles
- Localized detection improves accuracy and reduces bias.
- Small, task-specific models can outperform much larger general-purpose ones.
- On-device processing enhances privacy and control.
Method
Punge uses a custom-trained YOLO nano model (YOLO26n) to detect explicit anatomical regions via bounding boxes, flagging an image if any detection crosses a confidence threshold, rather than classifying the entire image.
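The flagging rule described above can be sketched as follows. This is an illustrative sketch, not Punge's implementation: the `Detection` type, the label strings, and the `is_flagged` helper are hypothetical, and the detector that produces the boxes is assumed to exist elsewhere. The 0.70 default mirrors the threshold quoted in the summary.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One bounding box from the detector (hypothetical structure)."""
    label: str                               # class name of the detected region
    confidence: float                        # detector score in [0, 1]
    box: Tuple[float, float, float, float]   # x1, y1, x2, y2 in pixels


def is_flagged(detections: List[Detection], threshold: float = 0.70) -> bool:
    """Flag the image if any single detection meets the threshold.

    No whole-image score is computed; an image with no detections
    is never flagged.
    """
    return any(d.confidence >= threshold for d in detections)


# Hypothetical detections for one image.
detections = [
    Detection("region_a", 0.42, (0.0, 0.0, 10.0, 10.0)),
    Detection("region_b", 0.81, (5.0, 5.0, 20.0, 20.0)),
]
flagged_default = is_flagged(detections)        # uses the 0.70 threshold
flagged_strict = is_flagged(detections, 0.90)   # a stricter threshold
```

Because the decision depends only on per-box scores, the threshold can be exposed as a user-facing sensitivity setting without retraining the model.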
In practice
- Tune YOLO confidence thresholds to trade sensitivity against precision.
- Consider object detection for privacy-sensitive content.
- Prioritize task-specific training over general large models.
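Threshold tuning can be explored with a simple sweep over a labeled validation set. The sketch below assumes each image has already been reduced to its highest detection confidence; the `sweep_thresholds` helper and the sample scores are hypothetical and only illustrate how false positive rate and recall move as the threshold changes.

```python
from typing import List, Sequence, Tuple


def sweep_thresholds(
    max_confidences: Sequence[float],
    is_explicit: Sequence[bool],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, false_positive_rate, recall) for each threshold.

    max_confidences[i] is the top detection score for image i
    (0.0 if the detector found nothing); is_explicit[i] is its label.
    """
    negatives = sum(not y for y in is_explicit)
    positives = sum(bool(y) for y in is_explicit)
    results = []
    for t in thresholds:
        flagged = [c >= t for c in max_confidences]
        false_pos = sum(f and not y for f, y in zip(flagged, is_explicit))
        true_pos = sum(f and y for f, y in zip(flagged, is_explicit))
        fpr = false_pos / negatives if negatives else 0.0
        recall = true_pos / positives if positives else 0.0
        results.append((t, fpr, recall))
    return results


# Made-up validation scores and labels, for illustration only.
scores = [0.9, 0.75, 0.6, 0.2]
labels = [True, False, True, False]
table = sweep_thresholds(scores, labels, thresholds=[0.5, 0.7, 0.8])
for threshold, fpr, recall in table:
    print(f"threshold={threshold:.2f}  FPR={fpr:.2f}  recall={recall:.2f}")
```

Raising the threshold generally lowers the false positive rate at the cost of recall, so the operating point should be chosen on data that resembles the target application.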
Topics
- NSFW Detection
- On-Device AI
- YOLO Models
- Demographic Bias
- Fairness Auditing
Best for: AI Engineer, Computer Vision Engineer, AI Scientist, Machine Learning Engineer, AI Researcher, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by AI Advances - Medium.