Classifying by Proxy: Explainable and Reproducible Ensemble of Proxy Tasks for Child Sexual Abuse Imagery Classification

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital: Artificial Intelligence & Machine Learning, Computer Vision & Pattern Recognition, Cybersecurity & Data Privacy · Depth: Advanced, quick

Summary

Child Sexual Abuse Imagery (CSAI) classification systems are crucial for law enforcement and content removal, but they are hard to develop. The data is highly sensitive, access is tightly restricted, and models are rarely explainable, so studies are hard to reproduce, distribute, compare, or validate. This work applies an ensemble of Proxy Tasks (tasks correlated with CSAI classification) directly to real CSAI for the first time. With a new selection of relevant Proxy Tasks and adapted training, the method improves reproducibility, explainability, and security for system distribution. The final model reaches 91.9% balanced accuracy on the RCPD dataset. The ensemble also outperforms the DINO representation learning model, described as best-in-class, and provides classification explanations, which single deep learning models often lack.
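The 91.9% figure is a balanced accuracy, which is the mean of the recall on each class and is therefore less flattering than plain accuracy when one class is rare. The sketch below shows the arithmetic on made-up binary labels; the function name and the toy labels are illustrative assumptions and have no connection to RCPD or to the paper's evaluation code.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return (sensitivity + specificity) / 2


# Toy, imbalanced labels (hypothetical): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
```

On these toy labels, plain accuracy is 0.9 while balanced accuracy is 0.75, because half of the positive class is missed.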

Key takeaway

AI scientists building sensitive-content classification systems, particularly for law enforcement, should prioritize explainable and reproducible methods. This research shows that an ensemble of Proxy Tasks can reach high accuracy (91.9% balanced accuracy on RCPD) while providing classification explanations, an advantage over single deep learning models such as DINO. Consider integrating proxy task ensembles into your development pipeline to work around data access restrictions and meet operational demands for transparency and validation.

Key insights

An ensemble of Proxy Tasks enhances CSAI classification with improved explainability, reproducibility, and security.

Principles

Method

The method selects relevant Proxy Tasks from the CSAI literature, adapts the original Proxy Task framework and its training, and ensembles the resulting task models for classification on real CSAI data.
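As a rough illustration of why an ensemble of this kind can be explainable, the sketch below combines per-task scores with a weighted logistic function and returns each task's contribution alongside the decision. Everything in it is an assumption made for illustration: the summary does not name the Proxy Tasks or describe the combiner, so the task names (`task_a`, `task_b`, `task_c`), the weights, the bias, and the logistic combination are hypothetical and are not the paper's method.

```python
import math
from dataclasses import dataclass


@dataclass
class ProxyScore:
    task: str     # hypothetical proxy task name
    score: float  # output of that task's model, assumed to lie in [0, 1]


def ensemble_predict(scores, weights, bias=0.0, threshold=0.5):
    """Combine proxy task scores; return (decision, probability, contributions)."""
    # Each task's additive contribution to the logit doubles as an explanation.
    contributions = {s.task: weights[s.task] * s.score for s in scores}
    logit = bias + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability >= threshold, probability, contributions


# Hypothetical scores and weights, chosen only to make the example concrete.
scores = [
    ProxyScore("task_a", 0.9),
    ProxyScore("task_b", 0.2),
    ProxyScore("task_c", 0.7),
]
weights = {"task_a": 2.0, "task_b": 1.0, "task_c": 1.5}

flagged, probability, contributions = ensemble_predict(scores, weights, bias=-2.0)
top_task = max(contributions, key=contributions.get)
```

Because the final decision depends only on the proxy task outputs, a combiner like this can be inspected and shared without reference to the sensitive images themselves, which is in the spirit of the reproducibility and distribution benefits described above.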

Best for: Research Scientist, AI Scientist, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.