Sample Selection Using Multi-Task Autoencoders in Federated Learning with Non-IID Data
Summary
A sample selection method for federated learning (FL) with non-IID data is proposed. It uses a multi-task autoencoder to assess each sample's contribution through loss and feature analysis. Unsupervised outlier detection techniques, namely one-class support vector machine (OCSVM), isolation forest (IF), and adaptive loss threshold (AT), filter noisy samples on client devices under the management of a central server. A multi-class deep support vector data description (SVDD) loss, also centrally controlled, is introduced to refine feature-based sample selection. The methods are validated on CIFAR10 and MNIST with varying client numbers, non-IID distributions, and noise levels up to 40%. Loss-based selection achieved accuracy gains of up to 7.02% on CIFAR10 with OCSVM and 1.83% on MNIST with AT, while the federated SVDD loss further improved feature-based selection by up to 0.99% on CIFAR10 with OCSVM.
Key takeaway
For research scientists developing federated learning systems, combining a multi-task autoencoder with outlier detection methods such as OCSVM or AT can improve model accuracy, especially with non-IID data and high noise levels. Consider adding the proposed federated SVDD loss to further refine feature-based sample selection, which may make models more robust in real-world deployments.
Key insights
Multi-task autoencoders and outlier detection enhance federated learning performance by filtering noisy, redundant, or malicious samples.
Principles
- Sample contribution can be estimated via loss and feature analysis.
- Unsupervised outlier detection improves FL robustness.
- Centralized control can manage client-side sample filtering.
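As a minimal sketch of loss-based filtering under central control, the following assumes an adaptive threshold of the form mean + k * std over a client's per-sample losses. The summary does not state how AT is actually computed, so this rule, the names `adaptive_threshold` and `select_samples`, the parameter `k`, and the loss values are all illustrative assumptions.

```python
from statistics import mean, pstdev


def adaptive_threshold(losses, k=1.0):
    """Hypothetical rule: threshold = mean + k * population std of the losses."""
    return mean(losses) + k * pstdev(losses)


def select_samples(losses, threshold):
    """Return indices of samples whose loss does not exceed the threshold."""
    return [i for i, loss in enumerate(losses) if loss <= threshold]


# Made-up per-sample losses on one client; two samples look noisy.
client_losses = [0.11, 0.09, 0.13, 0.10, 2.40, 0.12, 0.08, 1.95]

# Assumption: the server picks k and sends it to every client.
server_k = 0.5
threshold = adaptive_threshold(client_losses, k=server_k)
kept = select_samples(client_losses, threshold)

print(kept)  # indices of the samples kept for local training
```

Having the server choose `k` is one simple way to keep filtering behavior consistent across clients while the losses themselves stay on the device.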
Method
A multi-task autoencoder estimates sample contributions. Unsupervised outlier detection (OCSVM, IF, or AT) filters noisy samples on client devices. A federated SVDD loss refines feature-based selection. Both the filtering and the SVDD loss are managed by the central server.
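To make the multi-class SVDD idea concrete, here is a rough sketch that treats the loss as the mean squared distance between each sample's feature vector and the center of its class. It assumes the server builds shared class centers by count-weighted averaging of client statistics. The summary does not describe the actual aggregation, so the functions `class_centers`, `aggregate_centers`, and `svdd_loss` and the toy features are hypothetical.

```python
import numpy as np


def class_centers(features, labels, num_classes):
    """Per-class feature sums and sample counts on one client."""
    sums = np.zeros((num_classes, features.shape[1]))
    counts = np.zeros(num_classes)
    for c in range(num_classes):
        members = features[labels == c]
        sums[c] = members.sum(axis=0)
        counts[c] = len(members)
    return sums, counts


def aggregate_centers(client_stats):
    """Server side (assumed): count-weighted average of class centers."""
    total_sums = sum(s for s, _ in client_stats)
    total_counts = sum(n for _, n in client_stats)
    # Guard against classes that no client holds.
    return total_sums / np.maximum(total_counts, 1)[:, None]


def svdd_loss(features, labels, centers):
    """Mean squared distance of each feature vector to its class center."""
    diffs = features - centers[labels]
    return float(np.mean(np.sum(diffs ** 2, axis=1)))


# Two clients with non-IID labels: client A holds only class 0.
feat_a, lab_a = np.array([[0.0, 0.0], [2.0, 0.0]]), np.array([0, 0])
feat_b, lab_b = np.array([[4.0, 4.0], [4.0, 6.0], [1.0, 0.0]]), np.array([1, 1, 0])

stats = [class_centers(feat_a, lab_a, 2), class_centers(feat_b, lab_b, 2)]
centers = aggregate_centers(stats)

loss_a = svdd_loss(feat_a, lab_a, centers)
loss_b = svdd_loss(feat_b, lab_b, centers)
print(centers, loss_a, loss_b)
```

In this reading, the per-sample distances to the class center could serve as the feature-based scores that an outlier detector then filters.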
In practice
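- Apply OCSVM to loss-based selection on CIFAR10, where accuracy gains of up to 7.02% were reported.
- Use AT for loss-based selection on MNIST, where gains of up to 1.83% were reported.
- Integrate the federated SVDD loss for feature-based refinement (up to 0.99% further improvement on CIFAR10 with OCSVM).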
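The OCSVM and IF options listed above can be tried with scikit-learn's `OneClassSVM` and `IsolationForest`. This is a sketch, not the paper's configuration: the synthetic losses, the helper `inlier_mask`, and the settings `nu=0.2` and `contamination=0.2` are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM


def inlier_mask(scores, detector):
    """Fit an unsupervised detector on per-sample scores; True marks samples to keep."""
    X = np.asarray(scores, dtype=float).reshape(-1, 1)
    # scikit-learn outlier detectors return +1 for inliers and -1 for outliers.
    return detector.fit_predict(X) == 1


rng = np.random.default_rng(0)
# Synthetic losses: 80 "clean" samples with low loss, 20 "noisy" ones with high loss.
losses = np.concatenate([rng.normal(0.1, 0.02, 80), rng.normal(2.0, 0.3, 20)])

ocsvm_keep = inlier_mask(losses, OneClassSVM(kernel="rbf", nu=0.2, gamma="scale"))
if_keep = inlier_mask(losses, IsolationForest(contamination=0.2, random_state=0))

print("OCSVM keeps", int(ocsvm_keep.sum()), "of", len(losses))
print("IF keeps", int(if_keep.sum()), "of", len(losses))
```

Both detectors flag unusual values rather than specifically high ones, so it is worth checking which samples were removed before training on the filtered set.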
Topics
- Federated Learning
- Sample Selection
- Multi-Task Autoencoders
- Non-IID Data
- Outlier Detection
Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.