Sample Selection Using Multi-Task Autoencoders in Federated Learning with Non-IID Data

2026-04-30 · Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

The paper proposes a sample selection method for federated learning (FL) with non-IID data that uses a multi-task autoencoder to estimate each sample's contribution from its loss and its features. Unsupervised outlier detectors, namely one-class support vector machine (OCSVM), isolation forest (IF), and adaptive loss threshold (AT), filter noisy samples on client devices under the management of a central server. A multi-class deep support vector data description (SVDD) loss, also centrally controlled, is introduced to refine feature-based sample selection. The methods are validated on CIFAR10 and MNIST under varying client numbers, non-IID distributions, and noise levels up to 40%. Loss-based selection improved accuracy by up to 7.02% on CIFAR10 with OCSVM and by up to 1.83% on MNIST with AT, while the federated SVDD loss improved feature-based selection by a further 0.99% at most on CIFAR10 with OCSVM.

Key takeaway

For research scientists developing federated learning systems, pairing a multi-task autoencoder with an outlier detector such as OCSVM or AT can improve model accuracy under non-IID data and noise levels up to 40%. The proposed federated SVDD loss is worth considering as a further refinement of feature-based sample selection, although its reported additional gain is smaller (up to 0.99% on CIFAR10 with OCSVM).

Key insights

Multi-task autoencoders and outlier detection enhance federated learning performance by filtering noisy, redundant, or malicious samples.

Principles

Sample contribution can be estimated via loss and feature analysis.
Unsupervised outlier detection improves FL robustness.
Centralized control can manage client-side sample filtering.
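
As an illustration of the first principle, the sketch below keeps the samples whose loss falls under a threshold recomputed from the current set of per-sample losses. The rule used here (mean plus k standard deviations), the function name adaptive_loss_mask, and the parameter k are assumptions made for this example; the summary does not state how the paper's adaptive threshold is computed.

```python
from statistics import mean, pstdev


def adaptive_loss_mask(losses, k=1.0):
    """Return one boolean per sample: True means keep, False means drop.

    Assumed rule (not taken from the paper): keep a sample when its loss
    is at most mean(losses) + k * std(losses). The threshold adapts
    because it is recomputed from whatever losses are passed in.
    """
    if not losses:
        return []
    threshold = mean(losses) + k * pstdev(losses)
    return [loss <= threshold for loss in losses]


if __name__ == "__main__":
    # Hypothetical per-sample reconstruction losses on one client.
    client_losses = [0.10, 0.12, 0.09, 0.11, 0.10, 2.50]
    print(adaptive_loss_mask(client_losses))
```

In a federated setting the server could, for example, broadcast k to all clients so that filtering stays on-device; that division of work is also an assumption here.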

Method

A multi-task autoencoder estimates sample contributions. Unsupervised outlier detection (OCSVM, IF, AT) filters noisy samples on the clients. A federated SVDD loss refines feature-based selection. Both the filtering and the SVDD loss are managed by the central server.
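
To make the SVDD idea concrete, here is a minimal sketch of a multi-class SVDD-style objective: the mean squared distance between each sample's feature vector and the center of its class. It assumes one center per class, computed as the mean of that class's features; the function names and this way of obtaining centers are illustrative and may differ from the paper's formulation.

```python
def class_centers(features, labels):
    """Compute one center per class as the mean feature vector (assumption)."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], feat)]
        counts[label] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def svdd_distances(features, labels, centers):
    """Squared Euclidean distance of each feature vector to its class center."""
    return [
        sum((v - c) ** 2 for v, c in zip(feat, centers[label]))
        for feat, label in zip(features, labels)
    ]


def multiclass_svdd_loss(features, labels, centers):
    """Mean squared distance to the class centers (SVDD-style objective)."""
    distances = svdd_distances(features, labels, centers)
    return sum(distances) / len(distances)


if __name__ == "__main__":
    # Hypothetical 2-D encoder features for two classes.
    feats = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]]
    labels = [0, 0, 1, 1]
    centers = class_centers(feats, labels)
    print(multiclass_svdd_loss(feats, labels, centers))
```

Minimizing such a loss pulls same-class features toward a common center, so the per-sample distances from svdd_distances can then serve as the input to a feature-based outlier detector.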

In practice

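Apply OCSVM to loss-based selection on CIFAR10 (reported gain: up to 7.02%).
Use AT for loss-based selection on MNIST (reported gain: up to 1.83%).
Add the federated SVDD loss to refine feature-based selection (reported further gain: up to 0.99% on CIFAR10 with OCSVM).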
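
One way to experiment with the loss-based variants is to run scikit-learn's OneClassSVM or IsolationForest on per-sample autoencoder losses. This is a minimal sketch rather than the paper's implementation: the helper select_by_loss, the expected_noise parameter, and the choice to fit the detectors on one-dimensional loss values are all assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM


def select_by_loss(losses, method="ocsvm", expected_noise=0.1, seed=0):
    """Return a boolean mask over samples: True means keep.

    losses: per-sample loss values computed on one client.
    expected_noise: assumed fraction of noisy samples, passed as `nu`
    to OneClassSVM or as `contamination` to IsolationForest.
    """
    x = np.asarray(losses, dtype=float).reshape(-1, 1)
    if method == "ocsvm":
        detector = OneClassSVM(kernel="rbf", gamma="scale", nu=expected_noise)
    elif method == "if":
        detector = IsolationForest(contamination=expected_noise, random_state=seed)
    else:
        raise ValueError(f"unknown method: {method!r}")
    # Both detectors label inliers as +1 and outliers as -1.
    return detector.fit_predict(x) == 1


if __name__ == "__main__":
    # Synthetic losses: mostly low values plus a few high ones (hypothetical).
    rng = np.random.default_rng(0)
    losses = np.concatenate([rng.normal(0.1, 0.02, 95), rng.normal(2.0, 0.1, 5)])
    for name in ("ocsvm", "if"):
        keep = select_by_loss(losses, method=name)
        print(name, "kept", int(keep.sum()), "of", keep.size)
```

The detector settings (nu, contamination, kernel) are placeholders; suitable values depend on the actual noise level and should be tuned per dataset.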

Topics

Federated Learning
Sample Selection
Multi-Task Autoencoders
Non-IID Data
Outlier Detection

Code references

eardic/FL_DPQS

Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.