Privacy-Preserving Clothing Classification using Vision Transformer for Thermal Comfort Estimation

2026-04-30 · Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Engineering & Applied Sciences · Depth: Advanced, medium

Summary

A privacy-preserving clothing classification scheme is introduced for secure occupant-centric control (OCC) systems, specifically for thermal comfort estimation in HVAC. The method combines a Vision Transformer (ViT) model, such as ViT-S/16, with key-based image encryption and model transformation. Unlike conventional pixel-based encryption methods, which cause severe accuracy degradation, the proposed ViT-based scheme classifies encrypted images with no loss of accuracy relative to plain images. In experiments on the DeepFashion dataset, categorized into four clothing insulation levels, it reached an average accuracy of 95.65% on encrypted images. The conventional pixel-based method with ResNet-18, by contrast, dropped from 93.51% to 83.34%, a loss of 10.17 percentage points. With 22.0M parameters, ViT-S/16 is also suitable for edge implementation.

Key takeaway

Research scientists developing privacy-preserving computer vision systems should consider a Vision Transformer (ViT) combined with key-based image encryption as a robust alternative to conventional pixel-based encryption methods. The ViT-based scheme retains its accuracy on encrypted images (95.65%), whereas the pixel-based method degrades significantly. This makes the ViT-based scheme well suited to applications such as secure HVAC thermal comfort estimation, where data privacy is critical. ViT-S/16 is worth exploring for edge deployments because of its modest parameter count (22.0M).

Key insights

The ViT-based scheme maintains high accuracy when classifying encrypted images, unlike conventional pixel-based methods.

Principles

ViT's self-attention tolerates spatial permutations.
Privacy-preserving methods often trade accuracy for security.
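
The first principle can be checked numerically. The following is a minimal sketch, assuming a single attention head with random weights and no position embedding (a full ViT adds position embeddings, which this sketch leaves out). All names and sizes are illustrative and not taken from the paper. It shows that reordering the input tokens only reorders the outputs in the same way.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention without position information."""
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    return matmul([softmax(r) for r in scores], v)


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_tokens, dim = 6, 4  # illustrative sizes only
tokens = random_matrix(rng, n_tokens, dim)
wq, wk, wv = (random_matrix(rng, dim, dim) for _ in range(3))

# A key-seeded permutation of the token (patch) order.
perm = list(range(n_tokens))
random.Random(42).shuffle(perm)

out_plain = self_attention(tokens, wq, wk, wv)
out_scrambled = self_attention([tokens[i] for i in perm], wq, wk, wv)

# Output i of the scrambled run should match output perm[i] of the plain run.
max_gap = max(
    abs(a - b)
    for i, src in enumerate(perm)
    for a, b in zip(out_scrambled[i], out_plain[src])
)
print(f"largest difference after undoing the permutation: {max_gap:.2e}")
```

The remaining difference is floating-point rounding only, because the softmax sums are accumulated in a different order.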

Method

Fine-tune a ViT, encrypt its parameters with a secret key, and upload the encrypted model to the cloud. The user encrypts each image with the shared key and sends it to the cloud for inference.
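
One way to see why a key-transformed model can read key-encrypted images is to look at a linear patch embedding. The sketch below is an assumption about how such a transformation could work, not the paper's implementation: it shuffles the pixels of one flattened 16x16 RGB patch with a key-derived permutation and applies the same permutation to the columns of a random embedding matrix. `SECRET_KEY`, `EMBED_DIM`, and all function names are hypothetical.

```python
import random


def key_permutation(key, length):
    """Derive a deterministic permutation of range(length) from a key."""
    perm = list(range(length))
    random.Random(key).shuffle(perm)
    return perm


def shuffle_pixels(patch, perm):
    """Encrypt one flattened patch by reordering its values."""
    return [patch[i] for i in perm]


def transform_embedding(weights, perm):
    """Reorder the embedding columns with the same permutation as the pixels."""
    return [[row[i] for i in perm] for row in weights]


def embed(weights, patch):
    """Linear patch embedding: one dot product per output dimension."""
    return [sum(w * x for w, x in zip(row, patch)) for row in weights]


SECRET_KEY = 1234          # hypothetical shared key
PATCH_LEN = 16 * 16 * 3    # one flattened 16x16 patch, assuming 3 colour channels
EMBED_DIM = 8              # kept small for the demo

rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(PATCH_LEN)] for _ in range(EMBED_DIM)]
patch = [rng.randrange(256) for _ in range(PATCH_LEN)]

perm = key_permutation(SECRET_KEY, PATCH_LEN)
encrypted_patch = shuffle_pixels(patch, perm)

plain_out = embed(weights, patch)
secure_out = embed(transform_embedding(weights, perm), encrypted_patch)
mismatch_out = embed(weights, encrypted_patch)  # encrypted input, untransformed model

max_gap = max(abs(a - b) for a, b in zip(plain_out, secure_out))
print(f"transformed model vs plain model: max difference {max_gap:.2e}")
print("untransformed model agrees:", mismatch_out == plain_out)
```

In this sketch, the transformed embedding yields the same output on the encrypted patch as the original embedding does on the plain patch, up to rounding, while the untransformed embedding does not.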

In practice

Use ViT for privacy-preserving image tasks.
Encrypt images via pixel shuffling and block scrambling.
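
The two encryption steps named above can be sketched as follows. This is a minimal illustration, assuming a single-channel image stored as a list of rows, a block size that divides both dimensions, and one pixel permutation shared by all blocks. The function names and the key format are hypothetical. `decrypt` is included only to confirm that the transform is a pure, reversible reordering.

```python
import random


def split_blocks(image, block):
    """Cut an H x W image into flattened block x block tiles, in row-major order."""
    height, width = len(image), len(image[0])
    return [
        [image[r + dr][c + dc] for dr in range(block) for dc in range(block)]
        for r in range(0, height, block)
        for c in range(0, width, block)
    ]


def merge_blocks(tiles, height, width, block):
    """Inverse of split_blocks: place flattened tiles back into an image."""
    image = [[0] * width for _ in range(height)]
    per_row = width // block
    for idx, tile in enumerate(tiles):
        r0, c0 = (idx // per_row) * block, (idx % per_row) * block
        for k, value in enumerate(tile):
            image[r0 + k // block][c0 + k % block] = value
    return image


def key_permutation(key, length, stream):
    """Derive one permutation per purpose ('blocks' or 'pixels') from the key."""
    perm = list(range(length))
    random.Random(f"{key}:{stream}").shuffle(perm)
    return perm


def encrypt(image, key, block=4):
    height, width = len(image), len(image[0])
    tiles = split_blocks(image, block)
    block_perm = key_permutation(key, len(tiles), "blocks")
    pixel_perm = key_permutation(key, block * block, "pixels")
    scrambled = [tiles[i] for i in block_perm]                    # block scrambling
    shuffled = [[t[i] for i in pixel_perm] for t in scrambled]    # pixel shuffling
    return merge_blocks(shuffled, height, width, block)


def decrypt(image, key, block=4):
    height, width = len(image), len(image[0])
    tiles = split_blocks(image, block)
    block_perm = key_permutation(key, len(tiles), "blocks")
    pixel_perm = key_permutation(key, block * block, "pixels")
    restored = [None] * len(tiles)
    for dst, src in enumerate(block_perm):
        plain = [0] * len(pixel_perm)
        for p_dst, p_src in enumerate(pixel_perm):
            plain[p_src] = tiles[dst][p_dst]   # undo pixel shuffling
        restored[src] = plain                  # undo block scrambling
    return merge_blocks(restored, height, width, block)


image = [[r * 8 + c for c in range(8)] for r in range(8)]  # toy 8x8 image
encrypted = encrypt(image, key="demo-key")
print("round trip ok:", decrypt(encrypted, key="demo-key") == image)
```

Both steps only move pixel values around, so the encrypted image contains exactly the same values as the original in a key-dependent order.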

Topics

Privacy-Preserving Classification
Vision Transformer
Thermal Comfort Estimation
HVAC Control Systems
Clothing Insulation

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.