Parameter-Efficient Subspace Decoupling ViT for Mitigating Multi-Task Negative Transfer in Histological Scoring

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital, Artificial Intelligence & Machine Learning, Medical Specialties & Subspecialties · Depth: Expert, quick

Summary

A subspace-decoupled multi-task Vision Transformer (ViT) is proposed to automate histological scoring for Non-Alcoholic Fatty Liver Disease (NAFLD). It targets two problems in this setting: high annotation costs, and "negative transfer" among the strongly correlated NAFLD Activity Score (NAS) indicators (steatosis, ballooning, and inflammation), where training related tasks jointly degrades performance on one or more of them. The model combines lightweight task-specific Adapters with orthogonality-based constraints to construct an independent feature subspace for each indicator, which reduces task interference while retaining shared representations. The approach uses a curated multi-task mouse NAFLD histology dataset with expert annotations for all NAS components. Reported results show improved multi-task stability and generalization, at substantially lower computational cost than training separate single-task models.

Key takeaway

For Machine Learning Engineers building automated histological scoring systems for conditions like NAFLD, the practical lesson is to consider lightweight task-specific Adapters with orthogonality-based constraints when correlated indicators interfere with one another in a shared model. According to the reported results, this improves multi-task stability and generalization while requiring substantially less compute than training separate single-task models, which makes deployment more feasible.

Key insights

Subspace-decoupled ViT with orthogonal adapters mitigates negative transfer in multi-task histological scoring, improving efficiency.

Principles

Method

Integrate lightweight task-specific Adapters with orthogonality-based constraints into a multi-task Vision Transformer to construct independent feature subspaces.
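
The method statement above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the summary does not say where the adapters sit in the ViT or how the orthogonality constraint is written, so the residual bottleneck adapter, the pairwise Frobenius penalty, and all names here (TASKS, init_adapters, adapter_forward, orthogonality_penalty) are illustrative assumptions. In a real training setup the penalty would be added to the task losses inside an autodiff framework.

```python
import numpy as np

# Hypothetical task names, taken from the three NAS indicators in the summary.
TASKS = ("steatosis", "ballooning", "inflammation")

def init_adapters(dim, rank, rng):
    """Create one bottleneck adapter (down/up projection pair) per task."""
    return {
        t: {
            "down": rng.normal(scale=dim ** -0.5, size=(dim, rank)),
            # Zero-initialised up-projection: each adapter starts as the identity map.
            "up": np.zeros((rank, dim)),
        }
        for t in TASKS
    }

def adapter_forward(h, adapter):
    """Residual bottleneck adapter: h + ReLU(h @ down) @ up."""
    z = np.maximum(h @ adapter["down"], 0.0)
    return h + z @ adapter["up"]

def orthogonality_penalty(adapters):
    """Assumed constraint: sum over task pairs of ||D_i^T D_j||_F^2.

    The value is zero exactly when the column spaces of the down-projections
    of different tasks are mutually orthogonal.
    """
    names = list(adapters)
    total = 0.0
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            cross = adapters[names[i]]["down"].T @ adapters[names[j]]["down"]
            total += float(np.sum(cross ** 2))
    return total

rng = np.random.default_rng(0)
adapters = init_adapters(dim=64, rank=4, rng=rng)
tokens = rng.normal(size=(10, 64))  # stand-in for shared ViT token features
out = adapter_forward(tokens, adapters["steatosis"])
penalty = orthogonality_penalty(adapters)
```

Minimising a penalty of this kind alongside the per-task losses pushes each task's adapter toward its own directions in feature space while the backbone stays shared, which is one plausible reading of "independent feature subspaces".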

Best for: Computer Vision Engineer, AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.