Lung-SRAD: Spectral-Aware Regularized Audio DASS with Dual-Axis Patch-Mix Contrastive Learning for Respiratory Sound Classification

Source: Artificial Intelligence · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics, AI in Healthcare) · Depth: Expert, quick

Summary

Lung-SRAD is an approach to Respiratory Sound Classification (RSC) that uses State Space Models (SSMs) as an alternative to CLS-token-driven self-attention architectures such as the Audio Spectrogram Transformer (AST). AST models tend to behave like low-pass filters, which reduces their sensitivity to localized abnormal respiratory patterns. SSMs, by contrast, preserve more of the mid-to-high spatial-frequency components in their intermediate representations. On top of the SSM backbone, the method adds spectral-aware layer regularization via Gaussian convolution and proposes Dual-Axis Patch-Mix contrastive learning designed specifically for SSM-based audio models. The combined strategy achieved a 64.48% score on the ICBHI benchmark, surpassing the AST baseline by 5%. Code is publicly available.
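One way to check the low-pass claim on your own model is to measure how much spectral energy an intermediate feature map keeps above a cutoff frequency. The sketch below is an illustration and not the paper's analysis code. It assumes that one channel of a layer's tokens has been reshaped into a 2D grid (frequency patches by time patches); the function name `high_freq_energy_ratio` and the default `cutoff` are hypothetical choices.

```python
import numpy as np


def high_freq_energy_ratio(feature_map: np.ndarray, cutoff: float = 0.25) -> float:
    """Fraction of non-DC spectral energy above `cutoff`.

    `cutoff` is a radial frequency in cycles per patch (0.5 is the
    per-axis Nyquist limit). Assumption: `feature_map` is a 2D grid of
    one channel of patch tokens, shaped (freq_patches, time_patches).
    """
    x = np.asarray(feature_map, dtype=float)
    x = x - x.mean()  # drop the DC component so it does not dominate
    power = np.abs(np.fft.fft2(x)) ** 2

    # Radial frequency of every FFT bin.
    fy = np.fft.fftfreq(x.shape[0])[:, None]
    fx = np.fft.fftfreq(x.shape[1])[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)

    total = power.sum()
    if total == 0.0:  # constant map: no energy at any frequency
        return 0.0
    return float(power[radius > cutoff].sum() / total)
```

Comparing this ratio layer by layer for an AST and an SSM backbone on the same input is one way to see whether a given model is discarding the higher-frequency content that the summary describes.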

Key takeaway

For Machine Learning Engineers developing respiratory sound classification systems, Lung-SRAD offers an alternative to AST models. If your current models miss localized abnormal patterns because of low-pass filtering, consider integrating State Space Models and spectral-aware regularization. The approach scored 64.48% on ICBHI, which suggests that better preservation of mid-to-high frequency detail can improve classification performance. The provided code is a starting point for adapting these techniques.

Key insights

Lung-SRAD uses State Space Models and spectral-aware regularization to improve respiratory sound classification by preserving high-frequency details.
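The summary describes the spectral-aware regularization as using Gaussian convolution. A Gaussian convolution is a low-pass filter, so subtracting its output from the input isolates the higher-frequency residual. The sketch below shows only that band split. How the paper combines the filtered signal with the training objective is not stated here, and `split_bands`, `gaussian_kernel1d`, and the `sigma` default are hypothetical names and values.

```python
import numpy as np


def gaussian_kernel1d(sigma: float, radius: int) -> np.ndarray:
    """Normalized 1D Gaussian kernel with 2 * radius + 1 taps."""
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    return kernel / kernel.sum()


def split_bands(feature_map: np.ndarray, sigma: float = 1.0):
    """Split a 2D feature map into low- and high-frequency parts.

    Assumption: `feature_map` is one channel of a layer's patch grid.
    Returns (low, high) with low + high equal to the input.
    """
    x = np.asarray(feature_map, dtype=float)
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel1d(sigma, radius)

    # Reflect-pad so the output keeps the input shape.
    padded = np.pad(x, radius, mode="reflect")

    # The 2D Gaussian is separable: filter rows, then columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    low = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )
    high = x - low
    return low, high
```

A larger `sigma` widens the kernel and moves more of the signal into the `high` residual, so it acts as the knob for where the split between the two bands falls.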

Method

Lung-SRAD employs Distilled Audio State Space models, applies Gaussian convolution for spectral-aware layer regularization, and integrates Dual-Axis Patch-Mix contrastive learning for robust representation.
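The summary does not define Dual-Axis Patch-Mix in detail. One plausible reading, sketched below purely as an assumption, is that patches from a second sample replace whole bands of the patch grid along both the time axis and the frequency axis. The function name, the `time_ratio` and `freq_ratio` parameters, and the (frequency, time, embedding) layout are hypothetical and may differ from the released code.

```python
import numpy as np


def dual_axis_patch_mix(a, b, time_ratio=0.25, freq_ratio=0.25, rng=None):
    """Mix patch grid `b` into `a` along time columns and frequency rows.

    Assumption: `a` and `b` are patch embeddings shaped
    (freq_patches, time_patches, embed_dim).
    Returns (mixed, mask, lam): `mask` marks patches taken from `b`
    and `lam` is the fraction of the grid that came from `b`.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError("a and b must have the same shape")
    rng = np.random.default_rng() if rng is None else rng

    n_freq, n_time = a.shape[:2]
    n_time_mix = int(round(time_ratio * n_time))
    n_freq_mix = int(round(freq_ratio * n_freq))

    # Choose whole columns (time axis) and whole rows (frequency axis).
    time_idx = rng.choice(n_time, size=n_time_mix, replace=False)
    freq_idx = rng.choice(n_freq, size=n_freq_mix, replace=False)

    mask = np.zeros((n_freq, n_time), dtype=bool)
    mask[:, time_idx] = True
    mask[freq_idx, :] = True

    # Take patches from b where the mask is set, from a elsewhere.
    mixed = np.where(mask[..., None], b, a)
    lam = float(mask.mean())
    return mixed, mask, lam
```

In mixing-based contrastive setups, a ratio such as `lam` is typically used to weight the targets of the two source samples. The exact loss used by Lung-SRAD is not reproduced here.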

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.