A self-supervised electrocardiogram foundation model for empowering cardiovascular disease prediction and genetic factor discovery

· Source: Machine learning : nature.com subject feeds · Field: Health & Wellbeing — Artificial Intelligence & Machine Learning, Medical Specialties & Subspecialties, Medical Devices & Health Technology · Depth: Expert, short

Summary

Researchers developed ECG-LFM, a self-supervised Electrocardiogram Large-scale Foundation Model, to improve cardiovascular disease (CVD) prediction and genetic factor discovery. The model was pre-trained on over ten million 12-lead ECGs from multiple datasets, integrating contrastive learning with masked language modeling to capture both global and fine-grained ECG patterns. When fine-tuned for eight CVD types, ECG-LFM achieved an average AUROC of 0.930 across multiple datasets, outperforming existing methods. The model's derived features (EDFs) represent known CVD biomarkers, demonstrating high interpretability. Furthermore, applying EDFs in genome-wide association studies identified 24 significant single nucleotide polymorphisms (SNPs) associated with ECG, including 8 novel findings, with P-value < 5×10⁻⁸ and LD r² < 0.01. Mendelian randomization also indicated causal relationships between 2 CVDs and 4 EDFs.
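The two GWAS thresholds quoted above (genome-wide significance and low linkage disequilibrium between reported SNPs) can be illustrated with a greedy clumping pass. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name, the SNP ids, the P-values, and the pairwise LD table are all hypothetical, and only the two threshold values come from the summary.

```python
GENOME_WIDE_P = 5e-8  # significance threshold quoted in the summary
LD_R2_MAX = 0.01      # LD independence threshold quoted in the summary


def select_independent_snps(pvalues, ld_r2):
    """Greedy clumping (illustrative): keep significant SNPs not in LD with a stronger kept SNP.

    pvalues: dict mapping SNP id -> association P-value.
    ld_r2:   dict mapping frozenset({snp_a, snp_b}) -> pairwise r^2;
             pairs missing from the table are treated as r^2 = 0.
    """
    # Keep only genome-wide significant SNPs, strongest association first.
    significant = sorted(
        (snp for snp, p in pvalues.items() if p < GENOME_WIDE_P),
        key=lambda snp: pvalues[snp],
    )
    kept = []
    for snp in significant:
        # Accept the SNP only if it is independent of every lead SNP kept so far.
        if all(ld_r2.get(frozenset((snp, lead)), 0.0) < LD_R2_MAX for lead in kept):
            kept.append(snp)
    return kept


# Hypothetical toy data, not results from the paper.
toy_pvalues = {"snp_a": 1e-12, "snp_b": 3e-9, "snp_c": 2e-8, "snp_d": 1e-6}
toy_ld = {frozenset(("snp_a", "snp_b")): 0.6}
independent = select_independent_snps(toy_pvalues, toy_ld)
print(independent)
```

In this toy input, snp_d fails the significance threshold and snp_b is dropped because it is in strong LD with the more significant snp_a, leaving two independent lead SNPs.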

Key takeaway

For AI Scientists and Machine Learning Engineers developing diagnostic tools, ECG-LFM demonstrates a robust approach to improving CVD prediction and interpretability. Your teams should consider self-supervised foundation models pre-trained on large-scale medical data to enhance model performance and enable novel genetic insights. This method can lead to more accurate diagnoses and a deeper understanding of disease mechanisms.

Key insights

A self-supervised ECG foundation model improves CVD prediction and uncovers novel genetic associations.

Principles

Method

ECG-LFM uses contrastive learning and masked language modeling for self-supervised pre-training on millions of 12-lead ECGs, then fine-tunes for CVD prediction and applies derived features to GWAS and Mendelian randomization.
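The combination of a contrastive objective with a masked modeling objective can be sketched as a weighted sum of two losses. The summary does not give the paper's exact formulation, so everything below is an assumption for illustration: an InfoNCE-style contrastive term over paired embeddings, a mean-squared reconstruction error over masked samples standing in for the masked modeling term, and hypothetical names, temperature, and weighting.

```python
import math


def info_nce(view_a, view_b, temperature=0.1):
    """InfoNCE-style contrastive loss: row i of view_a should match row i of view_b."""
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature for cand in b]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]  # cross-entropy with the positive at index i
    return total / len(a)


def masked_mse(signal, reconstruction, mask):
    """Mean squared error computed only over masked samples (mask[i] is True)."""
    errors = [(s - r) ** 2 for s, r, m in zip(signal, reconstruction, mask) if m]
    return sum(errors) / len(errors) if errors else 0.0


def pretraining_loss(view_a, view_b, signal, reconstruction, mask, weight=1.0):
    """Hypothetical combined objective: contrastive term plus weighted masked term."""
    return info_nce(view_a, view_b) + weight * masked_mse(signal, reconstruction, mask)


# Toy embeddings for two augmented views of two recordings (hypothetical values).
emb_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(emb_a, [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce(emb_a, [[0.0, 1.0], [1.0, 0.0]])

# Toy signal with two masked samples and an imperfect reconstruction.
recon_error = masked_mse([0.0, 1.0, 0.5, -0.5], [0.0, 0.5, 0.5, 0.0],
                         [False, True, False, True])
```

The contrastive term is small when matching views align and large when they are mismatched, which is the sense in which it captures global, recording-level structure; the masked term only penalizes errors at hidden positions, which targets fine-grained waveform detail.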

Topics

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.