A self-supervised electrocardiogram foundation model for empowering cardiovascular disease prediction and genetic factor discovery

2026-04-27 · Source: Machine learning : nature.com subject feeds · Field: Health & Wellbeing — Artificial Intelligence & Machine Learning, Medical Specialties & Subspecialties, Medical Devices & Health Technology · Depth: Expert, short

Summary

Researchers developed ECG-LFM, a self-supervised Electrocardiogram Large-scale Foundation Model, to improve cardiovascular disease (CVD) prediction and genetic factor discovery. The model was pre-trained on over ten million 12-lead ECGs from multiple datasets, integrating contrastive learning with masked language modeling to capture both global and fine-grained ECG patterns. When fine-tuned for eight CVD types, ECG-LFM achieved an average AUROC of 0.930 across multiple datasets, outperforming existing methods. The model's derived features (EDFs) represent known CVD biomarkers, demonstrating high interpretability. Furthermore, applying EDFs in genome-wide association studies identified 24 significant single nucleotide polymorphisms (SNPs) associated with ECG, including 8 novel findings, with P-value < 5×10⁻⁸ and LD r² < 0.01. Mendelian randomization also indicated causal relationships between 2 CVDs and 4 EDFs.
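For readers who want to see what an "average AUROC" figure summarizes, here is a minimal sketch. It assumes an unweighted mean over per-task AUROCs, since the summary does not say how the average was taken. The function names and the task keys are hypothetical and not from the paper.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auroc(per_task):
    """Unweighted mean of per-task AUROCs (assumed averaging scheme).

    per_task maps a task name to a (labels, scores) pair.
    """
    values = [auroc(labels, scores) for labels, scores in per_task.values()]
    return sum(values) / len(values)


# Toy data with hypothetical task names, only to show the call shape.
example = {
    "task_a": ([0, 0, 1, 1], [0.1, 0.2, 0.7, 0.9]),
    "task_b": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
}
print(macro_auroc(example))
```

The pairwise form is quadratic in sample count, so it suits small illustrations; a rank-based implementation would be the usual choice on real cohorts.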

Key takeaway

For AI Scientists and Machine Learning Engineers developing diagnostic tools, ECG-LFM demonstrates a robust approach to improving CVD prediction and interpretability. Your teams should consider self-supervised foundation models pre-trained on large-scale medical data to enhance model performance and enable novel genetic insights. This method can lead to more accurate diagnoses and a deeper understanding of disease mechanisms.

Key insights

A self-supervised ECG foundation model improves CVD prediction and uncovers novel genetic associations.

Principles

Self-supervision enhances ECG representation.
Integrate global and fine-grained pattern capture.

Method

ECG-LFM uses contrastive learning and masked language modeling for self-supervised pre-training on millions of 12-lead ECGs, then fine-tunes for CVD prediction and applies derived features to GWAS and Mendelian randomization.
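The two pre-training signals described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes an InfoNCE-style contrastive term over paired embeddings, and it stands in a mean-squared reconstruction error on masked samples for the masked-modeling term. The function names, the temperature, and the weighting factor are all assumptions made for illustration.

```python
import math


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss over paired embeddings.

    Row i of positives is the positive for row i of anchors;
    every other row of positives serves as a negative.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def masked_mse(signal, reconstruction, mask):
    """Mean squared error on masked positions only (mask[i] is True where hidden)."""
    errors = [(s - r) ** 2 for s, r, m in zip(signal, reconstruction, mask) if m]
    return sum(errors) / len(errors) if errors else 0.0


def pretraining_loss(anchors, positives, signal, reconstruction, mask, weight=1.0):
    """Weighted sum of the global (contrastive) and local (masked) terms."""
    return info_nce(anchors, positives) + weight * masked_mse(signal, reconstruction, mask)
```

The contrastive term rewards embeddings that agree across two views of the same recording, which is one way to capture global structure, while the masked term scores how well hidden samples are recovered, which targets fine-grained waveform detail.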

In practice

Use foundation models for medical signal analysis.
Apply EDFs for genetic association studies.
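The thresholds quoted in the summary (P-value < 5×10⁻⁸ and LD r² < 0.01) suggest a filtering step that can be sketched as follows. The greedy, most-significant-first selection is an assumption about how such thresholds are commonly applied, not a description of the paper's pipeline. The function name, the SNP identifiers, and the pairwise LD lookup are hypothetical.

```python
def select_independent_hits(pvalues, ld_r2, p_threshold=5e-8, r2_threshold=0.01):
    """Keep genome-wide significant SNPs that are mutually independent.

    pvalues maps a SNP id to its association P-value for one feature.
    ld_r2 maps frozenset({snp_a, snp_b}) to pairwise LD r^2; missing
    pairs are treated as unlinked (r^2 = 0), which is an assumption.
    """
    # Step 1: genome-wide significance filter, most significant first.
    significant = sorted(
        (snp for snp, p in pvalues.items() if p < p_threshold),
        key=lambda snp: pvalues[snp],
    )
    # Step 2: greedily keep a SNP only if it is unlinked to every kept SNP.
    kept = []
    for snp in significant:
        if all(ld_r2.get(frozenset((snp, k)), 0.0) < r2_threshold for k in kept):
            kept.append(snp)
    return kept


# Toy inputs with made-up identifiers, only to show the call shape.
toy_pvalues = {"snp1": 1e-12, "snp2": 1e-9, "snp3": 1e-10, "snp4": 1e-3}
toy_ld = {frozenset(("snp1", "snp2")): 0.8}
print(select_independent_hits(toy_pvalues, toy_ld))
```

Here snp4 fails the significance filter and snp2 is dropped because it is in strong LD with the more significant snp1.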

Topics

ECG Foundation Model
Cardiovascular Disease Prediction
Self-supervised Learning
Genome-Wide Association Study
Genetic Factor Discovery

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.