CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining

Source: cs.LG updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics, Medical Devices & Health Technology) · Depth: Expert, extended

Summary

CGM-JEPA is a self-supervised predictive pretraining framework for learning consistent representations from Continuous Glucose Monitoring (CGM) data, aimed at detecting early metabolic subphenotypes such as insulin resistance (IR) and β-cell dysfunction. It addresses the challenges posed by multi-view physiological states and by baseline methods whose performance is inconsistent across modalities. Instead of reconstructing raw values, CGM-JEPA predicts masked latent representations, which encourages more abstract features. An extension, X-CGM-JEPA, adds a masked Glucodensity cross-view objective that contributes complementary distributional information.

The models were pretrained on ~389k unlabeled CGM readings from 228 subjects and evaluated on two clinical cohorts (Initial: N=27; Validation: N=17) under three evaluation regimes: cohort generalization, venous-to-CGM transfer, and CGM. X-CGM-JEPA ranked first or second on AUROC for both endpoints (IR and β-cell dysfunction) in all three regimes, outperforming the strongest baseline by up to +6.5 AUROC points in cohort generalization and +3.6 points in venous-to-CGM transfer (paired Wilcoxon, p<0.001). The cross-view design also reduced ethnicity AUROC gaps by 25–54% under transfer.
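The AUROC figures and the ethnicity-gap reduction quoted above can be made concrete with a small calculation. The sketch below is illustrative only: it assumes the gap is the difference between the best and worst per-group AUROC and that the 25–54% figure is a relative reduction of that gap, neither of which this summary spells out. The function names and any inputs are hypothetical, not the paper's evaluation code.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_gap(labels, scores, groups):
    """Return (max - min per-group AUROC, per-group AUROC dict).

    Assumption: "gap" means best-minus-worst subgroup AUROC.
    """
    per_group = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        per_group[g] = auroc([labels[i] for i in idx], [scores[i] for i in idx])
    return max(per_group.values()) - min(per_group.values()), per_group


def relative_gap_reduction(baseline_gap, model_gap):
    """Fraction by which the model shrinks the baseline's subgroup gap."""
    if baseline_gap <= 0:
        raise ValueError("baseline gap must be positive")
    return (baseline_gap - model_gap) / baseline_gap
```

Under these assumptions, a baseline gap of 0.10 AUROC that shrinks to 0.05 corresponds to a 50% reduction. With cohorts as small as N=27 and N=17, per-group AUROC is computed from very few pairs, so gaps of this kind are inherently noisy.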

Key takeaway

For machine learning engineers building diagnostic tools for metabolic health, CGM-JEPA and X-CGM-JEPA offer a way to learn transferable representations when labeled CGM data is scarce. Consider predictive self-supervised pretraining, and X-CGM-JEPA in particular, to improve model consistency and fairness across deployment scenarios and demographic subgroups and to limit performance drops under modality shift. The method provides a foundation for scalable, non-invasive metabolic risk stratification.

Key insights

Predictive self-supervision with cross-view objectives yields robust CGM representations for metabolic subphenotype prediction.
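To illustrate what a distributional view of CGM data looks like, the sketch below approximates a glucodensity with a normalized histogram: it records how much of the trace is spent at each glucose level and discards temporal order. This is a simplification for illustration; glucodensities in the literature are typically smoothed density estimates, and the 40–400 mg/dL range, the bin count, and the function name are assumptions rather than details from the paper.

```python
def glucodensity_histogram(readings, lo=40.0, hi=400.0, bins=36):
    """Crude glucodensity: normalized histogram of glucose values in mg/dL.

    Values outside [lo, hi] are clipped to the nearest edge bin.
    The returned densities integrate to 1 over [lo, hi].
    """
    if not readings:
        raise ValueError("readings must be non-empty")
    width = (hi - lo) / bins
    counts = [0] * bins
    for g in readings:
        clipped = min(max(g, lo), hi)
        # The top edge (clipped == hi) falls into the last bin.
        idx = min(int((clipped - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(readings)
    return [c / (total * width) for c in counts]
```

Two traces with the same values in a different order produce the same histogram, which is why a view like this complements a temporal patch encoder rather than duplicating it.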

Method

CGM-JEPA predicts masked latent CGM patch representations from visible context. X-CGM-JEPA adds an auxiliary objective predicting masked Glucodensity embeddings from the same CGM context, combining temporal and distributional views.
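The core idea, predicting latents of masked patches from visible ones, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's implementation: the encoder is a fixed linear map, the predictor is a simple average of context latents, and a single shared encoder stands in for whatever context and target encoders the authors use. All names and parameters are hypothetical, and no training step is shown.

```python
import random


def patchify(series, patch_len):
    """Split a 1-D glucose series into non-overlapping patches (remainder dropped)."""
    n = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n)]


def split_context_target(num_patches, mask_ratio, rng):
    """Choose masked (target) patch indices; the rest stay visible as context."""
    if num_patches < 2:
        raise ValueError("need at least two patches")
    num_masked = min(num_patches - 1, max(1, round(num_patches * mask_ratio)))
    masked = sorted(rng.sample(range(num_patches), num_masked))
    visible = [i for i in range(num_patches) if i not in masked]
    return visible, masked


def encode(patch, weights):
    """Toy linear encoder: one latent vector per patch (stand-in for a network)."""
    return [sum(w * x for w, x in zip(row, patch)) for row in weights]


def latent_prediction_loss(series, patch_len, weights, mask_ratio=0.5, seed=0):
    """Mean squared error between predicted and actual latents of masked patches."""
    rng = random.Random(seed)
    patches = patchify(series, patch_len)
    visible, masked = split_context_target(len(patches), mask_ratio, rng)
    # Targets are latents of the hidden patches, not their raw glucose values.
    targets = [encode(patches[i], weights) for i in masked]
    context = [encode(patches[i], weights) for i in visible]
    dim = len(weights)
    # Toy predictor: the same averaged context latent is used for every target.
    pred = [sum(c[d] for c in context) / len(context) for d in range(dim)]
    sq = [(pred[d] - t[d]) ** 2 for t in targets for d in range(dim)]
    return sum(sq) / len(sq), visible, masked
```

The loss lives entirely in latent space, which is the distinction from masked reconstruction drawn in the summary. An X-CGM-JEPA-style extension would add a second term of the same shape whose targets are embeddings of the masked Glucodensity view instead of CGM patch latents.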

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.LG updates on arXiv.org.