CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining

2026-05-05 · Source: cs.LG updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Medical Devices & Health Technology · Depth: Expert, extended

Summary

CGM-JEPA is a self-supervised predictive pretraining framework for learning consistent representations from Continuous Glucose Monitoring (CGM) data, aimed at detecting early metabolic subphenotypes such as insulin resistance (IR) and β-cell dysfunction. It targets two problems: the multi-view nature of physiological states, and baselines whose performance is inconsistent across modalities. Rather than reconstructing raw glucose values, CGM-JEPA predicts masked latent representations, which encourages abstraction. An extension, X-CGM-JEPA, adds a masked Glucodensity cross-view objective that contributes complementary distributional information. The models were pretrained on ~389k unlabeled CGM readings from 228 subjects and evaluated on two clinical cohorts (Initial: N=27; Validation: N=17) under three regimes: cohort generalization, venous-to-CGM transfer, and CGM. X-CGM-JEPA consistently ranked first or second on AUROC for both endpoints (IR and β-cell dysfunction) in all three regimes, outperforming the strongest baseline by up to +6.5 AUROC points in cohort generalization and +3.6 points in venous-to-CGM transfer (paired Wilcoxon, p<0.001). The cross-view design also reduced ethnicity AUROC gaps by 25–54% under transfer.

Key takeaway

For machine learning engineers building diagnostic tools for metabolic health, CGM-JEPA and X-CGM-JEPA offer a way to learn transferable representations when labeled CGM data are scarce. Consider predictive self-supervised pretraining, and X-CGM-JEPA in particular, to improve model consistency and fairness across deployment scenarios and demographic subgroups and to limit performance drops under modality shift. The method provides a foundation for scalable, non-invasive metabolic risk stratification.

Key insights

Predictive self-supervision with cross-view objectives yields robust CGM representations for metabolic subphenotype prediction.

Principles

Abstract away from single views for transferability.
Predict latent representations, not raw values.
Complementary views stabilize performance under shifts.
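
To make "predict latent representations, not raw values" concrete, here is a minimal NumPy sketch of a JEPA-style loss on a single CGM series. It is an illustration under stated assumptions, not the authors' implementation: the linear-plus-tanh encoders, the mean-pooled context, the positional offsets, the patch length of 12, and the EMA-updated target encoder (a common choice in JEPA-style methods, not confirmed by this summary) are all hypothetical, as are the names `patchify`, `latent_prediction_loss`, and `ema_update`.

```python
import numpy as np

def patchify(series, patch_len):
    """Split a 1-D glucose series into non-overlapping patches."""
    series = np.asarray(series, dtype=float)
    n = len(series) // patch_len
    return series[: n * patch_len].reshape(n, patch_len)

def encode(patches, W):
    """Toy encoder (linear map + tanh) standing in for a real network."""
    return np.tanh(patches @ W)

def latent_prediction_loss(series, params, rng, patch_len=12, mask_ratio=0.5):
    """Mean squared error between predicted and target latents of masked patches.

    Assumes the series yields at least two patches.
    """
    patches = patchify(series, patch_len)
    n = patches.shape[0]
    # Keep at least one visible and one masked patch.
    n_mask = min(n - 1, max(1, int(round(n * mask_ratio))))
    masked = rng.choice(n, size=n_mask, replace=False)
    visible = np.setdiff1d(np.arange(n), masked)

    # Context encoder sees visible patches only (mean-pooled summary).
    context = encode(patches[visible], params["W_ctx"]).mean(axis=0)
    # Target encoder embeds the masked patches; these latents are the targets.
    targets = encode(patches[masked], params["W_tgt"])
    # Predictor: context plus a positional offset per masked patch.
    predicted = (context + params["pos"][masked]) @ params["W_pred"]
    # The loss lives in latent space; raw glucose values are never reconstructed.
    return float(np.mean((predicted - targets) ** 2))

def ema_update(W_tgt, W_ctx, momentum=0.99):
    """Move target-encoder weights slowly toward the context encoder (assumed)."""
    return momentum * W_tgt + (1.0 - momentum) * W_ctx

rng = np.random.default_rng(0)
d, patch_len, n_samples = 8, 12, 288  # hypothetical sizes
W0 = rng.normal(scale=0.1, size=(patch_len, d))
params = {
    "W_ctx": W0,
    "W_tgt": W0.copy(),
    "W_pred": rng.normal(scale=0.1, size=(d, d)),
    "pos": rng.normal(scale=0.1, size=(n_samples // patch_len, d)),
}

# Synthetic stand-in for a CGM trace, standardized before encoding.
t = np.arange(n_samples)
series = 110 + 25 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 5, size=n_samples)
series = (series - series.mean()) / series.std()

loss = latent_prediction_loss(series, params, rng, patch_len=patch_len)
params["W_tgt"] = ema_update(params["W_tgt"], params["W_ctx"])
```

In a real training loop the gradient would flow through the context encoder and predictor only, with the target encoder held fixed between EMA updates; this sketch shows the forward pass alone.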

Method

CGM-JEPA predicts masked latent CGM patch representations from visible context. X-CGM-JEPA adds an auxiliary objective predicting masked Glucodensity embeddings from the same CGM context, combining temporal and distributional views.
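
A glucodensity is commonly described as the density of glucose values over a monitoring window, so it captures how much time is spent at each concentration while discarding when it happened. The sketch below estimates one with a Gaussian kernel in NumPy. The grid of 40 to 400 mg/dL, the 10 mg/dL bandwidth, and the function name `glucodensity` are assumptions made for illustration; the summary does not say how the paper computes or embeds this view.

```python
import numpy as np

def glucodensity(glucose, grid, bandwidth=10.0):
    """Gaussian kernel density estimate of glucose values, evaluated on `grid`."""
    glucose = np.asarray(glucose, dtype=float)
    grid = np.asarray(grid, dtype=float)
    z = (grid[:, None] - glucose[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernel.mean(axis=1)  # average kernel over all readings

rng = np.random.default_rng(0)
t = np.arange(288)
# Synthetic stand-in for one day of CGM readings in mg/dL.
cgm = 110 + 25 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 5, size=288)

grid = np.linspace(40.0, 400.0, 181)  # hypothetical evaluation grid
dens = glucodensity(cgm, grid)

# On a uniform grid the density should integrate to roughly 1.
step = grid[1] - grid[0]
mass = dens.sum() * step

# Shuffling the readings in time leaves the distributional view unchanged.
shuffled = rng.permutation(cgm)
same_view = np.allclose(dens, glucodensity(shuffled, grid))
```

The shuffle check is the point of the example: the temporal view changes completely under reordering while the distributional view does not, which is why the two are complementary.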

In practice

Use JEPA-style pretraining for CGM time series.
Incorporate Glucodensity for modality shift robustness.
Apply linear probing for downstream classification.
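
Linear probing means freezing the pretrained encoder and fitting only a linear classifier on its embeddings. The sketch below does this with a plain NumPy logistic regression and a pairwise AUROC. The embeddings are synthetic stand-ins, the 27/17 split only mirrors the cohort sizes quoted in the summary, and `fit_linear_probe` and `auroc` are hypothetical helper names rather than anything from the paper.

```python
import numpy as np

def fit_linear_probe(Z, y, lr=0.1, steps=1000, l2=1e-2):
    """L2-regularized logistic regression fitted by gradient descent on frozen embeddings."""
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        w -= lr * (Z.T @ (p - y) / len(y) + l2 * w)
        b -= lr * float(np.mean(p - y))
    return w, b

def auroc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    diff = scores[y_true][:, None] - scores[~y_true][None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

rng = np.random.default_rng(0)
n, d = 44, 8
y = np.arange(n) % 2             # synthetic binary labels
Z = rng.normal(size=(n, d))      # stand-in for frozen encoder outputs
Z[:, 0] += 4.0 * y               # inject a separable signal in one dimension

# Fit on one "cohort", evaluate on the other (sizes are illustrative only).
Z_train, y_train = Z[:27], y[:27]
Z_test, y_test = Z[27:], y[27:]

w, b = fit_linear_probe(Z_train, y_train)
test_auroc = auroc(y_test, Z_test @ w + b)
```

With cohorts this small, a probe with few parameters and some regularization is a sensible default, since the quality of the frozen representation then does most of the work.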

Topics

Continuous Glucose Monitoring
Self-Supervised Learning
Joint Embedding Predictive Architecture
Metabolic Subphenotype Prediction
Insulin Resistance

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.LG updates on arXiv.org.