LatentWave: JEPA Pretraining for Wireless Foundation Models

2026-06-06 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Wireless Communication & Signal Processing · Depth: Expert, long

Summary

LatentWave is a wireless foundation model pretrained with a Joint-Embedding Predictive Architecture (JEPA) on diverse wireless spectrograms and channel state information (CSI). Unlike existing methods that reconstruct masked inputs, LatentWave predicts the representations of masked regions in latent space, which is intended to yield more transferable features. Its architecture uses per-channel patch embeddings and stochastic channel sampling during pretraining, so it can process variable antenna counts and generalize across heterogeneous wireless configurations. The model, with 6.4M parameters for the encoders and 880K for the predictor, was pretrained for 240 epochs. Evaluated on RF signal classification, 5G NR positioning, beam prediction, and LoS/NLoS classification, LatentWave performs comparably to the WavesFM baseline. The central finding is that masking geometry introduces a task-dependent inductive bias: frequency masking improves channel-related tasks such as beam prediction (a gain of more than 11 percentage points) and positioning (error reduced from 2.54 m to 2.32 m), while region masking better preserves discriminability for signal classification.

Key takeaway

Machine learning engineers building wireless AI systems should consider JEPA pretraining, as used in LatentWave, to build more adaptable foundation models. Choose the masking geometry to match the downstream task: frequency masking helps channel-related tasks such as beam prediction and positioning, while region masking better preserves discriminability for signal classification. Because per-channel patch embeddings and stochastic channel sampling let a single model handle diverse wireless configurations and antenna counts, this approach can reduce development and maintenance costs.

Key insights

JEPA pretraining for wireless foundation models yields transferable representations by predicting the latent representations of masked regions rather than reconstructing the raw input.

Principles

Masking geometry introduces task-dependent inductive bias.
Latent space prediction captures higher-level semantic features.
Per-channel patch embedding supports variable antenna counts.
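
As a rough illustration of the last principle, the sketch below applies one shared patch projection to each channel separately, so the same weights accept 4 or 16 antenna channels. It is a minimal NumPy sketch under assumptions, not LatentWave's implementation: the function names, patch size, embedding width, and the way the channel subset is drawn are all hypothetical.

```python
import numpy as np

def patchify(x, patch):
    """Split one (H, W) channel into flattened non-overlapping patches."""
    h, w = x.shape
    ph, pw = patch
    assert h % ph == 0 and w % pw == 0, "input must tile evenly into patches"
    x = x.reshape(h // ph, ph, w // pw, pw).transpose(0, 2, 1, 3)
    return x.reshape(-1, ph * pw)  # (num_patches, ph * pw)

def embed_per_channel(x, weight, bias, patch):
    """Embed every channel with the SAME projection (hypothetical helper).

    x: (C, H, W) for any channel count C; weight: (ph * pw, D).
    Returns tokens of shape (C * num_patches, D).
    """
    tokens = [patchify(ch, patch) @ weight + bias for ch in x]
    return np.concatenate(tokens, axis=0)

def sample_channels(x, rng, max_keep):
    """Assumed form of stochastic channel sampling: keep a random subset."""
    c = x.shape[0]
    k = int(rng.integers(1, min(c, max_keep) + 1))
    keep = np.sort(rng.choice(c, size=k, replace=False))
    return x[keep]

rng = np.random.default_rng(0)
patch, dim = (4, 4), 32  # illustrative sizes, not taken from the paper
weight = rng.normal(scale=0.02, size=(patch[0] * patch[1], dim))
bias = np.zeros(dim)

csi_4ant = rng.normal(size=(4, 16, 16))    # 4 antenna channels
csi_16ant = rng.normal(size=(16, 16, 16))  # 16 antenna channels
tokens_4 = embed_per_channel(csi_4ant, weight, bias, patch)
tokens_16 = embed_per_channel(csi_16ant, weight, bias, patch)
subset = sample_channels(csi_16ant, rng, max_keep=8)
```

Because the projection never sees the channel dimension, only the token count changes with the antenna count; the weights do not.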

Method

LatentWave uses context and target encoders with a predictor to minimize MSE between predicted and target latent representations of masked regions.
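
The objective can be sketched in a few lines. The toy below is a minimal NumPy sketch, not the paper's code: the encoders are single linear layers standing in for the real networks, the predictor is a pooled-context query, and the exponential-moving-average update of the target encoder is an assumption borrowed from common JEPA practice that the summary does not state.

```python
import numpy as np

rng = np.random.default_rng(0)
num_tokens, in_dim, dim = 16, 8, 4  # illustrative sizes

def encode(tokens, w):
    """Toy stand-in for an encoder: per-token linear map plus tanh."""
    return np.tanh(tokens @ w)

def predict(context_latents, target_pos, pos_emb, w_pred):
    """Toy predictor: pooled context plus the position of each masked token."""
    pooled = context_latents.mean(axis=0)    # (dim,)
    queries = pooled + pos_emb[target_pos]   # (num_masked, dim)
    return queries @ w_pred

def jepa_loss(tokens, mask, w_ctx, w_tgt, pos_emb, w_pred):
    """MSE between predicted and target latents at the masked positions."""
    target_pos = np.flatnonzero(mask)
    context_latents = encode(tokens[~mask], w_ctx)  # visible tokens only
    targets = encode(tokens, w_tgt)[target_pos]     # computed from the full input
    preds = predict(context_latents, target_pos, pos_emb, w_pred)
    return float(np.mean((preds - targets) ** 2))

def ema_update(w_tgt, w_ctx, momentum=0.996):
    """ASSUMED target-encoder update (common in JEPA variants)."""
    return momentum * w_tgt + (1.0 - momentum) * w_ctx

tokens = rng.normal(size=(num_tokens, in_dim))
mask = np.zeros(num_tokens, dtype=bool)
mask[4:8] = True  # tokens 4..7 are the masked targets

w_ctx = rng.normal(scale=0.5, size=(in_dim, dim))
w_tgt = w_ctx.copy()
pos_emb = rng.normal(scale=0.1, size=(num_tokens, dim))
w_pred = np.eye(dim)

loss = jepa_loss(tokens, mask, w_ctx, w_tgt, pos_emb, w_pred)
w_tgt_next = ema_update(w_tgt, w_ctx + 1.0)
```

The point of the sketch is where the error is measured: in latent space, against the target encoder's output, rather than against the raw spectrogram or CSI values.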

In practice

Use frequency masking for channel-related tasks.
Use region masking for signal classification tasks.
Employ stochastic channel sampling for diverse antenna setups.
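
The two masking geometries named above can be pictured on a frequency-by-time patch grid. The sketch below is one plausible reading, since the summary does not define them precisely: frequency masking is assumed to hide whole frequency rows across all time steps, and region masking is assumed to hide a single contiguous rectangle. Function names, grid size, and ratios are hypothetical.

```python
import random

def frequency_mask(n_freq, n_time, ratio, rng):
    """Assumed frequency masking: hide whole frequency rows at every time step."""
    n_masked = max(1, round(ratio * n_freq))
    rows = set(rng.sample(range(n_freq), n_masked))
    return [[f in rows for _ in range(n_time)] for f in range(n_freq)]

def region_mask(n_freq, n_time, height, width, rng):
    """Assumed region masking: hide one contiguous rectangle of patches."""
    top = rng.randrange(n_freq - height + 1)
    left = rng.randrange(n_time - width + 1)
    return [[top <= f < top + height and left <= t < left + width
             for t in range(n_time)] for f in range(n_freq)]

def masked_fraction(mask):
    """Share of patches that are hidden (True) in the grid."""
    cells = [cell for row in mask for cell in row]
    return sum(cells) / len(cells)

rng = random.Random(0)
freq = frequency_mask(n_freq=8, n_time=8, ratio=0.5, rng=rng)
region = region_mask(n_freq=8, n_time=8, height=4, width=4, rng=rng)

# Render both masks: '#' marks a hidden patch, '.' a visible one.
for name, mask in (("frequency", freq), ("region", region)):
    print(name)
    for row in mask:
        print("".join("#" if cell else "." for cell in row))
```

Under this reading, a frequency mask forces the model to infer entire subbands from the remaining ones, which is one way to interpret why it would favor channel-related tasks.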

Topics

Wireless Foundation Models
JEPA
Self-supervised Learning
Masking Strategies
RF Signal Classification
Beam Prediction

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.