Lightweight and Interpretable Transformer via Mixed Graph Algorithm Unrolling for Traffic Forecast

2026-06-12 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

The paper introduces a lightweight, interpretable transformer-like neural network for city traffic forecasting, derived from a mixed-graph-based optimization algorithm. The model constructs an undirected graph ("G^u") to capture spatial correlations and a directed graph ("G^d") to capture temporal relationships, and introduces two new variational terms that promote signal smoothness on directed graphs: an ℓ₂-norm term, the directed graph Laplacian regularizer (DGLR), and an ℓ₁-norm term, directed graph total variation (DGTV). At its core is an Alternating Direction Method of Multipliers (ADMM) algorithm, unrolled into a feed-forward network whose graph learning modules play the role of self-attention. In experiments on the METR-LA, PEMS-BAY, and PEMS03 datasets, the model achieves competitive forecasting performance with far fewer parameters, using only 6.4% of the parameter count of the transformer-based PDFormer.

Key takeaway

Machine learning engineers who need to balance prediction accuracy against interpretability and computational cost in traffic forecasting should consider algorithm unrolling with mixed graph priors. The approach yields transformer-like networks that perform competitively on datasets such as METR-LA and PEMS-BAY with a fraction of the parameters (6.4% of PDFormer's), making them a more transparent and resource-efficient alternative to conventional deep learning models.

Key insights

Algorithm unrolling with mixed graphs and novel directed graph regularizers yields lightweight, interpretable transformers for spatio-temporal forecasting.

Principles

Algorithm unrolling yields deep learning models that are interpretable and parameter-efficient.
Mixed graphs (undirected for space, directed for time) enhance spatio-temporal modeling.
DGLR and DGTV effectively quantify signal smoothness on directed graphs.
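
To make the two directed-graph smoothness terms concrete, the sketch below evaluates them on a small directed chain such as one might use for time steps. It assumes one plausible formulation, DGLR(x) = ||L_r x||₂² and DGTV(x) = ||L_r x||₁ with L_r = I − W_r the row-normalized (random-walk) directed Laplacian; the self-loop on the source node and all function names are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

def directed_chain_adjacency(n):
    """Directed path over n time steps: an edge t-1 -> t is stored as W[t, t-1] = 1.
    Assumption: the source node gets a self-loop so every row has a nonzero sum."""
    W = np.zeros((n, n))
    W[0, 0] = 1.0
    for t in range(1, n):
        W[t, t - 1] = 1.0
    return W

def random_walk_laplacian(W):
    """L_r = I - W_r, where W_r is W with each row scaled to sum to 1."""
    W_r = W / W.sum(axis=1, keepdims=True)
    return np.eye(W.shape[0]) - W_r

def dglr(x, L_r):
    """l2-type smoothness term (assumed form): ||L_r x||_2^2 = x^T L_r^T L_r x."""
    r = L_r @ x
    return float(r @ r)

def dgtv(x, L_r):
    """l1-type smoothness term (assumed form): ||L_r x||_1."""
    return float(np.abs(L_r @ x).sum())

L_r = random_walk_laplacian(directed_chain_adjacency(4))
ramp = np.array([0.0, 1.0, 2.0, 3.0])   # changes a little at every step
step = np.array([0.0, 0.0, 4.0, 4.0])   # one abrupt jump

print(dglr(ramp, L_r), dgtv(ramp, L_r))  # 3.0 3.0
print(dglr(step, L_r), dgtv(step, L_r))  # 16.0 4.0
```

Under these assumptions, each entry of L_r x is the difference between a node's value and the weighted average of its predecessors. The ℓ₂ term penalizes the single large jump much more heavily than the ℓ₁ term does, which is the usual motivation for pairing the two.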

Method

An ADMM-based iterative optimization algorithm is unrolled into neural layers. Graph learning modules compute edge weights dynamically, and the linear systems that arise in each iteration are solved with the conjugate gradient (CG) method.
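
The general pattern can be illustrated with a deliberately simplified sketch: a fixed number of ADMM iterations ("layers") for the toy denoising problem min_x ½||x − y||² + μ||L x||₁, where each x-update solves a linear system by conjugate gradient. This is not the paper's algorithm. The objective, the fixed (non-learned) values of `mu` and `rho`, the fixed chain graph, and every function name are assumptions for illustration, and the graph learning modules are omitted entirely.

```python
import numpy as np

def conjugate_gradient(A_mv, b, x0, n_iters=50, tol=1e-10):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    x = x0.copy()
    r = b - A_mv(x)
    p = r.copy()
    rs = r @ r
    for _ in range(n_iters):
        if rs < tol:
            break
        Ap = A_mv(p)
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def unrolled_admm(y, L, mu=0.5, rho=1.0, n_layers=20):
    """Run n_layers scaled-form ADMM iterations with the splitting z = L x.
    In an unrolled network, mu and rho would be learned per layer;
    here they are fixed scalars (an assumption of this sketch)."""
    x = y.copy()
    z = L @ x
    u = np.zeros(L.shape[0])
    for _ in range(n_layers):
        # x-update: (I + rho L^T L) x = y + rho L^T (z - u), solved by CG
        A_mv = lambda v: v + rho * (L.T @ (L @ v))
        b = y + rho * (L.T @ (z - u))
        x = conjugate_gradient(A_mv, b, x)
        # z-update: soft-thresholding handles the l1 term
        z = soft_threshold(L @ x + u, mu / rho)
        # scaled dual update
        u = u + L @ x - z
    return x

def objective(x, y, L, mu):
    return 0.5 * float((x - y) @ (x - y)) + mu * float(np.abs(L @ x).sum())

# Toy directed chain: row t of L computes x[t] - x[t-1]; row 0 is zero.
n = 8
L = np.eye(n) - np.eye(n, k=-1)
L[0, 0] = 0.0
y = np.array([0.0, 0.2, -0.1, 0.1, 4.0, 4.2, 3.9, 4.1])
x_hat = unrolled_admm(y, L, mu=0.5, n_layers=200)
print(objective(y, y, L, 0.5), objective(x_hat, y, L, 0.5))
```

Because each layer is one iteration of a known optimization algorithm, every intermediate quantity has a defined meaning (primal estimate, split variable, dual variable), and the only trainable quantities in such a design are a handful of scalars plus whatever the graph learning modules require. That is the sense in which unrolled networks tend to be both interpretable and small.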

In practice

Implement mixed-graph architectures for complex spatio-temporal forecasting tasks.
Explore algorithm unrolling to reduce model parameters and improve interpretability.
Apply DGLR and DGTV for robust regularization in directed graph signal processing.

Topics

Traffic Forecasting
Algorithm Unrolling
Transformer Networks
Graph Signal Processing
ADMM Optimization
Spatio-temporal Modeling

Code references

SingularityUndefined/Unrolling-GSP-STForecast

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.