Lightweight and Interpretable Transformer via Mixed Graph Algorithm Unrolling for Traffic Forecast
Summary
A novel lightweight and interpretable transformer-like neural network is introduced for city traffic forecasting, derived by unrolling a mixed-graph-based optimization algorithm. The model builds an undirected graph G^u to capture spatial correlations and a directed graph G^d to capture temporal relationships, and it introduces two variational terms that promote signal smoothness on directed graphs: an ℓ₂-norm directed graph Laplacian regularizer (DGLR) and an ℓ₁-norm directed graph total variation (DGTV). The resulting problem is solved with an Alternating Direction Method of Multipliers (ADMM) algorithm, which is unrolled into a feed-forward network whose graph learning modules play the role of self-attention. Experiments on the METR-LA, PEMS-BAY, and PEMS03 datasets show competitive forecasting performance with far fewer parameters: the model uses only 6.4% of the parameters of the transformer-based PDFormer.
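To make the link between graph learning and self-attention concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's module: the function name `attention_like_edge_weights`, the projection matrix `Q`, and the choice of a squared-distance kernel are all hypothetical. The point it shows is that row-normalizing exp(-distance) is the same operation as a softmax over negative distances, which is why such a module resembles attention.

```python
import numpy as np

def attention_like_edge_weights(F, Q):
    """Hypothetical graph learning step: edge weights from feature distances.

    F: (n, d) per-node feature vectors.
    Q: (d, k) projection (it would be learned in a trained network).
    Returns an (n, n) matrix whose rows sum to 1, with a zero diagonal.
    """
    Z = F @ Q                                   # projected features, shape (n, k)
    diff = Z[:, None, :] - Z[None, :, :]        # pairwise differences, (n, n, k)
    dist = np.sum(diff ** 2, axis=-1)           # squared distances, (n, n)
    logits = -dist                              # closer nodes get larger logits
    np.fill_diagonal(logits, -np.inf)           # exclude self-loops
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    W = np.exp(logits)
    return W / W.sum(axis=1, keepdims=True)     # row-wise softmax

# Toy example: three nodes with scalar features; node 1 is much closer to node 0 than node 2 is.
F = np.array([[0.0], [1.0], [5.0]])
Q = np.array([[1.0]])
W = attention_like_edge_weights(F, Q)
```

Row normalization makes the result asymmetric in general, so an undirected graph would need an extra symmetrization step; that detail is left out of this sketch.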
Key takeaway
For machine learning engineers building traffic forecasting systems who must balance prediction accuracy against interpretability and computational cost, algorithm unrolling with mixed-graph priors is worth considering. The approach yields transformer-like networks that perform competitively on datasets such as METR-LA and PEMS-BAY with a fraction of the parameters (6.4% of PDFormer's count), offering a more transparent and resource-efficient alternative to conventional deep learning models.
Key insights
Algorithm unrolling with mixed graphs and novel directed graph regularizers yields lightweight, interpretable transformers for spatio-temporal forecasting.
Principles
- Algorithm unrolling provides interpretable, efficient deep learning models.
- Mixed graphs (undirected for space, directed for time) enhance spatio-temporal modeling.
- DGLR and DGTV effectively quantify signal smoothness on directed graphs.
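As a rough illustration of the last principle, the sketch below computes ℓ₂ and ℓ₁ smoothness measures on a small directed graph. It assumes both terms are built from the residual between a signal and its prediction from incoming neighbors, x - W_r x, where W_r is a row-normalized adjacency matrix; the function names, the edge-direction convention, and the self-loop treatment of source nodes are assumptions made for this example and may differ from the paper's definitions.

```python
import numpy as np

def directed_shift(W):
    """Row-normalize a directed adjacency matrix.

    Convention (an assumption): W[i, j] is the weight of the edge j -> i.
    Nodes with no incoming edge get a self-loop so their residual is zero.
    """
    W = np.asarray(W, dtype=float).copy()
    idx = np.flatnonzero(W.sum(axis=1) == 0)
    W[idx, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def dglr(x, W):
    """l2 term: squared norm of the residual x - W_r x."""
    r = x - directed_shift(W) @ x
    return float(r @ r)

def dgtv(x, W):
    """l1 term: sum of absolute residuals of x - W_r x."""
    r = x - directed_shift(W) @ x
    return float(np.abs(r).sum())

# Directed path 0 -> 1 -> 2 -> 3.
W = np.zeros((4, 4))
W[1, 0] = W[2, 1] = W[3, 2] = 1.0

constant = np.ones(4)                    # perfectly smooth signal
step = np.array([0.0, 0.0, 5.0, 5.0])    # one jump between nodes 1 and 2

print(dglr(constant, W), dgtv(constant, W))  # 0.0 0.0
print(dglr(step, W), dgtv(step, W))          # 25.0 5.0
```

The step signal shows the usual contrast between the two norms: the ℓ₂ term penalizes the single jump quadratically, while the ℓ₁ term penalizes it linearly and so tolerates occasional sharp changes.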
Method
An ADMM-based iterative optimization algorithm is unrolled into neural layers. Graph learning modules compute edge weights dynamically, and the linear systems that arise in each iteration are solved with the conjugate gradient (CG) method.
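The general pattern can be sketched in NumPy as follows. This is a simplified illustration, not the paper's network: it assumes a denoising objective 0.5‖y - x‖² + μ xᵀL_u x + γ‖Dx‖₁, where L_u is an undirected graph Laplacian and D is a directed difference operator, and it treats each layer's (μ, γ, ρ) as fixed numbers, whereas an unrolled network would learn them by backpropagation. All function and variable names are hypothetical.

```python
import numpy as np

def conjugate_gradient(A, b, x0, n_iters=50, tol=1e-10):
    """Solve A x = b for symmetric positive definite A."""
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(n_iters):
        if rs < tol:                      # squared residual norm small enough
            break
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def unrolled_admm(y, L_u, D, layer_params):
    """Run one ADMM iteration per 'layer'; layer_params is a list of (mu, gamma, rho)."""
    n = y.shape[0]
    x = y.copy()
    z = D @ x                             # auxiliary variable for the l1 term
    u = np.zeros_like(z)                  # scaled dual variable
    for mu, gamma, rho in layer_params:
        # x-update: a linear system, solved with CG warm-started at the current x.
        A = np.eye(n) + 2.0 * mu * L_u + rho * D.T @ D
        b = y + rho * D.T @ (z - u)
        x = conjugate_gradient(A, b, x)
        # z-update: soft-thresholding handles the l1 term.
        z = soft_threshold(D @ x + u, gamma / rho)
        # Dual update.
        u = u + D @ x - z
    return x

# Toy problem: 3-node undirected path graph and a directed difference operator.
L_u = np.array([[1.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]])
D = np.array([[0.0, 0.0, 0.0], [-1.0, 1.0, 0.0], [0.0, -1.0, 1.0]])
y = np.array([1.0, 4.0, 2.0])
x_hat = unrolled_admm(y, L_u, D, [(0.5, 0.1, 1.0)] * 5)
```

Because the number of layers is fixed and every step is a differentiable operation (apart from the kink in soft-thresholding), the whole loop can be trained end to end, and each learned parameter keeps its meaning from the optimization problem.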
In practice
- Implement mixed-graph architectures for complex spatio-temporal forecasting tasks.
- Explore algorithm unrolling to reduce model parameters and improve interpretability.
- Apply DGLR and DGTV for robust regularization in directed graph signal processing.
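As a starting point for the first item, a mixed graph over a window of T time steps and N sensors can be assembled with Kronecker products. The sketch below is one simple construction under stated assumptions, not the paper's exact design: spatial edges are undirected and repeated at every time step, and each directed temporal edge links a sensor at time t-1 to the same sensor at time t. The function name `build_mixed_graph` and the node ordering (index = t * N + i) are choices made for this example.

```python
import numpy as np

def build_mixed_graph(A_spatial, T):
    """Build a toy mixed graph over T time steps.

    A_spatial: (N, N) symmetric adjacency matrix between sensors.
    Returns:
      L_u: (T*N, T*N) Laplacian of the undirected spatial graph, one copy per time step.
      W_d: (T*N, T*N) directed temporal adjacency, where W_d[a, b] is the
           weight of the edge b -> a (past into future).
    """
    A_spatial = np.asarray(A_spatial, dtype=float)
    N = A_spatial.shape[0]
    L_spatial = np.diag(A_spatial.sum(axis=1)) - A_spatial   # combinatorial Laplacian
    L_u = np.kron(np.eye(T), L_spatial)                      # block-diagonal over time
    shift = np.eye(T, k=-1)                                  # shift[t, t-1] = 1
    W_d = np.kron(shift, np.eye(N))                          # (t-1, i) -> (t, i)
    return L_u, W_d

# Three sensors on a line, observed over four time steps.
A = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
L_u, W_d = build_mixed_graph(A, T=4)
```

Keeping the two graphs as separate matrices lets each carry its own regularizer: a Laplacian quadratic form on L_u for spatial smoothness, and directed terms built from W_d for temporal smoothness.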
Topics
- Traffic Forecasting
- Algorithm Unrolling
- Transformer Networks
- Graph Signal Processing
- ADMM Optimization
- Spatio-temporal Modeling
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.