Lightweight and Interpretable Transformer via Mixed Graph Algorithm Unrolling for Traffic Forecast

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics, Robotics & Autonomous Systems) · Depth: Expert, extended

Summary

A lightweight, interpretable transformer-like neural network is introduced for city traffic forecasting, built by unrolling a mixed-graph optimization algorithm. The model constructs an undirected graph (G^u) to capture spatial correlations and a directed graph (G^d) to capture temporal relationships, and it introduces two variational terms that promote signal smoothness on directed graphs: an ℓ₂-norm term (DGLR) and an ℓ₁-norm term (DGTV). At its core is an Alternating Direction Method of Multipliers (ADMM) algorithm, unrolled into a feed-forward network whose graph learning modules mimic self-attention. In experiments on the METR-LA, PEMS-BAY, and PEMS03 datasets, the model achieves competitive forecasting performance with far fewer parameters, using only 6.4% as many as the transformer-based PDFormer.
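To make the two directed-graph terms concrete, here is a minimal sketch. This summary does not give their formulas, so the code assumes both are norms of L x, where L = I - D^-1 W is a random-walk Laplacian built from a directed adjacency matrix W: squared ℓ₂ norm for DGLR, ℓ₁ norm for DGTV. The function names, the toy graph, and the self-loop on the first node are illustrative choices, not taken from the paper.

```python
import numpy as np

def random_walk_laplacian(W):
    """Assumed form L = I - D^{-1} W, where D holds the row sums of W."""
    deg = W.sum(axis=1)
    return np.eye(W.shape[0]) - W / deg[:, None]

def dglr(L, x):
    """Assumed l2-type regularizer: squared l2 norm of L x."""
    r = L @ x
    return float(r @ r)

def dgtv(L, x):
    """Assumed l1-type regularizer: l1 norm of L x."""
    return float(np.abs(L @ x).sum())

# Toy directed "temporal" graph on 4 time steps: row i has a single edge
# pointing to step i-1. Step 0 gets a self-loop (an assumption made here
# so that every row of W has a nonzero sum).
W = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])
L = random_walk_laplacian(W)

flat = np.array([1.0, 1.0, 1.0, 1.0])  # constant over time
ramp = np.array([0.0, 1.0, 2.0, 3.0])  # rises by 1 at every step
step = np.array([0.0, 0.0, 3.0, 3.0])  # one jump of size 3

for name, x in [("flat", flat), ("ramp", ramp), ("step", step)]:
    print(name, dglr(L, x), dgtv(L, x))
# flat 0.0 0.0
# ramp 3.0 3.0
# step 9.0 3.0
```

On this toy graph L x is simply the difference between consecutive time steps. The ℓ₁ term charges the single jump and the gradual ramp equally, while the ℓ₂ term penalizes the jump three times as much, which is the usual reason for pairing the two norms.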

Key takeaway

For machine learning engineers building traffic forecasting systems who need to balance prediction accuracy against interpretability and computational cost, algorithm unrolling with mixed-graph priors is worth considering. The approach yields transformer-like networks that reach competitive performance on datasets such as METR-LA and PEMS-BAY with a small fraction of the parameters (6.4% of PDFormer's), offering a more transparent and resource-efficient alternative to conventional deep learning models.

Key insights

Algorithm unrolling with mixed graphs and novel directed graph regularizers yields lightweight, interpretable transformers for spatio-temporal forecasting.

Method

An ADMM-based iterative optimization algorithm is unrolled into neural layers. The resulting network incorporates graph learning modules that compute edge weights dynamically, and the linear systems that arise in the iterations are solved with conjugate gradient (CG).
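The unrolling idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's network: it denoises a single signal y by minimizing 0.5‖x - y‖² + μ xᵀL_u x + γ‖A x‖₁ with fixed graphs, where L_u is an undirected graph Laplacian and A is a directed difference operator. Each loop iteration stands in for one layer, and the per-layer triple (μ, γ, ρ) stands in for parameters that would be learned. The graph learning modules are omitted, and all names are hypothetical.

```python
import numpy as np

def conjugate_gradient(apply_M, b, x0, n_iter=20, tol=1e-20):
    """Solve M x = b for a symmetric positive definite M given as a function."""
    x = x0.copy()
    r = b - apply_M(x)
    p = r.copy()
    rs = r @ r
    for _ in range(n_iter):
        if rs < tol:  # squared residual norm is small enough
            break
        Mp = apply_M(p)
        alpha = rs / (p @ Mp)
        x = x + alpha * p
        r = r - alpha * Mp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def unrolled_admm(y, L_u, A, layer_params):
    """Run one scaled-form ADMM iteration per (mu, gamma, rho) triple."""
    x = y.copy()
    z = A @ x             # auxiliary variable for the l1 term, z = A x
    u = np.zeros_like(z)  # scaled dual variable
    for mu, gamma, rho in layer_params:
        # x-update: (I + 2*mu*L_u + rho*A^T A) x = y + rho*A^T (z - u)
        def apply_M(v, mu=mu, rho=rho):
            return v + 2.0 * mu * (L_u @ v) + rho * (A.T @ (A @ v))
        b = y + rho * (A.T @ (z - u))
        x = conjugate_gradient(apply_M, b, x)  # warm start from previous x
        # z-update: soft-thresholding; u-update: dual ascent on A x = z
        z = soft_threshold(A @ x + u, gamma / rho)
        u = u + A @ x - z
    return x

n = 5
# Undirected path graph Laplacian (assumed stand-in for the spatial graph).
W_u = np.eye(n, k=1) + np.eye(n, k=-1)
L_u = np.diag(W_u.sum(axis=1)) - W_u
# Directed difference operator: row i computes x[i] - x[i-1]; row 0 is zero.
A = np.eye(n) - np.eye(n, k=-1)
A[0, 0] = 0.0

y = np.array([0.0, 0.2, -0.1, 3.1, 2.9])
layer_params = [(0.1, 0.5, 1.0)] * 10  # fixed here; learned in an unrolled net
x_hat = unrolled_admm(y, L_u, A, layer_params)
print(x_hat)
```

Because every layer is one iteration of a known algorithm, each intermediate quantity (x, z, u) keeps a clear meaning, and the only trainable quantities in such a design are a handful of scalars per layer plus whatever the graph learning modules need. That is the general mechanism by which unrolled networks stay small and interpretable.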

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.