David vs. Goliath in Next Activity Prediction: Argmax vs. LSTM, Transformer, and LLM
Summary
A systematic benchmark study addresses the lack of direct comparisons between advanced deep learning models and simpler baselines for Next Activity Prediction (NAP) in Predictive Process Monitoring (PPM). The research evaluates vocabulary-adapted Large Language Models (LLMs), Transformers trained from scratch, LLM-distilled Transformers, and LSTMs against a simple counting-based argmax baseline on seven real-life event logs. The findings show that pretraining offers no consistent performance improvement over training from scratch, and that model size has minimal impact on prediction accuracy. Notably, the argmax baseline matches or closely approaches the performance of billion-parameter LLMs on most datasets, challenging the assumption that greater model complexity yields better NAP results.
Key takeaway
For Machine Learning Engineers developing Next Activity Prediction (NAP) systems, this research suggests re-evaluating whether complex models are necessary. If you are considering large language models or sophisticated Transformers for sequence prediction, first benchmark a simple counting-based argmax baseline. On many datasets it may deliver comparable accuracy with far less computational overhead and development complexity, freeing resources for other work.
Key insights
Simple argmax baselines can achieve performance comparable to complex LLMs in Next Activity Prediction.
Principles
- Pretraining offers no consistent improvement over training from scratch.
- Model size has little effect on Next Activity Prediction performance.
Method
A systematic benchmark compared vocabulary-adapted LLMs, scratch-trained Transformers, LLM-distilled Transformers, LSTMs, and an argmax baseline across seven event logs.
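To make the idea of a counting-based argmax baseline concrete, here is a minimal sketch. The summary does not specify how the study defines the context it counts over, so this version assumes the last `context_len` activities of the running prefix and backs off to shorter contexts when a prefix was never seen in training. The class name `ArgmaxBaseline`, the `context_len` parameter, and the toy traces are all hypothetical.

```python
from collections import Counter, defaultdict


class ArgmaxBaseline:
    """Counting-based next-activity predictor (illustrative sketch only).

    Assumption: the context is the last `context_len` activities of the
    prefix. The benchmarked baseline may define its context differently.
    """

    def __init__(self, context_len=2):
        self.context_len = context_len
        # Maps a context tuple to a Counter of the activities that followed it.
        self.counts = defaultdict(Counter)

    def fit(self, traces):
        for trace in traces:
            for i in range(1, len(trace)):
                # Count under every context length from 0 up to context_len,
                # so prediction can back off to shorter contexts.
                for k in range(min(self.context_len, i) + 1):
                    context = tuple(trace[i - k:i])
                    self.counts[context][trace[i]] += 1
        return self

    def predict(self, prefix):
        # Try the longest available context first, then shorter ones.
        for k in range(min(self.context_len, len(prefix)), -1, -1):
            context = tuple(prefix[len(prefix) - k:])
            if context in self.counts:
                # Argmax over observed next-activity counts.
                return self.counts[context].most_common(1)[0][0]
        return None  # Nothing was counted during fit.


# Hypothetical toy event log: each trace is one case's activity sequence.
toy_traces = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "reject"],
    ["register", "check", "approve", "pay"],
]
model = ArgmaxBaseline(context_len=2).fit(toy_traces)
prediction = model.predict(["register", "check"])
```

Training is a single pass over the log and prediction is a dictionary lookup, which is where the low computational overhead comes from.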
In practice
- Benchmark a counting-based argmax baseline for NAP before adopting complex models.
- Re-evaluate pretraining benefits for sequence prediction tasks.
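One way to act on these points is to score every candidate model with the same prefix-level accuracy function, so the baseline and the complex model are compared on identical terms. The sketch below is an assumption about how such a check might look, not the study's evaluation protocol: `next_activity_accuracy`, `fit_last_activity_argmax`, and the toy traces are hypothetical, and the stand-in predictor conditions only on the most recent activity.

```python
from collections import Counter, defaultdict


def next_activity_accuracy(predict, traces):
    """Fraction of prefixes whose next activity `predict` gets right.

    `predict` is any callable taking a prefix (list of activities) and
    returning one activity, so any model can be wrapped and scored here.
    """
    correct = 0
    total = 0
    for trace in traces:
        for i in range(1, len(trace)):
            total += 1
            if predict(trace[:i]) == trace[i]:
                correct += 1
    return correct / total if total else 0.0


def fit_last_activity_argmax(traces):
    """Hypothetical stand-in baseline: most frequent follower of the last activity."""
    counts = defaultdict(Counter)
    for trace in traces:
        for prev, nxt in zip(trace, trace[1:]):
            counts[prev][nxt] += 1
    table = {prev: c.most_common(1)[0][0] for prev, c in counts.items()}
    return lambda prefix: table.get(prefix[-1]) if prefix else None


# Toy data for illustration only; real event logs need a proper train/test split.
train_traces = [["a", "b", "c"], ["a", "b", "d"], ["a", "b", "c"]]
test_traces = [["a", "b", "c"], ["a", "b", "d"]]

baseline = fit_last_activity_argmax(train_traces)
baseline_accuracy = next_activity_accuracy(baseline, test_traces)
```

Passing an LSTM or Transformer wrapper as `predict` to the same function gives a like-for-like number to set against the baseline.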
Topics
- Next Activity Prediction
- Predictive Process Monitoring
- Large Language Models
- Transformers
- LSTMs
- Argmax Baseline
- Event Logs
Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.