David vs. Goliath in Next Activity Prediction: Argmax vs. LSTM, Transformer, and LLM

2026-06-14 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Advanced, quick

Summary

A systematic benchmark study addresses the lack of direct comparisons between advanced deep learning models and simpler baselines for Next Activity Prediction (NAP) in Predictive Process Monitoring (PPM). The research evaluates vocabulary-adapted Large Language Models (LLMs), Transformers trained from scratch, LLM-distilled Transformers, and LSTMs against a simple counting-based argmax baseline. Across seven real-life event logs, pretraining offers no consistent improvement over training from scratch, and model size has minimal impact on prediction accuracy. Notably, on most datasets the argmax baseline matches or closely approaches the performance of billion-parameter LLMs, challenging the assumption that more complex models predict better in NAP.

Key takeaway

For Machine Learning Engineers building Next Activity Prediction (NAP) systems, this research suggests questioning whether complex models are necessary. If you are considering large language models or sophisticated Transformers for sequence prediction, first benchmark a simple counting-based argmax baseline. On many datasets it may deliver comparable accuracy with far less computational overhead and development complexity, freeing resources for other work.

Key insights

Simple argmax baselines can achieve performance comparable to complex LLMs in Next Activity Prediction.

Principles

Pretraining offers no consistent improvement over training from scratch.
Model size has little effect on Next Activity Prediction performance.

Method

A systematic benchmark compared vocabulary-adapted LLMs, scratch-trained Transformers, LLM-distilled Transformers, LSTMs, and an argmax baseline across seven event logs.

In practice

Benchmark a counting-based argmax baseline for NAP before adopting complex models.
Re-evaluate whether pretraining helps on your sequence prediction task rather than assuming it does.
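
To make the first recommendation concrete, here is a minimal sketch of what a counting-based argmax baseline could look like. The summary does not say what context the study's baseline conditions on, so this version assumes the last `context_len` activities of the running case. The function names, the fallback rule for unseen contexts, and the toy activity labels are all hypothetical and are not taken from the original article.

```python
from collections import Counter, defaultdict


def fit_argmax(traces, context_len=1):
    """Count which activity follows each context of the last `context_len` activities.

    Assumes context_len >= 1 (an illustrative choice, not the study's setting).
    """
    counts = defaultdict(Counter)  # context tuple -> Counter of next activities
    overall = Counter()            # next-activity counts across all contexts
    for trace in traces:
        for i in range(1, len(trace)):
            context = tuple(trace[max(0, i - context_len):i])
            counts[context][trace[i]] += 1
            overall[trace[i]] += 1
    return counts, overall


def predict_next(model, prefix, context_len=1):
    """Return the most frequent follower of the prefix's last activities."""
    counts, overall = model
    context = tuple(prefix[-context_len:])
    if context in counts:
        return counts[context].most_common(1)[0][0]
    # Unseen context: fall back to the most frequent next activity overall.
    return overall.most_common(1)[0][0] if overall else None


def accuracy(model, traces, context_len=1):
    """Share of prefix positions where the predicted next activity is correct."""
    hits = total = 0
    for trace in traces:
        for i in range(1, len(trace)):
            hits += predict_next(model, trace[:i], context_len) == trace[i]
            total += 1
    return hits / total if total else 0.0


# Toy event log with made-up activity names, one list per case.
train = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "reject"],
    ["register", "check", "approve", "pay"],
]
held_out = [["register", "check", "reject"]]

model = fit_argmax(train)
print(predict_next(model, ["register", "check"]))  # approve
print(accuracy(model, held_out))                   # 0.5
```

Scoring the baseline on the same held-out split and with the same metric as any neural model gives the comparison point the study argues for. A larger `context_len` trades coverage for specificity, since longer contexts are seen less often and hit the fallback more.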

Topics

Next Activity Prediction
Predictive Process Monitoring
Large Language Models
Transformers
LSTMs
Argmax Baseline
Event Logs

Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.