[R] Analysis of 350+ ML competitions in 2025
Summary
An analysis of more than 350 machine learning competitions held in 2025, compiled by mlcontests.com, identifies trends in winning solutions across data types. Gradient-boosted decision trees (GBDTs) such as XGBoost and LightGBM still dominate tabular data, but AutoML packages (AutoGluon) and tabular foundation models (TabPFN, TabM) are emerging. Compute budgets are growing: one team used 512 H100s for 48 hours, at an estimated cloud cost of $60k, though free compute options remain viable. Qwen2.5 and Qwen3 models were prevalent in language and reasoning tasks, largely replacing BERT-style models. Transformer-based models surpassed CNNs in vision competitions for the first time, and OpenAI's Whisper was frequently fine-tuned for speech tasks. PyTorch was used in 98% of deep learning solutions, and 20% also used PyTorch Lightning. Polars and JAX saw minimal adoption among winners.
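To put the compute figure in perspective, the quoted numbers can be turned into GPU-hours and an implied hourly rate. The sketch below uses only the three figures from the summary (512 GPUs, 48 hours, $60k). The function names are hypothetical, and the resulting rate is derived arithmetic, not a price quoted by any cloud provider.

```python
# Figures quoted in the summary; everything else is derived from them.
NUM_GPUS = 512
HOURS = 48
ESTIMATED_COST_USD = 60_000


def gpu_hours(num_gpus: int, hours: float) -> float:
    """Total GPU-hours consumed by a run."""
    return num_gpus * hours


def implied_hourly_rate(total_cost_usd: float, num_gpus: int, hours: float) -> float:
    """Cost per GPU-hour implied by a total cost estimate."""
    return total_cost_usd / gpu_hours(num_gpus, hours)


total = gpu_hours(NUM_GPUS, HOURS)
rate = implied_hourly_rate(ESTIMATED_COST_USD, NUM_GPUS, HOURS)
print(f"{total:,.0f} GPU-hours, about ${rate:.2f} per GPU-hour")
# prints "24,576 GPU-hours, about $2.44 per GPU-hour"
```

The same two functions can be reused to compare a planned run against a budget before committing to it.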
Key takeaway
AI engineers building competitive ML solutions should prioritize PyTorch for deep learning and explore Qwen models for language tasks. GBDTs remain strong for tabular data, but AutoGluon and TabPFN are worth evaluating alongside them. Expect compute demands to keep growing, and use efficiency tools such as vLLM (inference) and Unsloth (fine-tuning) to keep resource usage in check.
Key insights
ML competition trends show shifts towards foundation models and increased compute, while GBDTs and PyTorch maintain strong positions.
Principles
- GBDTs remain strong for tabular data.
- PyTorch is the dominant deep learning framework.
- Efficiency tools such as vLLM (inference) and Unsloth (fine-tuning) help keep compute costs manageable.
Method
Winning solutions for language tasks often fine-tune Qwen models, while speech tasks commonly fine-tune OpenAI's Whisper. Vision tasks increasingly favor Transformer-based models over CNNs.
In practice
- Consider AutoGluon or TabPFN for tabular data.
- Utilize Qwen models for text-related competitions.
- Employ vLLM for efficient inference and Unsloth for efficient fine-tuning.
Topics
- Machine Learning Competitions
- Tabular Models
- Compute Resources
- Language Models
- Transformer Models
Best for: AI Engineer, NLP Engineer, Computer Vision Engineer, Machine Learning Engineer, Data Scientist, AI Researcher
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.