Trained transformer-based chess models to play like humans (including thinking time) [P]

Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Advanced, short

Summary

A set of transformer-based deep learning models has been trained to emulate human chess play, including how long players think on each move. The models are split into 100-point rating buckets, ranging from approximately 800 to 2500+ Elo. A mid-strength model was trained first on an 8xH100 cluster, followed by fine-tuning for the other rating ranges on a local 5090 GPU. The training dataset comprised nearly a year of Lichess data, about 1 billion games in total. Each rating range has three distinct models: a move model, a thinking-time model, and a win/draw/loss prediction model. Despite their small size (9 million parameters), the move models achieve accuracy comparable to MAIA-3. The models also take player ratings and clock times as input, and these inputs affect both blunder rates under time pressure and the predicted win probabilities.
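The rating segmentation described above can be sketched as a small routing helper. This is a minimal illustration, not the author's code: the function names, the model-name scheme, and the choice to clamp ratings outside the range into the first and last buckets are all assumptions. Only the 100-point width and the approximate 800 to 2500+ range come from the summary.

```python
from typing import Dict


def rating_bucket(rating: int, low: int = 800, high: int = 2500, width: int = 100) -> int:
    """Return the lower bound of the 100-point bucket that contains `rating`.

    Ratings below `low` or above `high` are clamped into the first or last
    bucket (an assumption; the summary only says "approximately 800 to 2500+").
    """
    clamped = max(low, min(rating, high))
    return low + ((clamped - low) // width) * width


def models_for(rating: int) -> Dict[str, str]:
    """Map a rating to the three per-bucket models (hypothetical naming scheme)."""
    bucket = rating_bucket(rating)
    return {task: f"{task}_{bucket}" for task in ("move", "time", "wdl")}
```

For example, a 1234-rated player would be routed to the 1200 bucket, and a 2700-rated player to the open-ended 2500+ bucket.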

Key takeaway

If you are a machine learning engineer building human-like AI agents, consider segmenting your models by skill level and adding auxiliary models for non-action behaviors such as thinking time. In the training pipeline, prioritize efficient data loading: pre-shuffle the dataset once and read it sequentially, so that I/O does not become the bottleneck and GPU utilization stays high. This matters most with large datasets such as 1 billion games.
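The data-loading advice can be illustrated with a small standard-library sketch: shuffle once when the file is written, then stream fixed-size batches in file order during training. The record layout (`RECORD`), the function names, and the in-memory shuffle are assumptions made for illustration. A dataset of 1 billion games would not fit in memory, so a real pipeline would need an external or sharded shuffle instead.

```python
import random
import struct
from typing import Iterable, Iterator, List, Tuple

# Hypothetical fixed-width record: (game id, player rating, clock seconds).
RECORD = struct.Struct("<IHH")
Row = Tuple[int, int, int]


def write_preshuffled(records: Iterable[Row], path: str, seed: int = 0) -> None:
    """Shuffle once, offline, and write the records in shuffled order."""
    shuffled = list(records)  # assumption: fits in memory for this sketch
    random.Random(seed).shuffle(shuffled)
    with open(path, "wb") as f:
        for rec in shuffled:
            f.write(RECORD.pack(*rec))


def read_batches(path: str, batch_size: int) -> Iterator[List[Row]]:
    """Stream batches with purely sequential reads; no random seeks at train time."""
    chunk_bytes = RECORD.size * batch_size
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            yield [
                RECORD.unpack_from(chunk, offset)
                for offset in range(0, len(chunk), RECORD.size)
            ]
```

Because the randomness is paid for once at write time, each epoch reduces to a linear scan of the file, which is the access pattern that disks and page caches handle best.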

Key insights

Transformer models can emulate human chess play and thinking times using large-scale Lichess data.

Method

Train separate transformer models for move prediction, thinking time, and win/draw/loss prediction, each conditioned on player ratings and clock times. Feed training from a pre-shuffled dataset that is read sequentially to maximize GPU utilization.
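One way to picture the conditioning on ratings and clock times is as a small feature vector supplied alongside the board position. The sketch below is an assumption-heavy illustration: the summary does not say how these inputs are encoded, so `MoveContext`, the normalization constants, and the log scaling of clock time are all hypothetical choices.

```python
import math
from dataclasses import dataclass
from typing import List

MAX_RATING = 3000.0          # assumed normalization ceiling for ratings
MAX_CLOCK_SECONDS = 10800.0  # assumed cap on remaining clock time (3 hours)


@dataclass(frozen=True)
class MoveContext:
    player_rating: int
    opponent_rating: int
    player_clock_s: float
    opponent_clock_s: float


def _scale_clock(seconds: float) -> float:
    """Log-scale the clock so the gap between 5 s and 30 s is not washed out."""
    clipped = min(max(seconds, 0.0), MAX_CLOCK_SECONDS)
    return math.log1p(clipped) / math.log1p(MAX_CLOCK_SECONDS)


def conditioning_features(ctx: MoveContext) -> List[float]:
    """Return four features in [0, 1] shared by the move, time, and W/D/L models."""
    return [
        min(max(ctx.player_rating / MAX_RATING, 0.0), 1.0),
        min(max(ctx.opponent_rating / MAX_RATING, 0.0), 1.0),
        _scale_clock(ctx.player_clock_s),
        _scale_clock(ctx.opponent_clock_s),
    ]
```

Under this encoding, a player with 5 seconds left gets a much smaller clock feature than one with 5 minutes left, which gives a model a signal it could use to learn time-pressure effects.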

Best for: AI Scientist, Machine Learning Engineer, AI Student

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.