A Close Look At World Model Recovery In Supervised Fine-Tuned LLM Planners

Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

A recent study investigates how supervised fine-tuning (SFT) affects the ability of large language models (LLMs) to recover and represent "world models" for classical planning problems. The researchers devised interpretability experiments that examine both the internal representations and the generative capabilities of SFT-tuned LLMs. They found that fine-tuning on valid action sequences allows LLMs to linearly encode action validity and certain state predicates. Models that struggle to classify action validity from their output probabilities can still develop internal representations that distinguish valid from invalid actions. Broader state space coverage during fine-tuning, for example through random walk data, significantly improves how accurately the underlying world model is recovered. The work also contributes a methodology for applying interpretability techniques to planning LLMs.

Key takeaway

For machine learning engineers developing LLM planners, it is crucial to understand how well a fine-tuned model recovers the underlying world model. Prioritize fine-tuning on diverse, valid action sequences, and consider adding random walk data to broaden state space coverage. This improves the LLM's ability to represent and reason about planning problems accurately, even when its output probabilities are ambiguous. To assess what the model has learned about action validity, analyze its internal representations as well as its outputs.
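To make the coverage point concrete, here is a minimal sketch of random walk data generation. It assumes a hypothetical two-room, one-ball toy domain; the domain, the function names, and the state encoding are illustrative only and are not taken from the study. The sketch compares how much of the reachable state space a single goal-directed plan visits against a random walk over valid actions.

```python
import random
from collections import deque

# Hypothetical toy domain (not from the study): a robot and one ball in two rooms.
ROOMS = ("A", "B")


def valid_actions(state):
    """Return the actions whose preconditions hold in `state`.

    A state is (robot_room, ball_location); ball_location is a room or "held".
    """
    robot, ball = state
    actions = [("move", room) for room in ROOMS if room != robot]
    if ball == robot:
        actions.append(("pick", robot))
    if ball == "held":
        actions.append(("drop", robot))
    return actions


def apply(state, action):
    """Apply an action that is valid in `state` and return the successor."""
    robot, ball = state
    name, room = action
    if name == "move":
        return (room, ball)
    if name == "pick":
        return (robot, "held")
    if name == "drop":
        return (robot, room)
    raise ValueError(f"unknown action: {action}")


def run_plan(start, plan):
    """Execute a fixed plan, rejecting any action that is invalid when reached."""
    state, trajectory = start, []
    for action in plan:
        if action not in valid_actions(state):
            raise ValueError(f"{action} is invalid in {state}")
        trajectory.append((state, action))
        state = apply(state, action)
    return trajectory, state


def random_walk(start, steps, rng):
    """Sample a trajectory by choosing uniformly among the valid actions."""
    state, trajectory = start, []
    for _ in range(steps):
        action = rng.choice(valid_actions(state))
        trajectory.append((state, action))
        state = apply(state, action)
    return trajectory, state


def reachable_states(start):
    """Enumerate every state reachable from `start` by breadth-first search."""
    seen, frontier = {start}, deque([start])
    while frontier:
        state = frontier.popleft()
        for action in valid_actions(state):
            successor = apply(state, action)
            if successor not in seen:
                seen.add(successor)
                frontier.append(successor)
    return seen


def coverage(trajectory, final_state, start):
    """Fraction of the reachable states that a trajectory visits."""
    visited = {state for state, _ in trajectory} | {final_state}
    return len(visited) / len(reachable_states(start))


START = ("A", "A")
# A shortest plan that carries the ball from room A to room B.
PLAN = [("pick", "A"), ("move", "B"), ("drop", "B")]

plan_traj, plan_end = run_plan(START, PLAN)
walk_traj, walk_end = random_walk(START, 50, random.Random(0))
print(f"plan coverage: {coverage(plan_traj, plan_end, START):.2f}")
print(f"walk coverage: {coverage(walk_traj, walk_end, START):.2f}")
```

In this toy setting the goal-directed plan never visits states that lie off the shortest path, whereas a random walk can drift into them. Every step in both trajectories is a valid action, so either kind of data is usable as fine-tuning sequences; only the set of states visited differs.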

Key insights

SFT enables LLMs to internally represent planning world models, with broader training data improving recovery.

Method

A series of interpretability experiments was devised to assess world model recovery holistically, examining both the internal representations and the generative capabilities of fine-tuned LLMs.
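As an illustration of what testing for a linear encoding of action validity can look like, the sketch below fits a difference-of-means linear probe on synthetic vectors. This is a simplified stand-in, not the study's procedure: the "activations" are random vectors generated here rather than hidden states extracted from a fine-tuned LLM, the probe type is an assumption, and all names and parameters are hypothetical.

```python
import random


def make_activations(n, dim, rng, offset=3.0, noise=0.5):
    """Generate synthetic stand-ins for hidden states.

    Assumption for illustration: valid actions (label 1) are shifted by
    +offset along coordinate 0 and invalid ones (label 0) by -offset.
    """
    data = []
    for i in range(n):
        label = i % 2
        vec = [rng.gauss(0.0, noise) for _ in range(dim)]
        vec[0] += offset if label else -offset
        data.append((vec, label))
    return data


def mean(vectors):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def fit_probe(data):
    """Fit a difference-of-means linear probe.

    The weight vector is mu_valid - mu_invalid, and the decision boundary
    passes through the midpoint between the two class means.
    """
    mu_valid = mean([vec for vec, label in data if label == 1])
    mu_invalid = mean([vec for vec, label in data if label == 0])
    weights = [a - b for a, b in zip(mu_valid, mu_invalid)]
    midpoint = [(a + b) / 2 for a, b in zip(mu_valid, mu_invalid)]
    bias = -sum(w * m for w, m in zip(weights, midpoint))
    return weights, bias


def predict(probe, vec):
    """Return 1 (valid) if the vector falls on the positive side of the probe."""
    weights, bias = probe
    score = sum(w * x for w, x in zip(weights, vec)) + bias
    return 1 if score > 0 else 0


def accuracy(probe, data):
    """Fraction of examples the probe labels correctly."""
    return sum(predict(probe, vec) == label for vec, label in data) / len(data)


rng = random.Random(0)
train = make_activations(400, 16, rng)
test = make_activations(200, 16, rng)
probe = fit_probe(train)
print(f"held-out probe accuracy: {accuracy(probe, test):.2f}")
```

With real models, the synthetic vectors would be replaced by activations collected at a chosen layer for candidate actions labeled valid or invalid by the planning domain. High held-out accuracy from a linear probe is evidence that validity is linearly decodable from that layer, even when the model's output probabilities do not separate the two classes.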

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.