LLM 0.32a0 is a major backwards-compatible refactor
Summary
LLM 0.32a0, an alpha release of the LLM Python library and CLI tool, introduces significant backwards-compatible changes to better handle the diverse input and output types of modern large language models. Released on April 29, 2026, this update refactors the library's core to move beyond a simple text-in, text-out model. Key changes include representing model inputs as a sequence of messages, aligning with conversational APIs like OpenAI's chat completions, and modeling model responses as a stream of differently typed parts, accommodating mixed content such as text, tool calls, and even multi-modal outputs like images or audio. The update also provides a new mechanism for serializing and deserializing responses as JSON-style dictionaries, offering greater flexibility for Python API users.
Key takeaway
For NLP Engineers building conversational or multi-modal LLM applications, LLM 0.32a0 simplifies handling complex interactions. You can now directly feed entire conversation histories as message sequences and process rich, streaming outputs that interleave text, tool calls, and other data types. This update reduces boilerplate for integrating advanced LLM features and offers a flexible serialization mechanism for custom storage solutions.
Key insights
LLM 0.32a0 refactors input to message sequences and output to typed streams, enhancing multi-modal and tool-use capabilities.
Principles
- LLM APIs should reflect conversational turns.
- Model outputs can be a stream of mixed types.
Method
Model inputs are now sequences of `user()` and `assistant()` messages. Responses stream as `event.type` parts (text, tool_call_name, tool_call_args) for granular processing and display.
In practice
- Use `model.prompt(messages=[user("..."), assistant("...")])` for conversational input.
- Iterate `response.stream_events()` to process mixed output types.
- Employ `response.to_dict()` for custom response persistence.
Topics
- LLM 0.32a0 Refactor
- Python LLM Library
- Message-based Prompts
- Multi-modal Output Streaming
- LLM Tool Calling
Code references
Best for: NLP Engineer, AI Engineer, Machine Learning Engineer, Software Engineer
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.