TOON: Beyond JSON for LLMs

· Source: Towards AI - Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Intermediate, quick

Summary

TOON, or Token-Oriented Object Notation, is introduced as a novel method for representing structured data more efficiently for Large Language Models (LLMs). While JSON remains the standard for traditional application-to-application communications due to its robustness and ease of use, its verbose syntax incurs an "invisible price tag" when processed by LLMs. Every quotation mark, comma, brace, bracket, and repeated key consumes valuable tokens, depleting the LLM's context window, especially when handling large data like API responses, search results, or RAG chunks. TOON aims to provide a token-efficient alternative specifically for data intended for LLM consumption, without seeking to replace JSON in broader enterprise systems. This approach addresses the cost and context limitations associated with JSON's tokenization by LLMs.

Key takeaway

For AI Engineers optimizing LLM application performance and cost, consider TOON for structured data inputs. If you are passing large API responses, search results, or RAG chunks to an LLM, adopting TOON can significantly reduce token consumption and preserve valuable context window space. Evaluate TOON's potential to enhance efficiency in your LLM workflows, especially where JSON's verbosity currently incurs high token costs.

Key insights

TOON offers a token-efficient structured data format specifically designed to optimize LLM context window usage over traditional JSON.

Principles

In practice

Topics

Best for: NLP Engineer, AI Engineer, Machine Learning Engineer, AI Architect

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards AI - Medium.