6 min read
|
Saved February 14, 2026
|
Copied!
Do you care about this?
TOON is a compact format designed for encoding JSON data, making it easier for large language models to process. It combines YAML's structure with a CSV-like layout to reduce token usage while maintaining accuracy. While effective for uniform arrays, it's less suitable for deeply nested data.
If you do, here's more
Token-Oriented Object Notation (TOON) is a new format designed to represent JSON data in a more compact and human-readable way, particularly for use with Large Language Models (LLMs). It combines features of YAML and CSV, providing a structure that minimizes token usage while maintaining clarity. TOON is best suited for uniform arrays of objects, achieving a 74% accuracy across various benchmarks while using about 40% fewer tokens than standard JSON. This is significant given the rising costs associated with LLM tokens, where reducing token size can lead to cost savings.
The format uses indentation for hierarchical data and a tabular layout for arrays, which allows it to represent data efficiently. For example, where JSON and YAML might require more tokens to describe similar data, TOON condenses it into fewer lines. The benchmarks indicate that while TOON excels in many scenarios, it isn't ideal for deeply nested structures or purely tabular data. In those cases, traditional JSON or CSV might be more effective. Specific tests showed that TOON had a higher accuracy rate than JSON, especially in mixed-structure datasets.
TOON is still evolving, and contributors are encouraged to provide feedback and help shape its development. The format is stable enough for practical use, but its design allows for adjustments based on user input. The article includes detailed benchmarks comparing TOON with JSON, YAML, and CSV across various datasets, making it clear where each format performs best. For developers, understanding TOON's strengths and limitations can inform decisions on data representation, particularly when working with LLMs that require efficient input formats.
Questions about this article
No questions yet.