The Code-Only Agent

7 min read | Saved February 14, 2026

code-only | agents | execution | programming | automation

Do you care about this?

This article explores the idea of a Code-Only agent: an agent with exactly one tool, code execution. Because every operation has to be expressed as executable code, the agent's job shifts from choosing among tools to producing code, which the article argues makes results more reliable and easier to inspect.

If you do, here's more

The Code-Only agent paradigm simplifies how agents operate by making code execution the only tool. Instead of being handed a set of tools such as `bash`, `ls`, or `grep`, the agent has a single capability: executing code. This changes the interaction model: rather than choosing among tools to perform a task, the agent generates and runs code that achieves the desired outcome. For example, if the agent needs to find a file, it writes and runs Python code to do so, rather than issuing a series of tool calls.
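To make the file-finding example concrete, here is a minimal sketch of the kind of Python a Code-Only agent might generate where a tool-based agent would call `ls` or `grep`. The function name `find_files` and its parameters are illustrative assumptions, not something taken from the article.

```python
from pathlib import Path


def find_files(root: str, pattern: str) -> list[str]:
    """Return the sorted paths of regular files under `root` whose names match `pattern`.

    `find_files` is a hypothetical helper used only for illustration.
    """
    # Path.rglob walks the directory tree recursively and applies the glob pattern.
    return sorted(str(p) for p in Path(root).rglob(pattern) if p.is_file())
```

Calling `find_files(".", "*.md")` would list every Markdown file below the current directory. The point of the sketch is that the search itself is a small program that can be read and re-run, not a sequence of opaque tool invocations.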

This approach emphasizes the concept of a "code witness": the agent's output is not just a natural-language response but a piece of executable code that directly represents the work done. The code has well-defined semantics, governed by the programming language it is written in, so it can be re-executed and inspected. The article argues that traditional agents often skip steps or produce unreliable results, whereas a Code-Only agent must write code to accomplish anything, which enforces a thoroughness that leads to more reliable outcomes.
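One way to picture a code witness is as a record that keeps the exact program next to what it produced. The sketch below is an assumption about how that could look, not the article's implementation: `Witness` and `run_with_witness` are hypothetical names, and the plain `exec` call is not sandboxed, so a real agent would need an isolated runtime.

```python
import contextlib
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class Witness:
    """Hypothetical record pairing generated code with what it produced."""
    code: str              # the exact program the agent wrote
    stdout: str            # everything the program printed
    error: Optional[str]   # exception summary, or None if the run succeeded


def run_with_witness(code: str) -> Witness:
    """Execute `code` and return it together with its captured output.

    Assumption: plain exec() is used for brevity and provides no isolation.
    """
    buffer = io.StringIO()
    error = None
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {"__name__": "__witness__"})
    except Exception as exc:
        error = f"{type(exc).__name__}: {exc}"
    return Witness(code=code, stdout=buffer.getvalue(), error=error)
```

For instance, `run_with_witness("print(2 + 3)")` yields a record whose `code` field is the program and whose `stdout` field is its printed result, so a reviewer can re-run the same code and check that the answer follows from it.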

Implementing a Code-Only agent involves several design considerations. Output handling must account for the size of the results: small results can be returned directly, while larger ones are saved to a file. How the agent interacts with its language runtime (Python or TypeScript, for example) is also crucial for effective execution. The article notes that the paradigm might seem extreme, but in domains that are inherently computable it lays a foundation for more trustworthy systems that provide guarantees about their outputs. It could also lead to new ways of composing code as reusable blocks, further extending the agent's capabilities.
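The size-based output handling described above can be sketched as follows. This is a minimal illustration under stated assumptions: the 2,000-character threshold, the `package_result` name, the JSON serialization, and the shape of the returned dictionary are all invented for the example and do not come from the article.

```python
import json
import tempfile
from pathlib import Path

MAX_INLINE_CHARS = 2_000  # assumed threshold; a real agent would tune this


def package_result(result, max_inline_chars=MAX_INLINE_CHARS, out_dir=None):
    """Return small results inline; write large ones to a file and return its path.

    `package_result` is a hypothetical helper, not an API from the article.
    """
    text = json.dumps(result, indent=2, default=str)
    if len(text) <= max_inline_chars:
        return {"kind": "inline", "data": text}

    # Large output: persist it and hand back a pointer plus a short preview.
    directory = Path(out_dir) if out_dir else Path(tempfile.mkdtemp(prefix="agent_out_"))
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "result.json"
    path.write_text(text, encoding="utf-8")
    return {"kind": "file", "path": str(path), "chars": len(text), "preview": text[:200]}
```

Returning a path and a preview instead of the full payload keeps a large result out of the agent's context while still letting later code open the file and continue working with it.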
