Quit Emailing Yourself

Effective harnesses for long-running agents

7 min read | Saved February 14, 2026 | Copied!

ai 🤖 agents 🤖 coding 🤖 environment-management 🤖 testing 🤖

Do you care about this?

This article discusses challenges faced by AI agents when performing long tasks across multiple sessions without memory. It introduces a two-part solution using initializer and coding agents to ensure consistent progress, effective environment setup, and structured updates to maintain project integrity.

If you do, here's more

AI agents are increasingly tasked with complex, long-running projects, but they struggle with continuity between sessions. Each session starts fresh, without memory of prior work, making it hard for agents to maintain consistent progress. The Claude Agent SDK addresses this issue through a two-part solution: an initializer agent sets up the initial environment, while a coding agent focuses on incremental progress and leaves clear documentation for the next session.

The initializer agent creates essential files like an init.sh script and a progress log (claude-progress.txt) to track development. It also compiles a comprehensive list of features, marking them all as “failing” initially to guide future work. For instance, in a project to clone claude.ai, over 200 features were outlined, ensuring coding agents had a clear roadmap. Coding agents then work on one feature at a time, committing their changes to git with detailed messages and maintaining a clean code state, which reduces confusion in future sessions.

Testing remains a critical area of improvement. Without explicit prompts, Claude often marked features complete without thorough end-to-end testing. By integrating browser automation tools and instructing the model to test as a human would, the agents could better identify and fix bugs. Still, limitations in Claude’s capabilities, like the inability to see certain browser alerts, pose ongoing challenges.

Questions about this article

No questions yet.