Embracing the parallel coding agent lifestyle
5th October 2025
For a while now I’ve been hearing from engineers who run multiple coding agents at once—firing up several Claude Code or Codex CLI instances at the same time, sometimes in the same repo, sometimes against multiple checkouts or git worktrees.
I was pretty skeptical about this at first. AI-generated code needs to be reviewed, which means the natural bottleneck on all of this is how fast I can review the results. It’s tough keeping up with even a single LLM given how fast they can churn things out. Where’s the benefit in running more than one at a time if it just leaves me further behind?
Despite my misgivings, over the past few weeks I’ve noticed myself quietly starting to embrace the parallel coding agent lifestyle.
I can only focus on reviewing and landing one significant change at a time, but I’m finding an increasing number of tasks that can still be fired off in parallel without adding too much cognitive overhead to my primary work.
Here are some patterns I’ve found for applying parallel agents effectively.
Research for proof of concepts
The first category of tasks I’ve been applying this pattern to is research.
Research tasks answer questions or provide recommendations without producing any modifications to your project that you intend to keep.
A lot of software projects start with a proof of concept. Can Yjs be used to implement a simple collaborative note writing tool with a Python backend? The libraries exist, but do they work when you wire them together?
Today’s coding agents can build a proof of concept with new libraries and resolve those kinds of basic questions. Libraries too new to be in the training data? Doesn’t matter: tell them to check out the repos for those new dependencies and read the code to figure out how to use them.
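For the Yjs question above, the agent mostly needs to prove out a few lines of code. Here’s a minimal sketch of the kind of throwaway script I’d expect it to produce, assuming the pycrdt bindings for Yjs (the library choice and names here are illustrative, not a recommendation):

```python
# Proof of concept: can two Yjs documents sync in pure Python?
# Assumes the pycrdt bindings (pip install pycrdt).
from pycrdt import Doc, Text

# Client one creates a note.
doc1 = Doc()
doc1["notes"] = Text("Hello from client one")

# Simulate the backend sync step: ship doc1's state to client two.
doc2 = Doc()
doc2.apply_update(doc1.get_update())

# Client two should see the same text.
print(str(doc2.get("notes", type=Text)))
```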
How does that work again?
If you need a reminder about how a portion of your existing system works, modern “reasoning” LLMs can provide a detailed, actionable answer in just a minute or two.
It doesn’t matter how large your codebase is: coding agents are extremely effective with tools like grep and can follow codepaths through dozens of different files if they need to.
Ask them to make notes on where your signed cookies are set and read, or how your application uses subprocesses and threads, or which aspects of your JSON API aren’t yet covered by your documentation.
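For example, a research prompt along these lines (the specifics are illustrative):

```
Make notes on everywhere signed cookies are set and read in this
codebase. Include file paths, the relevant functions, and how the
signing key is configured.
```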
These LLM-generated explanations are worth stashing away somewhere, because they can make excellent context to paste into further prompts in the future.
Small maintenance tasks
Now we’re moving on to code edits that we intend to keep, albeit very low-stakes ones. It turns out there are a lot of problems that require just a little extra cognitive overhead, and that overhead can be outsourced to a bot.
Warnings are a great example. Is your test suite spitting out a warning that something you are using is deprecated? Chuck that at a bot—tell it to run the test suite and figure out how to fix the warning. No need to take a break from what you’re doing to resolve minor irritations like that.
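A classic example in Python land is the datetime.utcnow() deprecation that arrived in Python 3.12. The fix is mechanical, which is exactly why it’s a good candidate for a bot:

```python
from datetime import datetime, timezone

# Before: emits DeprecationWarning on Python 3.12+
# created_at = datetime.utcnow()

# After: the timezone-aware replacement the warning points to
created_at = datetime.now(timezone.utc)
```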
There is a definite knack to spotting opportunities like this. As always, the best way to develop that instinct is to try things—any small maintenance task is something that’s worth trying with a coding agent. You can learn from both their successes and their failures.
Carefully specified and directed actual work
Reviewing code that lands on your desk out of nowhere is a lot of work. First you have to derive the goals of the new implementation: what’s it trying to achieve? Is this something the project needs? Is the approach taken the best for this current project, given other future planned changes? A lot of big questions before you can even start digging into the details of the code.
Code that started from your own specification is a lot less effort to review. If you’ve already decided what to solve, picked the approach, and worked out a detailed specification for the work itself, confirming it was built to your needs takes a lot less time.
I described my more authoritarian approach to prompting models for code back in March. If I tell them exactly how to build something, reviewing the resulting changes is a whole lot less taxing.
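To show the level of detail I mean, a specification-style prompt might look something like this (the names and behavior here are hypothetical):

```
Add a function render_markdown(text: str) -> str to utils.py that
uses the markdown library with the fenced_code and tables
extensions enabled. Raise TypeError if text is not a string. Add
tests in tests/test_utils.py covering both extensions and the
error case.
```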
How I’m using these tools today
My daily drivers are currently Claude Code (on Sonnet 4.5), Codex CLI (on GPT-5-Codex), and Codex Cloud (for asynchronous tasks, frequently launched from my phone).
I’m also dabbling with GitHub Copilot Coding Agent (the agent baked into the GitHub.com web interface in various places) and Google Jules, a currently-free alternative to Codex Cloud.
I’m still settling into patterns that work for me. I imagine I’ll be iterating on my processes for a long time to come, especially as the landscape of coding agents continues to evolve.
I frequently have multiple terminal windows open running different coding agents in different directories. These are currently a mixture of Claude Code and Codex CLI, running in YOLO mode (no approvals) for tasks where I’m confident malicious instructions can’t sneak into the context.
(I need to start habitually running my local agents in Docker containers to further limit the blast radius if something goes wrong.)
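The shape of that is simple enough: run the agent inside a container that can only see the project directory. A rough sketch, with the image and paths as placeholders:

```sh
# Mount only the project; the agent can't touch the rest of the machine.
# Network stays on so the agent can still reach its model API.
docker run --rm -it \
  -v "$PWD":/workspace -w /workspace \
  python:3.12-slim bash
```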
I haven’t adopted git worktrees yet: if I want to run two agents in isolation against the same repo I do a fresh checkout, often into /tmp.
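A plain git clone of the working directory does the job (the target name is arbitrary); note it only carries committed changes:

```sh
git clone . /tmp/myproject-agent2
```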
For riskier tasks I’m currently using asynchronous coding agents—usually Codex Cloud—so if anything goes wrong the worst that can happen is my source code getting leaked (since I allow it to have network access while running). Most of what I work on is open source anyway so that’s not a big concern for me.
I occasionally use GitHub Codespaces to run VS Code’s agent mode, which is surprisingly effective and runs directly in my browser. This is particularly great for workshops and demos since it works for anyone with a GitHub account, no extra API key necessary.
Please share your patterns that work
This category of coding agent software is still really new, and the models have only really got good enough to drive them effectively in the past few months—Claude 4 and GPT-5 in particular.
I plan to write more as I figure out the ways of using them that are most effective. I encourage other practitioners to do the same!