A multi-agent coding workflow

One agent can't
hold it all.
So we stopped
asking it to.

A workflow built on three honest tools — Solo to orchestrate, Claude Code to build, Mulch to remember. Here's how it actually works. (This site was built with it.)

Solo orchestrates · Claude Code executes · Mulch remembers

01 — The problem

The longer an agent works,
the worse it gets.

Pile every file, every decision, and every dead end into one context window and quality erodes — stale assumptions stick, attention thins, and yesterday's hard-won insight is gone by tomorrow. Bigger prompts don't fix this. Better structure does.

02 — Meet the cast

Three tools, one job each.

Clean separation of concerns, all the way down. No tool reaches into another's lane.

orchestrates

Solo

The conductor. An MCP server that lets one Claude plan and delegate but never touch a file. It writes the brief and keeps everyone in time.

executes

Claude Code

The workers. Each task runs as a fresh Claude Code agent that reads and writes the actual code. It carries one ticket, then powers down.

remembers

Mulch

The memory. A git-tracked store of expertise the workers read before they start and write to when they finish. A CLI — no server, no LLM.

Solo orchestrates · Claude Code executes · Mulch remembers

03 — The workflow loop

Four phases. A fresh worker for each.

Every task runs the same loop: research the ground truth, plan the approach, implement it, then hand it to an independent QA worker. No phase inherits another's baggage — each starts clean with exactly the context it needs.
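The loop above can be sketched in a few lines. This is a minimal illustration, not how Solo or Claude Code are implemented: the Worker class, the NEEDS table, and the choice of which earlier outputs each phase receives are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical mapping: which earlier outputs each phase is handed.
NEEDS = {
    "research": ("brief",),
    "plan": ("brief", "research"),
    "implement": ("brief", "plan"),
    "qa": ("brief", "implement"),
}


@dataclass
class Worker:
    """Stand-in for one fresh agent; it holds only what it is handed."""
    phase: str
    context: dict = field(default_factory=dict)

    def run(self) -> str:
        # A real worker would do the phase's work; this stub reports its inputs.
        seen = ", ".join(sorted(self.context))
        return f"{self.phase} done (saw: {seen})"


def run_task(brief: str) -> list[str]:
    """Run the four phases in order, building a new Worker for each one."""
    outputs = {"brief": brief}
    log = []
    for phase, needs in NEEDS.items():
        context = {key: outputs[key] for key in needs}  # scoped, not inherited
        result = Worker(phase, context).run()           # new object every phase
        outputs[phase] = result
        log.append(result)
    return log
```

Calling run_task("add login") returns one line per phase, and the QA line shows it saw only the brief and the implementation, never the research notes or the plan.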

04 — How it actually works

The orchestrator
stays at altitude.

Solo's orchestrator never edits code. It writes the brief into a to‑do, keeps shared context in a scratchpad, spawns a fresh agent to do the work, then sleeps on an idle timer until that agent reports back — no busy-waiting.

One agent, one task, never recycled, so stale context from one task can't bias the next.
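Here is a rough sketch of that hand-off, assuming a thread stands in for the spawned agent and plain dictionaries stand in for the to-do and the scratchpad. None of these names come from Solo; the point is only the shape: write the brief, spawn, block on a timer, read the report.

```python
import threading


def worker(todo: dict, scratchpad: dict, done: threading.Event) -> None:
    """Hypothetical worker: reads the brief, writes a report, then signals."""
    scratchpad["report"] = f"finished: {todo['brief']}"
    done.set()


def orchestrate(brief: str, idle_timeout: float = 5.0) -> str:
    todo = {"brief": brief}   # the brief lives in a to-do item
    scratchpad: dict = {}     # shared context both sides can read
    done = threading.Event()

    # A new thread per task stands in for spawning a fresh agent.
    agent = threading.Thread(target=worker, args=(todo, scratchpad, done))
    agent.start()

    # Block until the worker signals or the timer expires; no polling loop.
    if not done.wait(timeout=idle_timeout):
        return "timed out waiting for worker"
    agent.join()
    return scratchpad["report"]
```

The orchestrator function never touches the work itself. It only writes the brief, waits, and reads what came back.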

05 — Memory that compounds

Workers that learn —
and don't forget.

Before a worker starts, ml prime and ml search pull the expertise relevant to the task. When it finishes, it calls ml record to write down what it learned — a convention, a pattern, a failure to avoid. It all lives in git, so the next agent (and your teammates) start smarter.

Recording alone is a junk drawer; priming on the way in is what turns it into expertise that's actually used.

  • convention
  • pattern
  • failure
  • decision
  • reference
  • guide
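The read-before, write-after habit can be sketched with an in-memory stand-in. ExpertiseStore and run_worker below are hypothetical and are not Mulch's interface (Mulch is a CLI); the sketch reuses only the six record types listed above and a naive keyword match in place of real search.

```python
from dataclasses import dataclass

RECORD_TYPES = {"convention", "pattern", "failure", "decision", "reference", "guide"}


@dataclass(frozen=True)
class Record:
    kind: str
    text: str


class ExpertiseStore:
    """Hypothetical in-memory stand-in for the store; not Mulch's real interface."""

    def __init__(self) -> None:
        self._records: list[Record] = []

    def record(self, kind: str, text: str) -> None:
        if kind not in RECORD_TYPES:
            raise ValueError(f"unknown record type: {kind}")
        self._records.append(Record(kind, text))  # append only, never rewrite

    def search(self, query: str) -> list[Record]:
        # Naive keyword match, used only to keep the sketch short.
        words = query.lower().split()
        return [r for r in self._records if any(w in r.text.lower() for w in words)]


def run_worker(store: ExpertiseStore, task: str, lesson: tuple[str, str]) -> list[str]:
    primed = [r.text for r in store.search(task)]  # read before starting
    # ... the actual work would happen here ...
    store.record(*lesson)                          # write when finished
    return primed
```

The first worker on a topic starts with nothing. The second one, given a related task, is handed what the first wrote down.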

Three tiers (foundational, tactical, observational) govern shelf-life. ml search ranks results with BM25, and ml rank floats proven expertise up by how often it's been confirmed. Storage is append-only JSONL, git-merged without conflicts.
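For a feel of what BM25 ranking over JSONL records looks like, here is a small self-contained sketch. It is a textbook BM25 with common default parameters, not Mulch's code, and the "text" field name and whitespace tokenizer are assumptions for the example.

```python
import json
import math
from collections import Counter


def bm25_rank(jsonl: str, query: str, k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Return record texts ordered by BM25 score, best match first."""
    docs = [json.loads(line)["text"] for line in jsonl.splitlines() if line.strip()]
    if not docs:
        return []
    tokens = [d.lower().split() for d in docs]
    avg_len = sum(len(t) for t in tokens) / len(tokens)
    # Document frequency: in how many records each term appears.
    df = Counter(term for t in tokens for term in set(t))
    n = len(docs)

    def score(doc_tokens: list[str]) -> float:
        tf = Counter(doc_tokens)
        total = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rarer terms count for more (smoothed inverse document frequency).
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Longer-than-average records are penalised through b.
            norm = tf[term] + k1 * (1 - b + b * len(doc_tokens) / avg_len)
            total += idf * tf[term] * (k1 + 1) / norm
        return total

    scored = [(score(t), d) for t, d in zip(tokens, docs)]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for s, d in ranked if s > 0]
```

Each record is one JSON object per line, so adding expertise means appending a line rather than editing an existing one. Records that share no term with the query score zero and are left out of the result.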

06 — Why it's better

Smaller context, sharper output.

Tight context

Scoped tasks keep every agent's window small and on-point.

Parallel work

Independent tasks run side by side instead of in one queue.

Compounding expertise

Learnings accrue in Mulch instead of evaporating between sessions.

Repeatable

A coordinator that only coordinates can run the loop again without drifting.

07 — How quality stays high

Quality isn't a vibe.
It's the pipeline.

An independent QA worker checks the build with fresh eyes — it never wrote the code it's reviewing. Bugs get root-caused before they're fixed, not patched blind. And every fix that sticks gets recorded, so the same mistake doesn't come back.

Quality is a property of the process, not a final polish.

08 — The proof

We didn't just describe it.
We shipped this with it.

This page was researched, planned, built, and QA'd by exactly the loop above: an orchestrator dispatching fresh Solo agents, each one a Claude Code instance, with learnings recorded in Mulch. The proof is the thing you're reading.