Multi-agent orchestration · stable

Coordinate many agents, across many models.

Orchestration is a single-writer coordinator, run by the local Hydrate daemon, that drives a fleet of worker agents through a shared, typed blackboard. One interactive session authors the work and holds the human gates. Workers never message each other and never touch shared state directly: they communicate only through structured artifacts (plans, patches, reviews, verdicts) and through git. That structure is what turns a pile of agents into a process you can actually trust.
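As a rough illustration of the single-writer idea, the sketch below models a typed blackboard in Python. The class names (`Kind`, `Artifact`, `Coordinator`) and method names are hypothetical and are not Hydrate's actual API; the point is only that workers hand typed artifacts to one coordinator, and that coordinator is the only code path that writes.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    # The artifact types named in the text above.
    PLAN = "plan"
    PATCH = "patch"
    REVIEW = "review"
    VERDICT = "verdict"


@dataclass(frozen=True)
class Artifact:
    kind: Kind
    unit_id: str
    author: str  # which worker produced it (hypothetical field)
    body: str


class Coordinator:
    """Hypothetical single writer: workers never append to the board themselves."""

    def __init__(self) -> None:
        self._board: list[Artifact] = []

    def submit(self, artifact: Artifact) -> None:
        # Reject anything that is not one of the known artifact types.
        if not isinstance(artifact.kind, Kind):
            raise TypeError("untyped artifact rejected")
        self._board.append(artifact)

    def view(self, unit_id: str, kind: Kind) -> tuple[Artifact, ...]:
        # Workers get an immutable snapshot, never the underlying list.
        return tuple(
            a for a in self._board if a.unit_id == unit_id and a.kind == kind
        )
```

A reviewer in this sketch would call `view("u1", Kind.PATCH)` to read the implementer's patch and `submit(...)` to file its review; it has no handle on the board itself.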

Why not just let the agents talk?

Most "multi-agent" tools let agents chat to each other and hope coordination emerges. It doesn't: they lose track of who did what, and no agent can trust work it didn't watch happen. Hydrate takes a different route. The daemon owns all state, spawns the workers, enforces the rules, and drives the fleet to convergence. It rides on assets only Hydrate has: a shared cross-runtime memory layer, runtime hooks with a real completion signal, a local daemon, and a headless agent spawner already in the tree.

Three modes

Converge a document. A codebase. Or an image.

Design mode · proven

Converge a document through adversarial critique.

Design mode takes a written design (a spec, an RFC, or an architecture proposal) and converges it through rounds of adversarial critique.

  1. You start a session against a draft.
  2. Each round dispatches a critic from a different model family. The critic reads the draft plus the running objection register and files structured objections.
  3. Objections are tracked by content, not by who raised them, so the same concern cannot be relitigated under a new label. A contested objection raised unchanged twice is escalated for you to rule on, rather than looping forever.
  4. You rule on each material objection (accept or contest); the author revises; the next round validates that resolved objections stay resolved.
  5. When zero material objections remain and the trend is converging, the session moves to sign-off: your final human gate.

Hard round cap: 8. Output: a finalised design plus a full decision log of every objection and how it was resolved.
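One way to picture "tracked by content" is a register keyed on a hash of the normalised objection text. The sketch below is an assumption-laden illustration, not Hydrate's implementation: the names are hypothetical, whitespace and case normalisation is a crude stand-in for whatever content matching the real register uses, and it assumes "raised unchanged twice" means two re-raises after you have contested the objection.

```python
import hashlib
from dataclasses import dataclass

ROUND_CAP = 8        # hard round cap stated above
ESCALATE_AFTER = 2   # assumed reading of "raised unchanged twice"


def content_key(text: str) -> str:
    """Hash the normalised text so relabelling cannot mint a new objection."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


@dataclass
class Entry:
    text: str
    status: str = "open"  # open | accepted | contested | escalated
    reraised_after_contest: int = 0


class ObjectionRegister:
    def __init__(self) -> None:
        self.entries: dict[str, Entry] = {}

    def file(self, text: str) -> Entry:
        """Record an objection, folding duplicates into the existing entry."""
        key = content_key(text)
        entry = self.entries.get(key)
        if entry is None:
            entry = Entry(text)
            self.entries[key] = entry
            return entry
        if entry.status == "contested":
            entry.reraised_after_contest += 1
            if entry.reraised_after_contest >= ESCALATE_AFTER:
                entry.status = "escalated"  # goes to the human to rule on
        return entry

    def rule(self, text: str, ruling: str) -> None:
        """Human ruling on an objection: 'accepted' or 'contested'."""
        self.entries[content_key(text)].status = ruling

    def unresolved(self) -> int:
        return sum(
            1 for e in self.entries.values() if e.status in ("open", "escalated")
        )
```

In this sketch a critic that repeats a contested objection word for word does not reopen the debate; after the second repeat the entry flips to `escalated` and waits for a ruling.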

Develop mode · live

Parallel implementation across projects.

Develop mode takes a set of work units, across one or several repositories, and runs them to verified, reviewed, integration-ready code.

  1. You define targets (project, base branch, test command) and a set of work units.
  2. Each unit runs in its own isolated git worktree, so parallel implementers never collide.
  3. Per unit the pipeline is implement, review, judge. The implementer writes the patch; a reviewer from a different model family reads it in a read-only worktree and files objections; a judge scores it against a five-point rubric: plan adherence, every verification step green, scope containment, no regressions, no unresolved review objections.
  4. A failing review or verdict requeues the unit for another attempt, bounded by a per-unit cap so nothing loops indefinitely. A unit that cannot pass is handed to you, not silently dropped.
  5. When the units are verified, you open the integration gate. Verified units merge into per-target integration branches: never your main branch, never your working checkout. A senior audit pass then re-runs the tests and checks for regressions before the session reports done.

Output: per-target integration branches ready for your normal pull-request flow. The mode deliberately stops short of main; the merge decision stays yours.
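The implement, review, judge loop with its per-unit cap can be sketched as below. This is a minimal illustration under stated assumptions: the function and field names are hypothetical, the three roles are passed in as plain callables, and the cap of three attempts is an invented default, since the text does not give the real value.

```python
from dataclasses import dataclass
from typing import Callable

# The five rubric points named in the text, as hypothetical keys.
RUBRIC = (
    "plan_adherence",
    "verification_green",
    "scope_containment",
    "no_regressions",
    "no_unresolved_objections",
)


@dataclass
class Verdict:
    checks: dict[str, bool]

    @property
    def passed(self) -> bool:
        # Every rubric point must be explicitly true; a missing key fails.
        return all(self.checks.get(name, False) for name in RUBRIC)


def run_unit(
    implement: Callable[[int], str],
    review: Callable[[str], list[str]],
    judge: Callable[[str, list[str]], Verdict],
    max_attempts: int = 3,  # assumed cap, not a documented default
) -> str:
    """Return 'verified' on a passing verdict, 'escalated' once the cap is hit."""
    for attempt in range(1, max_attempts + 1):
        patch = implement(attempt)          # implementer writes the patch
        objections = review(patch)          # reviewer files objections
        verdict = judge(patch, objections)  # judge scores the rubric
        if verdict.passed:
            return "verified"
    return "escalated"  # handed to the human, never silently dropped
```

The loop has only two exits, so a unit either reaches `verified` or is returned to you as `escalated`; there is no path that drops it.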

Image mode · new

Generate images with a critic in the loop.

Image mode takes an image brief and runs it as a real choreography. The same cross-family check that guards code now guards pixels.

  1. You (the interactive head) author the image spec.
  2. A Codex generator renders the image into an output folder.
  3. A vision judge, a different model from the one that drew it, scores the result against your spec on prompt match, constraints, and visual artifacts.
  4. A passing image lands for your approval. A failing one is regenerated automatically up to a cap, then handed to you: accept, regenerate, or abandon.

Output: a generated image plus the spec, the generation, and the judge's per-criterion verdict, all recorded. The judge being strict is a feature: in our own smoke test it correctly rejected a gradient-shaded circle when the spec asked for flat vector, and the pipeline regenerated rather than shipping it.

Proof from inside the build: Develop mode's own specification was converged by Design mode. A real Codex critic, eight rounds, fourteen objections, human sign-off.

The fleet

Cross-family review is the moat.

The deliberate move is mixing model families by role: the agent that judges the work is not from the same family as the one that wrote it, so each catches the other's blind spots. By default Claude implements and Codex reviews and judges, with Fable standing in when Codex is unavailable.

Design mode

Role | Who | Runs
Author | You (Claude session) | interactive
Critic | Codex | one per round, sequential
Arbiter | You | sign-off and escalation gate

Develop mode

Role | Who | Runs
Orchestrator | You (Claude session) | interactive
Implementer | Claude (Sonnet) | one per unit, in parallel
Reviewer | Codex | per unit, read-only worktree
Judge | Codex | per unit, scores the rubric
Audit | Claude (Opus) | per target, after integration
Arbiter | You | integration and override gates

Image mode

Role | Who | Runs
Orchestrator | You (Claude session) | authors the image specs
Generator | Codex | renders the image into the output dir
Judge | Codex vision (independent of the generator) | scores the image against the spec
Arbiter | You | accept / regenerate / abandon gate

Parallelism is bounded by a spawn cap (two concurrent workers by default, ceiling four), so a fleet never runs away with your machine. Workers are hydrated with the relevant project's memory but walled off from every other target's context, so cross-project secrets never leak between units.
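A spawn cap of this shape is commonly built on a semaphore. The sketch below uses Python's `asyncio.Semaphore` to show the idea; `effective_cap` and `run_fleet` are hypothetical names, and only the default of two and the ceiling of four come from the text.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

DEFAULT_SPAWN_CAP = 2   # default from the text
SPAWN_CAP_CEILING = 4   # ceiling from the text


def effective_cap(requested: Optional[int] = None) -> int:
    """Clamp a requested cap into the range 1..SPAWN_CAP_CEILING."""
    if requested is None:
        return DEFAULT_SPAWN_CAP
    return max(1, min(requested, SPAWN_CAP_CEILING))


async def run_fleet(
    units: Sequence[str],
    worker: Callable[[str], Awaitable[str]],
    cap: Optional[int] = None,
) -> list[str]:
    """Run one worker per unit, never more than the cap at once."""
    slots = asyncio.Semaphore(effective_cap(cap))

    async def guarded(unit: str) -> str:
        async with slots:  # waits here while all slots are taken
            return await worker(unit)

    # gather returns results in the same order as the input units
    return list(await asyncio.gather(*(guarded(u) for u in units)))
```

With five units and the default cap, three of them simply wait for a slot; asking for a cap of ten in this sketch still yields four.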

This is what Hydrate's shared memory makes possible: a Claude implementer and a Codex reviewer can work the same task, with the same context, because the memory layer underneath them is common ground. The cross-family review is the moat, and it only works when the runtimes share a memory.

Versus the field

A group chat with extra steps.

Most multi-agent tooling is a group chat with extra steps. Hydrate treats coordination as infrastructure: a single coordinator, a shared memory, and hard rules. That is what makes the output trustworthy instead of merely plausible.

Dimension | Generic multi-agent frameworks | Hydrate orchestration
Coordination model | Agents message each other and hope alignment emerges | A single-writer coordinator drives workers through a typed blackboard; no agent-to-agent chat
State ownership | Shared and implicit; prone to races and lost updates | The local daemon is the sole writer; every transition is compare-and-set and idempotent
Model diversity | Usually one model family across all roles | Cross-family by design: Claude implements, Codex reviews and judges
Verification | Output trusted as-is, or the same model reviews itself | Independent review from a different family, then a judge scores against a fixed rubric before anything passes
Shared memory | Per-agent context or ephemeral scratchpads | One cross-runtime memory layer, so every worker shares the same ground truth
Workspace isolation | Shared working directory; agents collide | Each unit runs in its own hermetic git worktree
Safety to your repo | Agents often write straight to your tree | Verified work lands on integration branches only; never touches main, never your checkout
Termination guarantees | Loops and runaway fleets are common | Hard round caps, lease timeouts, and a spawn cap bound every run
Human control | Fire-and-forget; you read the wreckage after | Explicit human gates (design sign-off, integration approval, override) at the decision points
Failure handling | Silent drift; partial observability of who did what | A unit that can't pass escalates to you; nothing is silently dropped

The comparison is against the general pattern, not any single named framework.
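To make "compare-and-set and idempotent" concrete, here is a small in-memory sketch. The store, the state names, and the `request_id` parameter are all hypothetical; the sketch only shows the two properties: a transition applies only if the unit is still in the expected state, and replaying the same request changes nothing.

```python
class StaleState(Exception):
    """Raised when the unit is no longer in the state the caller expected."""


class UnitStore:
    def __init__(self) -> None:
        self._state: dict[str, str] = {}
        self._applied: set[str] = set()  # request ids already applied

    def create(self, unit_id: str, state: str = "queued") -> None:
        self._state[unit_id] = state

    def get(self, unit_id: str) -> str:
        return self._state[unit_id]

    def transition(
        self, unit_id: str, expected: str, new: str, request_id: str
    ) -> bool:
        """Move a unit from `expected` to `new`.

        Returns True if the state changed, False if this request_id was
        already applied (an idempotent replay).
        """
        if request_id in self._applied:
            return False
        current = self._state[unit_id]
        if current != expected:
            # Compare step failed: someone else moved the unit first.
            raise StaleState(f"{unit_id}: expected {expected!r}, found {current!r}")
        self._state[unit_id] = new
        self._applied.add(request_id)
        return True
```

A retried message in this sketch is harmless because the second call with the same `request_id` returns `False`, while a caller holding an out-of-date view gets `StaleState` instead of overwriting newer work.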

Image mode vs generic image tooling

Dimension | Generic image tooling | Hydrate Image mode
Quality control | You eyeball the output yourself | An independent vision judge scores every image against the spec before it ships
Failure handling | Regenerate by hand and re-check | Auto-regenerates on a failed verdict up to a cap, then escalates to you
Provenance | A prompt and a file | The spec, the generation, and the judge's per-criterion verdict are all recorded

The same engine that runs Design and Develop runs Image: you author a spec, a generator renders it, and a different model judges the result against the spec. A bad image gets caught and regenerated, not shipped.