
OpenClaw Multi-Agent Setup: Run Specialized Agents in Parallel

Step-by-step guide to setting up multiple specialized AI agents in OpenClaw. Learn orchestration patterns, session isolation, and real-world multi-agent architectures that scale.

Here’s what my OpenClaw setup looked like six months ago: one agent doing everything. Writing emails, searching the web, reviewing code, summarizing documents — all the same model, all in the same conversation context.

It worked, but it was slow, expensive, and wrong in subtle ways. A research-heavy task would accumulate 80,000 tokens of context. My coding agent would “remember” an email conversation from two days ago. Simple tasks waited behind complex ones in the queue.

Multi-agent setup fixes all of this. You run specialized agents in parallel — each one scoped to a specific job, isolated from the others, using exactly the model it needs.

This guide covers everything from the basic concepts to real-world architectures I’ve built and tested. By the end, you’ll have a working multi-agent setup that handles more work, costs less, and produces cleaner results.

I run my OpenClaw on xCloud — a managed hosting platform that handles the server side for you. All the configuration in this guide works regardless of where you host.


Table of Contents

  1. What Is a Multi-Agent Setup?
  2. Single Agent vs. Multi-Agent: What Changes
  3. The Four Agent Roles in OpenClaw
  4. Step-by-Step: Your First Multi-Agent Config
  5. Orchestration Patterns
  6. Session Isolation: Why It Matters
  7. Three Real-World Architectures
  8. Monitoring Multiple Agents
  9. Common Mistakes and How to Fix Them
  10. FAQ

What Is a Multi-Agent Setup?

A multi-agent setup runs several specialized AI agents from a single OpenClaw instance. Each agent has a defined role, its own model, isolated context, and specific tools. A main orchestrator routes tasks to the right agent and assembles the results.

A single-agent setup is one LLM trying to do everything in one context window. A multi-agent setup is more like a small team: a researcher, a writer, a coder, and a planner — each handling their lane, working in parallel, and handing off results.

The main orchestrator is the only agent users interact with directly. It decides whether to answer inline or delegate to a specialist. Sub-agents receive tasks, complete them, and return results to the orchestrator. The user sees the final output; the internal coordination happens transparently.


Single Agent vs. Multi-Agent: What Changes

Here’s a concrete comparison of the same workflow in both setups:

Task: Research a topic, write a summary, save it to a file, and send it via email.

| Step | Single Agent | Multi-Agent |
| --- | --- | --- |
| Research | Orchestrator does it | Research agent does it |
| Write summary | Same context, same model | Writing agent, isolated context |
| Save to file | Same agent | File agent handles it |
| Send email | Same agent | Communication agent sends it |
| Context after task | 80,000+ tokens accumulated | Each agent has ~5,000-15,000 tokens |
| Total API cost | $0.40-0.80 | $0.08-0.20 |
| Can tasks run in parallel? | No | Yes |

The cost difference comes from context isolation. In a single-agent setup, every step of the task adds to a growing conversation history. By step 4, the model is sending all of steps 1-3 as context again. In a multi-agent setup, each agent starts fresh with only what it needs.
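The compounding effect is easy to see with a quick back-of-the-envelope calculation. This is an illustrative sketch with made-up token counts, not OpenClaw internals:

```python
# Illustrative: input-context tokens re-sent per step when history
# accumulates, vs. when each step runs in an isolated session.
step_outputs = [20_000, 20_000, 20_000, 20_000]  # tokens produced per step

def single_agent_tokens(outputs):
    """Each step re-sends the full history of all prior steps as input."""
    total, history = 0, 0
    for out in outputs:
        total += history   # prior conversation re-sent as input context
        history += out     # this step's output joins the history
    return total

def multi_agent_tokens(outputs, handoff=5_000):
    """Each agent starts fresh with only a small task hand-off."""
    return handoff * len(outputs)

print(single_agent_tokens(step_outputs))  # 120000
print(multi_agent_tokens(step_outputs))   # 20000
```

The single-agent total grows quadratically with the number of steps; the isolated total grows linearly. That gap is where the 4-5x cost difference in the table comes from.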

According to Anthropic’s internal benchmarks on multi-agent architectures, specialized sub-agents outperform a single generalist model on domain-specific tasks by 23-38% on average. The improvement is even larger for tasks requiring long sequential reasoning chains.


The Four Agent Roles in OpenClaw

OpenClaw recognizes four agent roles. Understanding them before writing your config saves a lot of rework.

1. Orchestrator (Main Agent)

The primary agent — the one that receives user messages and decides what to do. It’s the only agent with access to your full conversation history. Keep this one lean: its job is routing and assembling results, not doing deep domain work.

Best model: Claude Sonnet — smart enough to reason about delegation, not wastefully expensive.

2. Sub-Agent (Specialist)

Spawned by the orchestrator to complete a specific task. Each sub-agent gets a scoped set of tools and a narrow system prompt. It runs, returns its result, and terminates. Sub-agents can run in parallel.

Best model: Depends on the task. Use Haiku for simple formatting or lookups. Use Sonnet for code review or complex summarization. Use Opus only when the task requires deep multi-step reasoning.

3. Background Agent (Cron)

Runs on a schedule without user interaction. Common uses: heartbeat checks, daily summaries, inbox monitoring, news digests. These agents should always use isolated sessions and cheap models — they’re high-frequency and low-stakes.

Best model: Gemini Flash or a free OpenRouter model for heartbeats. Haiku for substantive background work.

4. Tool Agent (Micro-Agent)

The smallest unit — a focused agent that wraps a single tool or API. Think: a “search agent” that just runs web searches and returns structured results, or a “calendar agent” that only reads and writes calendar events. These are called by other agents, never directly by users.

Best model: Haiku or GPT-4o-mini. Tool agents should be fast and cheap.


Step-by-Step: Your First Multi-Agent Config

Here’s how to go from a single-agent setup to a working multi-agent config. I’ll build this incrementally so each step is testable.

Step 1: Audit Your Current Usage

Before adding agents, find out what your single agent currently spends its time on. Run OpenClaw for a week with logging enabled, then check the logs:

grep "task_type" ~/.openclaw/logs/activity.log | sort | uniq -c | sort -rn

You’ll see a breakdown like:

  • 42% — web search and research
  • 28% — coding tasks
  • 19% — writing and formatting
  • 11% — file operations and admin

These percentages tell you what specialist agents to build first. Build agents for your top 2-3 task types.
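If you want the breakdown without chaining shell commands, the same count can be done in a few lines of Python. The `task_type=` key-value log format here is a hypothetical example; adjust the parsing to whatever your actual log lines look like:

```python
from collections import Counter

# Hypothetical log lines; the real OpenClaw log format may differ.
log_lines = [
    "task_type=research model=haiku tokens=1200",
    "task_type=coding model=sonnet tokens=3400",
    "task_type=research model=haiku tokens=900",
    "task_type=writing model=haiku tokens=700",
]

def task_breakdown(lines):
    """Return a task_type -> percentage map from key=value log lines."""
    counts = Counter(
        line.split("task_type=")[1].split()[0]
        for line in lines if "task_type=" in line
    )
    total = sum(counts.values())
    return {task: round(100 * n / total) for task, n in counts.items()}

print(task_breakdown(log_lines))  # {'research': 50, 'coding': 25, 'writing': 25}
```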

Step 2: Define Your Agent Roster

Start with a minimal roster. Add agents only when you have clear evidence they’re needed.

{
  "agents": {
    "orchestrator": {
      "model": "anthropic/claude-sonnet-4-6",
      "system_prompt": "You are a personal assistant that manages a team of specialists. For research tasks, delegate to @research. For coding tasks, delegate to @coder. For writing tasks, delegate to @writer. Answer simple questions directly without delegating.",
      "tools": ["spawn_agent", "read_file", "send_message"]
    },
    "research": {
      "model": "anthropic/claude-haiku-4-5-20251001",
      "system_prompt": "You are a research specialist. Search the web, read pages, and return structured summaries. Always cite your sources with URLs.",
      "tools": ["web_search", "read_url"],
      "session": "isolated"
    },
    "coder": {
      "model": "anthropic/claude-sonnet-4-6",
      "system_prompt": "You are a coding specialist. You read code, write code, run tests, and explain technical concepts. You do not send messages or browse the web.",
      "tools": ["read_file", "write_file", "run_command"],
      "session": "isolated"
    },
    "writer": {
      "model": "anthropic/claude-haiku-4-5-20251001",
      "system_prompt": "You are a writing specialist. You draft, edit, summarize, and format text. You do not run commands or browse the web.",
      "tools": ["read_file", "write_file"],
      "session": "isolated"
    }
  }
}

Key things to notice:

  • Each agent has "session": "isolated" — this is critical (more on this in the next section)
  • Each agent’s tools list is scoped — the writer can’t run commands, the coder can’t send messages
  • System prompts are narrow and specific

Step 3: Set Up Parallel Execution

By default, OpenClaw runs sub-agents sequentially. For independent tasks, enable parallel execution:

{
  "orchestration": {
    "parallel_agents": true,
    "max_concurrent": 3,
    "timeout_per_agent": 120
  }
}

With this config, the orchestrator can spawn up to 3 sub-agents simultaneously. Note that parallelism only helps when tasks are independent: three unrelated research queries that took 90 seconds back to back now finish in about 30, but a strictly sequential pipeline (research, then write, then format) still runs one step at a time because each step needs the previous step's output.
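OpenClaw handles the concurrency for you, but the underlying idea is ordinary async fan-out. A toy sketch in Python, with sleeps standing in for agent latency:

```python
import asyncio

# Toy model of fan-out: each "agent" is an async task with its own latency.
async def run_agent(name: str, seconds: float) -> str:
    await asyncio.sleep(seconds)
    return f"{name} done"

async def main():
    # Sequential execution would cost the SUM of the latencies;
    # gather() costs only the SLOWEST single agent.
    return await asyncio.gather(
        run_agent("research", 0.03),
        run_agent("writer", 0.02),
        run_agent("formatter", 0.01),
    )

print(asyncio.run(main()))  # ['research done', 'writer done', 'formatter done']
```

`gather` returns results in argument order regardless of which agent finishes first, which mirrors how an orchestrator can assemble parallel results deterministically.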

Step 4: Add Background Agents

Background agents run on schedules. Add them under the cron key:

{
  "cron": {
    "morning_digest": {
      "agent": "research",
      "prompt": "Summarize the top 5 news items relevant to AI development and software engineering from the last 24 hours. Format as a bullet list with one sentence per item.",
      "model": "google/gemini-2.0-flash:free",
      "schedule": "0 8 * * *",
      "session": "isolated",
      "output": "file:~/.openclaw/digests/{{date}}.md"
    },
    "heartbeat": {
      "prompt": "Reply with OK.",
      "model": "openrouter/google/gemini-2.0-flash:free",
      "schedule": "*/30 * * * *",
      "session": "isolated"
    }
  }
}

The morning digest uses the research agent’s tools and system prompt but overrides the model to the free Gemini Flash tier. The heartbeat is pure infrastructure — it uses a free model and a trivial prompt.

Step 5: Test Each Agent in Isolation

Before running the full multi-agent setup, test each specialist agent individually:

openclaw agent test research --prompt "What is the latest Claude model?"
openclaw agent test coder --prompt "Write a Python function that calculates Fibonacci numbers"
openclaw agent test writer --prompt "Summarize this in 3 sentences: [paste text]"

Fix any agent that returns wrong, unhelpful, or out-of-scope results before adding it to the orchestrator flow. It’s much easier to debug in isolation.


Orchestration Patterns

Once you have agents defined, you need to decide how they coordinate. There are three patterns I’ve found useful.

Pattern 1: Sequential Pipeline

Each agent’s output becomes the next agent’s input. Use this for tasks with clear stages.

User → Orchestrator → Research Agent → Writer Agent → Orchestrator → User

Example config for a “research and write” pipeline:

{
  "pipelines": {
    "research_and_write": {
      "steps": [
        {
          "agent": "research",
          "prompt": "Research: {{user_input}}"
        },
        {
          "agent": "writer",
          "prompt": "Write a 300-word summary based on this research: {{prev_output}}"
        }
      ]
    }
  }
}

Invoke it from the orchestrator’s system prompt: "For research + writing tasks, use the research_and_write pipeline."
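Under the hood, a sequential pipeline is just template substitution plus a loop: each step's output is spliced into the next step's prompt. A minimal sketch, with a stub in place of the real model call:

```python
# Minimal sketch of a sequential pipeline: each step's output feeds the
# next step's prompt via {{user_input}} / {{prev_output}} placeholders.
def run_pipeline(steps, user_input, call_agent):
    prev_output = ""
    for step in steps:
        prompt = (step["prompt"]
                  .replace("{{user_input}}", user_input)
                  .replace("{{prev_output}}", prev_output))
        prev_output = call_agent(step["agent"], prompt)
    return prev_output

# Stub agent caller for demonstration; a real one would hit the model API.
def fake_agent(name, prompt):
    return f"[{name}] handled: {prompt}"

steps = [
    {"agent": "research", "prompt": "Research: {{user_input}}"},
    {"agent": "writer", "prompt": "Summarize: {{prev_output}}"},
]
print(run_pipeline(steps, "agent orchestration", fake_agent))
```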

Pattern 2: Fan-Out (Parallel Analysis)

One task gets split across multiple agents simultaneously. Use this when you need multiple perspectives or parallel independent analyses.

Orchestrator → [Research Agent, Coder Agent] → Writer Agent → Orchestrator → User

Example: analyzing a GitHub PR. The research agent checks related issues while the coder agent reviews the diff; those two run in parallel. The writer agent then drafts the review comment from both outputs, and the orchestrator assembles the final result.

{
  "parallel_tasks": [
    {
      "agent": "research",
      "prompt": "Search for related GitHub issues for this PR: {{pr_url}}"
    },
    {
      "agent": "coder",
      "prompt": "Review this diff for bugs and style issues: {{diff}}"
    },
    {
      "agent": "writer",
      "prompt": "Draft a constructive PR review comment based on these inputs: {{research_output}} and {{coder_output}}"
    }
  ]
}

Pattern 3: Recursive Delegation

Agents spawn sub-agents of their own. Use this carefully — unbounded recursion is expensive and hard to debug.

A safe pattern: limit recursion depth to 2.

{
  "orchestration": {
    "max_depth": 2,
    "depth_exceeded_action": "return_partial_result"
  }
}

At depth 2, agents return whatever they have instead of spawning more sub-agents. This prevents runaway cascades.
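The depth guard itself is a simple invariant: an agent may delegate only while its depth is below the limit; past that, it must return whatever it has. A sketch of the idea (the `complex:` task encoding is invented for illustration):

```python
# Sketch of a recursion depth guard: delegate only while depth < MAX_DEPTH,
# otherwise return a partial result instead of spawning more sub-agents.
MAX_DEPTH = 2

def run_agent(task: str, depth: int = 0) -> str:
    if depth >= MAX_DEPTH:
        return f"partial result for {task!r} (depth limit hit)"
    if task.startswith("complex:"):
        # Delegate the inner task one level down.
        sub = run_agent(task.removeprefix("complex:"), depth + 1)
        return f"assembled({sub})"
    return f"done({task})"

print(run_agent("simple task"))
print(run_agent("complex:complex:complex:deep task"))
```

However deeply nested the task is, the call tree is bounded at MAX_DEPTH levels, so a badly-scoped task degrades to a partial answer instead of a runaway cascade.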


Session Isolation: Why It Matters

Session isolation means each sub-agent starts with a clean context window, uses only the tools and memory it needs for its task, and closes when it’s done. Without isolation, sub-agents inherit the full conversation history and compound your costs.

Here’s what happens without isolation:

  1. Orchestrator has 30,000 tokens of conversation history
  2. Orchestrator spawns research agent — the research agent inherits all 30,000 tokens
  3. Research agent completes its work (adds 10,000 more tokens)
  4. Orchestrator spawns writer agent — the writer agent inherits all 40,000 tokens

The writer agent has 40,000 tokens of context it doesn’t need and will never use. You’re paying for all of it.

With "session": "isolated":

  • Research agent gets: system prompt (2,000 tokens) + task prompt (500 tokens) = 2,500 tokens
  • Writer agent gets: system prompt (1,500 tokens) + research output (3,000 tokens) = 4,500 tokens

That’s a 10x reduction in context tokens for these two agents alone (7,000 vs. 70,000).

What to Share Between Agents

Even with isolation, agents sometimes need shared context. Use workspace files for this:

{
  "workspace": {
    "shared_files": [
      "~/.openclaw/context/user_profile.md",
      "~/.openclaw/context/project_facts.md"
    ]
  }
}

These files get injected into every agent’s context. Keep them under 1,000 tokens total — they’re included on every call.


Three Real-World Architectures

Here are three multi-agent setups I use daily.

Architecture 1: Personal Assistant

Four agents covering my main personal tasks:

orchestrator (Sonnet)
├── research (Haiku) — web search and reading
├── writer (Haiku) — drafts, summaries, formatting
├── file_manager (Haiku) — read/write files, organize
└── communicator (Haiku) — email and calendar

Background:
├── morning_digest (Gemini Flash Free) — 8am daily
└── heartbeat (Gemini Flash Free) — every 30 minutes

Monthly cost: ~$18-25 in API fees.

Architecture 2: Developer Workflow

Built for code-heavy work with deeper reasoning where it counts:

orchestrator (Sonnet)
├── planner (Sonnet) — architecture decisions, task breakdown
├── coder (Sonnet) — reads and writes code, runs tests
├── reviewer (Opus, on-demand) — only invoked for final PR review
├── debugger (Sonnet) — reads errors, searches docs, proposes fixes
└── documenter (Haiku) — writes docstrings, README sections, changelogs

The key design choice: Opus is invoked on-demand for PR reviews only — not as the default coder. This gives you the benefit of Opus’s deep reasoning at the point where it matters most (catching bugs before merge) without paying for it on routine tasks.

Monthly cost: ~$45-70 in API fees depending on PR volume.

Architecture 3: Content Pipeline

Built for high-volume content production:

orchestrator (Sonnet)
├── researcher (Haiku) — topic research, fact-checking
├── outliner (Haiku) — creates structured outlines from research
├── writer (Sonnet) — drafts sections from outlines (needs quality)
├── editor (Haiku) — grammar, clarity, consistency checks
└── formatter (Haiku) — markdown formatting, frontmatter, TOC generation

Background:
└── trend_monitor (Gemini Flash Free) — monitors RSS feeds, flags content opportunities

This pipeline runs in sequence for each content piece. Parallelism is used for the research phase when multiple angles are being investigated simultaneously.

Monthly cost: ~$30-50 for moderate volume (15-20 articles/month).


Monitoring Multiple Agents

With multiple agents running, you need more than just cost tracking — you need to know which agents are working correctly and which are producing poor outputs.

Per-Agent Logging

Enable structured logging per agent:

{
  "logging": {
    "level": "info",
    "per_agent": true,
    "log_path": "~/.openclaw/logs/agents/",
    "include": ["model", "input_tokens", "output_tokens", "duration_ms", "task_type"]
  }
}

Each agent gets its own log file: research.log, coder.log, etc. This makes it easy to track which agents are expensive and which are fast.
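Once you have per-agent logs with token counts, ranking agents by spend is a short script. The one-JSON-object-per-line format below is an assumption; adapt the field names to whatever your `include` list actually emits:

```python
import json

# Hypothetical structured log lines (one JSON object per agent call);
# the real OpenClaw log schema may differ.
log_lines = [
    '{"agent": "research", "input_tokens": 2500, "output_tokens": 900}',
    '{"agent": "coder", "input_tokens": 6000, "output_tokens": 2200}',
    '{"agent": "research", "input_tokens": 3100, "output_tokens": 1100}',
]

def tokens_by_agent(lines):
    """Sum input + output tokens per agent, most expensive first."""
    totals = {}
    for line in lines:
        rec = json.loads(line)
        used = rec["input_tokens"] + rec["output_tokens"]
        totals[rec["agent"]] = totals.get(rec["agent"], 0) + used
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

print(tokens_by_agent(log_lines))  # [('coder', 8200), ('research', 7600)]
```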

Quality Monitoring

A common trap with multi-agent setups: a sub-agent silently produces garbage output, the orchestrator assembles it into a response, and you only notice weeks later that the research agent has been returning hallucinated facts.

Add output validation to critical agents:

{
  "agents": {
    "research": {
      "output_validation": {
        "must_include": ["source_url"],
        "min_length": 100,
        "on_failure": "retry_once"
      }
    }
  }
}

The must_include check validates that the agent’s output contains a URL before it’s passed downstream. If it doesn’t, it retries once. If it fails again, the orchestrator gets an error signal and can either ask the user or skip the research step.
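The validate-retry-escalate loop is straightforward to reason about in code. A sketch of the logic, with a stub agent standing in for the real call:

```python
# Sketch of validation-and-retry: check the output, retry once on failure,
# then surface an error signal the orchestrator can act on.
def validate(output: str, must_include, min_length: int) -> bool:
    return (len(output) >= min_length
            and all(token in output for token in must_include))

def run_with_validation(call_agent, prompt, must_include, min_length=100):
    for attempt in range(2):  # initial call + one retry
        output = call_agent(prompt)
        if validate(output, must_include, min_length):
            return {"ok": True, "output": output}
    return {"ok": False, "error": "validation failed after retry"}

# Stub agent that always omits the required source_url field.
bad_agent = lambda prompt: "x" * 200
print(run_with_validation(bad_agent, "research X", ["source_url"]))
```

The important design point is the return shape: the orchestrator gets a structured `ok`/`error` signal rather than garbage output it would silently pass downstream.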

Budget Alerts Per Agent

Don’t just set a total monthly cap — set per-agent caps so you catch runaway agents before they drain your budget:

{
  "budget": {
    "monthly_total": 100,
    "per_agent_limits": {
      "orchestrator": 30,
      "research": 25,
      "coder": 30,
      "writer": 15
    },
    "alert_at": 0.8,
    "action_on_limit": "downgrade_model"
  }
}

If the research agent hits 80% of its $25 budget, OpenClaw sends you an alert. If it hits the cap, it downgrades to a cheaper model for the rest of the month instead of cutting off completely.
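The enforcement logic amounts to two thresholds per agent: alert at 80% of the cap, downgrade at 100%. A sketch of that policy (the fallback model choice and limits here are illustrative, not OpenClaw defaults):

```python
# Sketch of per-agent budget enforcement: alert at 80% of the cap,
# downgrade to a cheaper model at 100% instead of cutting off.
LIMITS = {"research": 25.0, "coder": 30.0}  # monthly USD caps per agent
CHEAP_FALLBACK = "anthropic/claude-haiku-4-5-20251001"

def pick_model(agent: str, spent: float, preferred: str) -> str:
    cap = LIMITS[agent]
    if spent >= cap:
        return CHEAP_FALLBACK  # hard cap: downgrade, don't cut off
    if spent >= 0.8 * cap:
        print(f"alert: {agent} at {spent / cap:.0%} of its budget")
    return preferred

print(pick_model("research", 21.0, "anthropic/claude-sonnet-4-6"))
print(pick_model("research", 26.0, "anthropic/claude-sonnet-4-6"))
```

Downgrading rather than blocking keeps background agents and pipelines running at reduced quality, which is usually the right failure mode for personal setups.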


Common Mistakes and How to Fix Them

Mistake 1: Too many agents too soon. Start with 2-3 agents. Every agent you add is a coordination surface that can fail. Add agents when you have clear evidence a specialist would perform better than the orchestrator handling a task directly.

Mistake 2: No session isolation. This single oversight can multiply your costs by 3-5x. Always set "session": "isolated" on sub-agents unless you have a specific reason to share context.

Mistake 3: Giving every agent every tool. A writer doesn’t need run_command. A researcher doesn’t need send_email. Narrow tool access prevents accidents and makes agent behavior more predictable.

Mistake 4: Over-engineering the orchestrator prompt. If your orchestrator system prompt is 2,000 words, it’s too complex. The orchestrator should route, not reason. Short, clear delegation rules work better than elaborate decision trees.

Mistake 5: Ignoring sub-agent failures. If a sub-agent returns an error or empty output, the orchestrator needs to know what to do. Define fallback_behavior for each agent:

{
  "agents": {
    "research": {
      "fallback_behavior": "return_empty_with_error_message",
      "max_retries": 1
    }
  }
}

FAQ

Q: How many agents can I run in parallel?
OpenClaw supports up to 10 concurrent sub-agents, but practical limits depend on your server memory and API rate limits. For most setups, 3-5 concurrent agents is the sweet spot. Beyond that, you’re more likely to hit rate limits than gain speed.

Q: Can sub-agents spawn their own sub-agents?
Yes, with max_depth set. The default is max_depth: 1 (no recursive delegation). Set it to 2 if you need agents-of-agents. Going deeper than 2 is rarely worth the complexity.

Q: What’s the difference between a background agent and a cron agent?
They’re the same thing — OpenClaw uses the terms interchangeably. Both run on a schedule, both should use isolated sessions. The cron key in your config is where you define them.

Q: Do sub-agents remember previous conversations?
With "session": "isolated", no. Each sub-agent call is stateless. If you need a sub-agent to reference prior outputs, pass the relevant context explicitly in the task prompt.

Q: How do I debug an agent that’s producing bad output?
Use openclaw agent test <agent_name> --verbose --prompt "...". The --verbose flag shows the full prompt (including system prompt and injected context) sent to the model, plus the raw response. This reveals whether the problem is in the system prompt, the injected context, or the model’s response.

Q: Can different agents use different API providers?
Yes. Mix Anthropic, OpenAI, and OpenRouter models freely:

{
  "agents": {
    "researcher": { "model": "openai/gpt-4o-mini" },
    "coder": { "model": "anthropic/claude-sonnet-4-6" },
    "writer": { "model": "openai/gpt-4o-mini" }
  }
}

Multi-Agent Setup Checklist

Before going live, run through this:

  • Each sub-agent has "session": "isolated"
  • Each agent’s tools list is scoped to what it actually needs
  • System prompts are under 500 words per agent
  • Parallel execution is enabled ("parallel_agents": true)
  • Background agents use free or cheap models
  • Per-agent budget caps are set
  • Output validation is configured for critical agents
  • Per-agent logging is enabled
  • You’ve tested each agent in isolation before combining
  • Fallback behavior is defined for every sub-agent

Start with 3 agents. Get them working well. Then add more.


Multi-agent setups are not more complex than a single agent — they’re just more explicit about what work goes where. That explicitness is the point. When you know exactly which agent does what, you can optimize, monitor, and fix each part independently.

The setups in this guide run reliably on xCloud’s managed OpenClaw hosting, which handles server uptime, SSL, and automatic backups. If you want the agents without the server ops, that’s the fastest way to start.


Looking to cut API costs for your multi-agent setup? Read OpenClaw Cost Optimization: How to Cut Your Monthly Bill by 90% — it covers model routing, context management, and caching strategies that apply directly to multi-agent configs.