What Agent Teams Are
One Claude Code session is good. Five working together on the same project is different.
Agent Teams just shipped with Opus 4.6. You spin up multiple Claude Code instances, each with its own context window, and they coordinate autonomously—messaging each other, claiming tasks, and reporting back to a lead agent.
Here's how it works and when it's actually worth the token cost.
What Agent Teams Are
A team has three parts:
- Team Lead: Your main Claude Code session. It creates the team, spawns teammates, assigns tasks, and synthesizes results.
- Teammates: Separate Claude Code instances. Each one has its own context window, loads project context (`CLAUDE.md`, MCP servers, skills), and works independently.
- Shared Task List: A central list of work items. Tasks have three states: pending, in progress, and completed. Tasks can depend on other tasks; blocked work unblocks automatically when dependencies finish.
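The dependency rule can be sketched as a toy model. The `Task` and `TaskList` classes below are illustrative Python, not Claude Code's actual data structures:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    deps: list[str] = field(default_factory=list)  # tasks this one waits on
    state: str = "pending"  # pending | in progress | completed

class TaskList:
    """Toy shared task list: pending tasks become claimable once deps finish."""
    def __init__(self) -> None:
        self.tasks: dict[str, Task] = {}

    def add(self, task: Task) -> None:
        self.tasks[task.name] = task

    def claimable(self) -> list[Task]:
        # a pending task is claimable once every dependency is completed
        return [t for t in self.tasks.values()
                if t.state == "pending"
                and all(self.tasks[d].state == "completed" for d in t.deps)]

    def complete(self, name: str) -> None:
        self.tasks[name].state = "completed"

team = TaskList()
team.add(Task("schema"))
team.add(Task("api", deps=["schema"]))     # blocked until "schema" finishes
print([t.name for t in team.claimable()])  # only "schema" is claimable so far
team.complete("schema")                    # finishing it unblocks "api"
print([t.name for t in team.claimable()])
```

The point is the automatic unblocking: nobody has to hand out the next task, teammates just claim whatever has no unfinished dependencies.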
The key difference from subagents: teammates talk to each other.
A subagent reports back to the main agent and that's it. Agent team members message each other directly, challenge each other's findings, and self-coordinate. This is true agentic orchestration.
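To make the distinction concrete, here is a toy Python model of peer messaging. Everything in it, the `Teammate` class and the shared mailboxes, is invented for illustration and says nothing about Claude Code's internals:

```python
import queue

class Teammate:
    """Toy peer: can message any other teammate directly, no lead in the loop."""
    def __init__(self, name: str, inbox: dict[str, queue.Queue]):
        self.name = name
        self.inbox = inbox  # shared mailboxes, one queue per teammate

    def send(self, to: str, text: str) -> None:
        self.inbox[to].put((self.name, text))

    def read(self) -> list[tuple[str, str]]:
        msgs = []
        while not self.inbox[self.name].empty():
            msgs.append(self.inbox[self.name].get())
        return msgs

inbox = {"alice": queue.Queue(), "bob": queue.Queue()}
alice = Teammate("alice", inbox)
bob = Teammate("bob", inbox)

alice.send("bob", "I think the bug is in the JWT refresh path.")
bob.send("alice", "Disagree: the token is valid, look at session storage.")

print(bob.read())  # bob sees alice's claim directly and can push back on it
```

A subagent, in this model, would only ever `send` to the caller; the peer-to-peer mailboxes are what make debate possible.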
Comparison & Setup
Agent Teams vs Subagents: When to Use Which
This is the decision that matters. Wrong choice = wasted tokens. Avoid the Agent Swarm Trap.
| Feature | Subagents | Agent Teams |
|---|---|---|
| Context | Own window, results return to caller | Own window, fully independent |
| Communication | Reports back to main agent only | Teammates message each other directly |
| Coordination | Main agent manages everything | Shared task list, self-coordination |
| Token Cost | Lower—results summarized back | Higher—each teammate is a full Claude instance |
| Best For | Focused tasks where only the result matters | Complex work requiring discussion and collaboration |
Use subagents when: You need quick, focused workers that report back. "Go research X and tell me what you find."
Use agent teams when: Workers need to share findings, challenge each other, and coordinate on their own. "Investigate this bug from three angles and debate which theory is correct."
Setup: 2 Minutes
Agent Teams are experimental and disabled by default. One setting to flip:
Option 1: settings.json
```json
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}
```
Option 2: Environment Variable
```bash
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
```
That's it. Now tell Claude to create a team in natural language:
```
I'm refactoring the auth module. Create an agent team:
- One teammate on the backend JWT logic
- One on the frontend session handling
- One writing integration tests
```
Claude spawns the team, creates a shared task list, and starts coordinating.
Modes, Use Cases & Tips
Display Modes
Two options for how you see your team:
- In-process (default): All teammates run inside your terminal. Use `Shift+Up/Down` to select a teammate. Press `Enter` to view their session, `Escape` to interrupt. Works everywhere.
- Split panes: Each teammate gets its own terminal pane. You see everyone's output simultaneously. Requires `tmux` or `iTerm2`.
Set it in settings.json:
```json
{
  "teammateMode": "in-process"
}
```
Or per session: `claude --teammate-mode in-process`.
Split panes look cooler but in-process works in any terminal. Start there.
The Best Use Cases
After reading through the docs and the feature's capabilities, these are the setups that make the token cost worthwhile. Similar to how I structured my Claude Code Architecture, this is about scaling leverage.
1. Parallel Code Review
A single reviewer gravitates toward one type of issue; three reviewers with different lenses catch more. Create an agent team to review PR #142:
- Security reviewer: token handling, input validation, auth flows.
- Performance reviewer: N+1 queries, memory leaks, unnecessary renders.
- Test reviewer: coverage gaps, edge cases, flaky test patterns.
The lead synthesizes everything.
2. Debugging with Competing Hypotheses
This is the killer use case. Single agents find one plausible explanation and stop looking. Multiple agents arguing with each other find the right explanation.
Spawn five teammates to investigate different hypotheses, have them message each other to try to disprove one another's theories, and update the findings with whatever consensus emerges. Parallel investigation with debate surfaces the strongest theory.
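A concrete version of that prompt might look like this; the bug and the hypothesis list are placeholders for your own situation:

```
Our checkout endpoint intermittently returns 500s under load. Create an
agent team with 5 teammates, each investigating one hypothesis: race
condition, connection-pool exhaustion, stale cache, a recent deploy, or
bad input data. Message each other and try to disprove each other's
theories. Report back only the theory that survives.
```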
3. Multi-Module Feature Work
When a feature spans frontend, backend, and tests—each teammate owns a different layer. No file conflicts, no stepping on each other.
- Teammate 1: Backend API endpoints and database schema.
- Teammate 2: Frontend components and state management.
- Teammate 3: E2E tests and integration tests.
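A spawn prompt enforcing that ownership split might look like the following; the feature and directory names are placeholders, not a convention Claude Code requires:

```
I'm building the CSV export feature. Create an agent team:
- Teammate 1: backend. Owns src/api/ and migrations/ only.
- Teammate 2: frontend. Owns src/components/ and src/store/ only.
- Teammate 3: tests. Owns tests/e2e/ only.
No one edits files outside their own directories.
```

The explicit ownership lines double as a guard against the file-conflict problem covered in the tips below.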
Pro Tips
- Require plan approval for risky tasks: Teammates work in read-only plan mode until the lead approves their approach.
- Use delegate mode: When the lead starts coding instead of coordinating, press `Shift+Tab` to lock it into orchestration-only mode (spawning, messaging, task management). No touching code.
- Give teammates specific context: They load `CLAUDE.md` automatically but don't inherit the lead's conversation history. Put task-specific details in the spawn prompt.
- Avoid file conflicts: Two teammates editing the same file means overwrites. Break work so each teammate owns different files.
Limitations & Verdict
The Honest Limitations
This is experimental. Know what you're getting into:
- No session resumption: `/resume` and `/rewind` don't restore in-process teammates. If you resume, the lead may try to message teammates that no longer exist.
- Token cost is real: A 5-person team burns roughly 5x the tokens of a single session. For routine tasks, this isn't worth it.
- One team per session: Clean up the current team before starting a new one. Teammates can't spawn their own teams.
- Split panes need tmux/iTerm2: Doesn't work in VS Code integrated terminal.
Cleanup
When you're done, tell the lead: "Clean up the team."
The lead checks for active teammates and fails if any are still running. Shut them down first: "Ask the researcher teammate to shut down."
Always clean up through the lead. If a tmux session persists: `tmux kill-session -t <session-name>`.
The Verdict
Agent Teams are the most interesting feature in the Opus 4.6 release—and the most expensive.
For code reviews, adversarial debugging, and multi-module features: the parallel exploration genuinely finds things a single agent misses. The competing hypotheses pattern for debugging alone is worth learning.
For sequential tasks or same-file edits: stick with subagents or a single session. The overhead isn't justified.
We are moving away from static flows; this is another proof point for why Static Middleware is Dead. Start with a read-only task, like a code review, before you commit to parallelized implementation work.