Learning Agent Orchestration from the Design of OpenClaw and Claude Code
OpenClaw
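Subagent orchestration
OpenClaw has a built-in subagent mechanism. Its core is two tools, sessions_spawn and subagents, both of which live under src/agents/tools and are added to OpenClaw's tool set. In other words, OpenClaw's multi-agent orchestration is not a hard-coded, fixed supervisor-worker flow. The top-level agent is handed a set of tools and decides for itself whether to call sessions_spawn. Once it does, the current session temporarily acts as an orchestrator, and the spawned child session becomes either an orchestrator or a leaf depending on its depth.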
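An OpenClaw agent session has one of three roles:
- main: the top-level session;
- orchestrator: a middle layer that can still spawn child agents;
- leaf: a leaf node that cannot spawn any further.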
export const SUBAGENT_SESSION_ROLES = ["main", "orchestrator", "leaf"] as const;
export type SubagentSessionRole = (typeof SUBAGENT_SESSION_ROLES)[number];
export const SUBAGENT_CONTROL_SCOPES = ["children", "none"] as const;
export type SubagentControlScope = (typeof SUBAGENT_CONTROL_SCOPES)[number];
type SessionCapabilityEntry = {
  sessionId?: unknown; // allows looking up the entry by session id
  spawnDepth?: unknown; // which spawn level the session is at
  subagentRole?: unknown; // persisted role: main | orchestrator | leaf
  subagentControlScope?: unknown; // persisted control scope: children | none
};
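To make the depth-to-role idea concrete, here is a minimal sketch. Only the three role names and the two control scopes come from the source; `resolveCapabilities` and `maxSpawnDepth` are hypothetical names I made up for illustration, and the real OpenClaw logic may well differ.

```typescript
// Role names and control scopes as listed in the source above.
type Role = "main" | "orchestrator" | "leaf";
type ControlScope = "children" | "none";

interface Capabilities {
  role: Role;
  controlScope: ControlScope;
  canSpawn: boolean;
}

// Hypothetical helper: depth 0 is the top-level session, a session at the
// maximum depth becomes a leaf, and anything in between may keep spawning.
function resolveCapabilities(spawnDepth: number, maxSpawnDepth: number): Capabilities {
  if (spawnDepth <= 0) {
    return { role: "main", controlScope: "children", canSpawn: maxSpawnDepth > 0 };
  }
  if (spawnDepth < maxSpawnDepth) {
    return { role: "orchestrator", controlScope: "children", canSpawn: true };
  }
  return { role: "leaf", controlScope: "none", canSpawn: false };
}
```

With `maxSpawnDepth = 2`, depth 1 yields an orchestrator and depth 2 yields a leaf, which is the "orchestrator or leaf depending on depth" behavior described above.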
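sessions_spawn is the create/launch tool. It opens a new isolated session, which can be either an OpenClaw subagent or an ACP runtime session. Its parameters include task, agentId, runtime, thread, mode, model, and thinking, and it ends up calling spawnSubagentDirect() or spawnAcpDirect(), so in essence it spawns a new child session. subagents is the tool for managing subagents that already exist. It does not create anything; it only lists, kills, or steers subagents that are already running under the current requester session.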
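The split between "create" and "manage" can be sketched with a tiny in-memory registry. This is not OpenClaw's code: `SubagentRegistry`, `SubagentRun`, and the method names are hypothetical, and the only behavior taken from the text is that spawning always creates a new run while list/kill/steer only touch runs owned by the requester session.

```typescript
type RunStatus = "running" | "killed";

interface SubagentRun {
  id: string;
  requester: string; // session key of the parent that spawned this run
  task: string;
  status: RunStatus;
  steering: string[]; // follow-up instructions delivered via steer
}

class SubagentRegistry {
  private readonly runs = new Map<string, SubagentRun>();
  private nextId = 1;

  // Counterpart of a spawn tool: always creates a new, isolated run.
  spawn(requester: string, task: string): SubagentRun {
    const run: SubagentRun = {
      id: `run-${this.nextId++}`,
      requester,
      task,
      status: "running",
      steering: [],
    };
    this.runs.set(run.id, run);
    return run;
  }

  // Counterpart of a manage tool: never creates runs, and only sees runs
  // that belong to the calling requester session.
  list(requester: string): SubagentRun[] {
    return Array.from(this.runs.values()).filter((r) => r.requester === requester);
  }

  kill(requester: string, id: string): boolean {
    const run = this.owned(requester, id);
    if (!run || run.status !== "running") return false;
    run.status = "killed";
    return true;
  }

  steer(requester: string, id: string, message: string): boolean {
    const run = this.owned(requester, id);
    if (!run || run.status !== "running") return false;
    run.steering.push(message);
    return true;
  }

  private owned(requester: string, id: string): SubagentRun | undefined {
    const run = this.runs.get(id);
    return run && run.requester === requester ? run : undefined;
  }
}
```

Keeping the two responsibilities apart means a session that is allowed to manage its children does not automatically gain the ability to create new ones, and vice versa.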
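Whether a subagent gets started is decided by three layers together. In the end, it is the model that decides, based on the prompt and context, whether to call sessions_spawn:
- Whether the user asks explicitly: you can run /subagents spawn ... directly. In that case the LLM is not deciding at all; the user triggers the spawn.
- Whether the tool is available: the current run must actually have the sessions_spawn tool. OpenClaw injects it into the tool set, but if policy disables it, no amount of prompting will help.
- Whether the prompt encourages it: the main system prompt contains explicit guidance: If a task is more complex or takes longer, spawn a sub-agent.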
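The three layers above can be written down as a small decision function. This is only a sketch of the reasoning: `resolveSpawnTrigger` and the `SpawnDecisionInput` fields are hypothetical, and the real system does not necessarily centralize the decision like this.

```typescript
interface SpawnDecisionInput {
  userCommand?: string; // raw user input, e.g. "/subagents spawn ..."
  availableTools: string[]; // tools left after policy filtering
  modelRequestedTools: string[]; // tool names the model chose to call this turn
}

type SpawnTrigger = "user" | "model" | "none";

function resolveSpawnTrigger(input: SpawnDecisionInput): SpawnTrigger {
  // Layer 1: an explicit slash command bypasses the model's judgment.
  if (input.userCommand?.trim().startsWith("/subagents spawn")) return "user";
  // Layer 2: without the tool in the filtered set, the prompt cannot help.
  if (!input.availableTools.includes("sessions_spawn")) return "none";
  // Layer 3: the prompt only nudges; the model still has to emit the call.
  return input.modelRequestedTools.includes("sessions_spawn") ? "model" : "none";
}
```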
Subagent context
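In src/agents/subagent-spawn.ts:512, OpenClaw calls buildSubagentSystemPrompt() to generate an additional system prompt for the subagent, extraSystemPrompt. This prompt spells out that you are a subagent, who your parent is, what your role is, how to wrap up, whether you can spawn further subagents, that you should not busy-poll, and that results are auto-announced back to the parent. Further down, when the child session runs its own embedded agent, OpenClaw merges extraSystemPrompt into the full system prompt; in minimal mode it goes into the ## Subagent Context section. Because a subagent is a new, independent session, what it gets by default is: the new session's own context, the task description passed down by the parent, the subagent-specific system prompt passed down by the parent, and the bootstrap and skills context of the same workspace/agent environment.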
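A rough sketch of how such an extra prompt could be assembled is below. `buildSubagentContext` and its input fields are hypothetical stand-ins, not the real buildSubagentSystemPrompt(); the section titles follow the subagent prompt example shown later in this section, and the wording is heavily abbreviated.

```typescript
interface SubagentPromptInput {
  task: string;
  requesterSessionKey: string;
  childSessionKey: string;
  label?: string;
  canSpawn: boolean;
}

// Hypothetical builder: turns spawn-time facts into the text block that is
// appended to the child session's system prompt.
function buildSubagentContext(input: SubagentPromptInput): string {
  const spawning = input.canSpawn
    ? "You CAN spawn your own sub-agents."
    : "You are a leaf worker and CANNOT spawn further sub-agents.";
  const lines = [
    "# Subagent Context",
    "You are a **subagent** spawned for a specific task.",
    "## Your Role",
    `- You were created to handle: ${input.task}`,
    "## Sub-Agent Spawning",
    spawning,
    "## Session Context",
    ...(input.label ? [`- Label: ${input.label}`] : []),
    `- Requester session: ${input.requesterSessionKey}`,
    `- Your session: ${input.childSessionKey}`,
  ];
  return lines.join("\n");
}
```

The point of the design is that the child never sees the parent's conversation; everything it knows about its place in the tree arrives through this one block of text plus the task description.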
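Here is an example. Suppose we give OpenClaw this instruction: "Help me analyze the subagent-related implementation in openclaw and work out how the responsibilities of sessions_spawn and subagents differ; if needed, have a subagent read the relevant code in parallel and give me a summary."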
The main agent's prompt looks like this:
You are a personal assistant running inside OpenClaw.
## Tooling
Tool availability (filtered by policy):
- read: ...
- write: ...
- edit: ...
- apply_patch: ...
- exec: ...
- process: ...
- sessions_list: ...
- sessions_history: ...
- sessions_send: ...
- sessions_spawn: Spawn an isolated sub-agent session
- subagents: List, steer, or kill sub-agent runs
- session_status: ...
- browser: ...
- web_search: ...
- ...
TOOLS.md does not control tool availability; it is user guidance for how to use external tools.
For long waits, avoid rapid poll loops ...
If a task is more complex or takes longer, spawn a sub-agent. Completion is push-based: it will auto-announce when done.
Do not poll `subagents list` / `sessions_list` in a loop ...
## Tool Call Style
Default: do not narrate routine, low-risk tool calls ...
When a first-class tool exists for an action, use the tool directly ...
## Safety
...
## OpenClaw CLI Quick Reference
...
## Skills
[the skills prompt is inserted here]
## Workspace
Your working directory is: /.../workspace
Treat this directory as the single global workspace ...
## Current Date & Time
Time zone: ...
## Workspace Files (injected)
These user-editable files are loaded by OpenClaw ...
## Messaging
- Reply in current session ...
- Cross-session messaging → use sessions_send(sessionKey, message)
- Sub-agent orchestration → use subagents(action=list|steer|kill)
## Project Context
[the contents of AGENTS.md / SOUL.md / USER.md / other context files are injected here]
The subagent's prompt looks like this:
You are a personal assistant running inside OpenClaw.
## Tooling
Tool availability (filtered by policy):
- read: ...
- exec: ...
- process: ...
- sessions_spawn: ...
- subagents: ...
- session_status: ...
- ...
TOOLS.md does not control tool availability ...
For long waits, avoid rapid poll loops ...
If a task is more complex or takes longer, spawn a sub-agent. Completion is push-based: it will auto-announce when done.
Do not poll `subagents list` / `sessions_list` in a loop ...
## Tool Call Style
...
## Safety
...
## Workspace
Your working directory is: /.../workspace
## Current Date & Time
Time zone: ...
## Workspace Files (injected)
These user-editable files are loaded by OpenClaw ...
## Subagent Context
# Subagent Context
You are a **subagent** spawned by the main agent for a specific task.
## Your Role
- You were created to handle: Read `src/agents/tools/sessions-spawn-tool.ts`, `src/agents/tools/subagents-tool.ts`, and `src/agents/subagent-capabilities.ts`, and summarize how the three relate
- Complete this task. That's your entire purpose.
- You are NOT the main agent. Don't try to be.
## Rules
1. Stay focused
2. Complete the task
3. Don't initiate
4. Be ephemeral
5. Trust push-based completion
6. Recover from compacted/truncated tool output
## Output Format
- What you accomplished or found
- Any relevant details the main agent should know
- Keep it concise but informative
## What You DON'T Do
- NO user conversations
- NO pretending to be the main agent
- Only use the `message` tool when explicitly instructed ...
## Sub-Agent Spawning
You CAN spawn your own sub-agents ... // present only when the session can still spawn
or
You are a leaf worker and CANNOT spawn further sub-agents. // present when the session is a leaf
## Session Context
- Label: code-reader
- Requester session: agent:main:main
- Requester channel: discord
- Your session: agent:main:subagent:abcd-efgh
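Fallback strategy for failed tool calls
By this point it is clear that OpenClaw's agent orchestration is simple to the point of being blunt: it just lets the model call a tool to start a subagent. Tool calls will therefore inevitably fail sometimes. OpenClaw's handling of failed tool calls consists of the following pieces: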
pi-tools.before-tool-call.ts
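Before a tool call runs, OpenClaw does three things:
- Loop detection: when a sessionKey is present, it lazily loads the runtime dependencies and then runs detectToolCallLoop() against the current session's diagnostic state to judge whether the session is stuck in a tool-call loop.
- Running the plugin before_tool_call hook: if before_tool_call is registered on the global hook runner, it passes in the current toolName, params, runId, toolCallId, and session context. The hook can return one of two effects: block, which intercepts the call outright, or params, which rewrites the arguments.
- Recording the params actually executed and the outcome: params rewritten by the hook are stored in an in-memory Map keyed by runId + toolCallId. After the call succeeds or fails, it also calls recordLoopOutcome() to write the result back into the loop detection state.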
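The before-call stage can be sketched as one function that either blocks the call or returns the params to execute. Everything here is illustrative: `LoopDetector`, `runBeforeToolCall`, and the hook shape are hypothetical, the counting rule is a deliberately naive stand-in for detectToolCallLoop(), and blocking on a detected loop is my assumption (the source only says the loop is detected). The "hook crashes, warn and keep the original params" behavior follows the fallback chain described in this section.

```typescript
type Params = Record<string, unknown>;

// A hook may block the call, rewrite its params, or do nothing.
type HookOutcome = { block: true; reason: string } | { params: Params } | undefined;
type BeforeToolCallHook = (toolName: string, params: Params) => HookOutcome;

type BeforeResult =
  | { blocked: true; reason: string }
  | { blocked: false; params: Params };

// Hypothetical detector: counts identical tool + params calls and reports a
// loop once the same call has been seen more than `limit` times.
class LoopDetector {
  private readonly counts = new Map<string, number>();
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  isLooping(toolName: string, params: Params): boolean {
    const key = `${toolName}:${JSON.stringify(params)}`;
    const seen = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, seen);
    return seen > this.limit;
  }
}

function runBeforeToolCall(
  toolName: string,
  params: Params,
  detector: LoopDetector,
  hook?: BeforeToolCallHook,
): BeforeResult {
  // Step 1: loop detection (blocking on a loop is this sketch's choice).
  if (detector.isLooping(toolName, params)) {
    return { blocked: true, reason: `loop detected for ${toolName}` };
  }
  // Step 2: plugin hook; a crashing hook only warns and keeps the original params.
  let outcome: HookOutcome = undefined;
  try {
    outcome = hook?.(toolName, params);
  } catch (err) {
    console.warn(`before_tool_call hook failed: ${String(err)}`);
    outcome = undefined;
  }
  if (outcome && "block" in outcome) {
    return { blocked: true, reason: outcome.reason };
  }
  // Step 3: hand back the params that will actually be executed.
  return { blocked: false, params: outcome ? outcome.params : params };
}
```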
pi-tool-definition-adapter.ts
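This file adapts OpenClaw's internal tools into the ToolDefinition shape that pi-coding-agent expects; the entry point is toToolDefinitions(). Many internal tools do not strictly return a standard AgentToolResult, so after execution it calls normalizeToolExecutionResult(). The rules of this layer are:
- If the result already has content[], return it as a standard result;
- If it has no content[], force-wrap it as: content: [{ type: "text", text: ... }] details: ...;
- If the tool returns only a string, a number, or a plain object, that is also wrapped into a valid result.
If tool.execute() throws, it will:
- If the error is signal.aborted or an AbortError: rethrow directly, since this means the run was cancelled rather than that the tool failed;
- Otherwise, extract the error message/stack, write the debug stack and the error log, and return a structured result wrapped by jsonResult(...), of the form:
{
  "status": "error",
  "tool": "xxx",
  "error": "..."
}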
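Both halves of the adapter (normalizing odd return values, and turning exceptions into a structured error) fit in a short sketch. `normalizeResult` and `executeSafely` are hypothetical names, not the real normalizeToolExecutionResult() or jsonResult(); the sketch is synchronous for brevity and only mirrors the rules listed above.

```typescript
interface TextBlock {
  type: "text";
  text: string;
}

interface ToolResult {
  content: TextBlock[];
  details?: unknown;
}

function hasContentArray(value: unknown): value is ToolResult {
  return (
    typeof value === "object" &&
    value !== null &&
    Array.isArray((value as { content?: unknown }).content)
  );
}

// Whatever the tool returned, the caller always ends up with a content[] array.
function normalizeResult(raw: unknown): ToolResult {
  if (hasContentArray(raw)) return raw;
  const text = typeof raw === "string" ? raw : (JSON.stringify(raw) ?? String(raw));
  return { content: [{ type: "text", text }], details: raw };
}

// Runs a tool body and converts ordinary failures into a structured result.
function executeSafely(toolName: string, run: () => unknown): ToolResult {
  try {
    return normalizeResult(run());
  } catch (err) {
    // Cancellation is not a tool failure, so it propagates unchanged.
    if (err instanceof Error && err.name === "AbortError") throw err;
    const payload = {
      status: "error",
      tool: toolName,
      error: err instanceof Error ? err.message : String(err),
    };
    return { content: [{ type: "text", text: JSON.stringify(payload) }], details: payload };
  }
}
```

Because the failure comes back as an ordinary result rather than an exception, the model sees it in the next turn like any other tool output and can decide what to do about it.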
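Put together, the failure fallback chain is:
- The tool is first wrapped by wrapToolWithBeforeToolCallHook(); see src/agents/pi-tools.ts:609;
- The before_tool_call stage runs loop detection and the plugin hook first; see src/agents/pi-tools.before-tool-call.ts:89;
- If only the hook itself fails, a warning is logged and execution continues with the original params; see src/agents/pi-tools.before-tool-call.ts:186;
- When the tool actually runs, the adapter normalizes the return value into an AgentToolResult; see src/agents/pi-tool-definition-adapter.ts:78;
- If execution throws and it is not an abort, a structured result with status: "error" is returned; see src/agents/pi-tool-definition-adapter.ts:185;
- The LLM receives this error result and decides whether to retry, switch tools, or degrade gracefully.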
Claude Code
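Subagent orchestration
At its core, Claude Code is a single-agent loop. An ordinary subagent is created only when the main agent decides to delegate and emits a tool_use for AgentTool. The model outputs a tool_use(name=AgentTool, ...), query() hands the tool_use to runTools() for execution, and AgentTool.call() parses the arguments and decides whether this is a teammate, an ordinary subagent, or a fork subagent. An ordinary subagent ends up in runAgent(), which creates the subagent context and then calls query() again, so in essence the subagent runs the same query loop. If FORK_SUBAGENT is enabled, calling AgentTool without subagent_type takes the fork path and produces a forked subagent that inherits the parent's context. AgentTool lives in the tools/AgentTool directory.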

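Claude Code has three kinds of subagents:
- Ordinary subagent: the system prompt comes from agentDefinition.getSystemPrompt() and then goes through enhanceSystemPromptWithEnvDetails(), which adds environment details, the absolute-path requirement, the no-emoji rule, and similar notes. The initial user message it receives is simply the prompt string passed in by AgentTool:
promptMessages = [createUserMessage({
  content: prompt
})];
- Built-in subagent: each one defines its own system prompt independently. The Explore agent, for example, is defined in tools/AgentTool/built-in/exploreAgent.ts and is strictly read-only and search-oriented. Its definition, including the prompt, is: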
import { BASH_TOOL_NAME } from 'src/tools/BashTool/toolName.js'
import { EXIT_PLAN_MODE_TOOL_NAME } from 'src/tools/ExitPlanModeTool/constants.js'
import { FILE_EDIT_TOOL_NAME } from 'src/tools/FileEditTool/constants.js'
import { FILE_READ_TOOL_NAME } from 'src/tools/FileReadTool/prompt.js'
import { FILE_WRITE_TOOL_NAME } from 'src/tools/FileWriteTool/prompt.js'
import { GLOB_TOOL_NAME } from 'src/tools/GlobTool/prompt.js'
import { GREP_TOOL_NAME } from 'src/tools/GrepTool/prompt.js'
import { NOTEBOOK_EDIT_TOOL_NAME } from 'src/tools/NotebookEditTool/constants.js'
import { hasEmbeddedSearchTools } from 'src/utils/embeddedTools.js'
import { AGENT_TOOL_NAME } from '../constants.js'
import type { BuiltInAgentDefinition } from '../loadAgentsDir.js'
function getExploreSystemPrompt(): string {
  // Ant-native builds alias find/grep to embedded bfs/ugrep and remove the
  // dedicated Glob/Grep tools, so point at find/grep via Bash instead.
  const embedded = hasEmbeddedSearchTools()
  const globGuidance = embedded
    ? `- Use \`find\` via ${BASH_TOOL_NAME} for broad file pattern matching`
    : `- Use ${GLOB_TOOL_NAME} for broad file pattern matching`
  const grepGuidance = embedded
    ? `- Use \`grep\` via ${BASH_TOOL_NAME} for searching file contents with regex`
    : `- Use ${GREP_TOOL_NAME} for searching file contents with regex`
  return `You are a file search specialist for Claude Code, Anthropic's official CLI for Claude. You excel at thoroughly navigating and exploring codebases.
=== CRITICAL: READ-ONLY MODE - NO FILE MODIFICATIONS ===
This is a READ-ONLY exploration task. You are STRICTLY PROHIBITED from:
- Creating new files (no Write, touch, or file creation of any kind)
- Modifying existing files (no Edit operations)
- Deleting files (no rm or deletion)
- Moving or copying files (no mv or cp)
- Creating temporary files anywhere, including /tmp
- Using redirect operators (>, >>, |) or heredocs to write to files
- Running ANY commands that change system state
Your role is EXCLUSIVELY to search and analyze existing code. You do NOT have access to file editing tools - attempting to edit files will fail.
Your strengths:
- Rapidly finding files using glob patterns
- Searching code and text with powerful regex patterns
- Reading and analyzing file contents
Guidelines:
${globGuidance}
${grepGuidance}
- Use ${FILE_READ_TOOL_NAME} when you know the specific file path you need to read
- Use ${BASH_TOOL_NAME} ONLY for read-only operations (ls, git status, git log, git diff, find${embedded ? ', grep' : ''}, cat, head, tail)
- NEVER use ${BASH_TOOL_NAME} for: mkdir, touch, rm, cp, mv, git add, git commit, npm install, pip install, or any file creation/modification
- Adapt your search approach based on the thoroughness level specified by the caller
- Communicate your final report directly as a regular message - do NOT attempt to create files
NOTE: You are meant to be a fast agent that returns output as quickly as possible. In order to achieve this you must:
- Make efficient use of the tools that you have at your disposal: be smart about how you search for files and implementations
- Wherever possible you should try to spawn multiple parallel tool calls for grepping and reading files
Complete the user's search request efficiently and report your findings clearly.`
}
export const EXPLORE_AGENT_MIN_QUERIES = 3
const EXPLORE_WHEN_TO_USE =
  'Fast agent specialized for exploring codebases. Use this when you need to quickly find files by patterns (eg. "src/components/**/*.tsx"), search code for keywords (eg. "API endpoints"), or answer questions about the codebase (eg. "how do API endpoints work?"). When calling this agent, specify the desired thoroughness level: "quick" for basic searches, "medium" for moderate exploration, or "very thorough" for comprehensive analysis across multiple locations and naming conventions.'
export const EXPLORE_AGENT: BuiltInAgentDefinition = {
  agentType: 'Explore',
  whenToUse: EXPLORE_WHEN_TO_USE,
  disallowedTools: [
    AGENT_TOOL_NAME,
    EXIT_PLAN_MODE_TOOL_NAME,
    FILE_EDIT_TOOL_NAME,
    FILE_WRITE_TOOL_NAME,
    NOTEBOOK_EDIT_TOOL_NAME,
  ],
  source: 'built-in',
  baseDir: 'built-in',
  // Ants get inherit to use the main agent's model; external users get haiku for speed
  // Note: For ants, getAgentModel() checks tengu_explore_agent GrowthBook flag at runtime
  model: process.env.USER_TYPE === 'ant' ? 'inherit' : 'haiku',
  // Explore is a fast read-only search agent — it doesn't need commit/PR/lint
  // rules from CLAUDE.md. The main agent has full context and interprets results.
  omitClaudeMd: true,
  getSystemPrompt: () => getExploreSystemPrompt(),
}
An example of invoking this built-in subagent:
Main agent
  ↓ decides to delegate
Calls AgentTool
  ↓
Sets subagent_type = "Explore"
  ↓
Loads the EXPLORE_AGENT definition
  ↓
Calls getSystemPrompt()
  ↓
Gets the prompt
  ↓
Starts the subagent (Explore agent)
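- Fork subagent: it does not generate its own agent system prompt. Instead, it directly inherits the parent agent's already-rendered system prompt, then assembles the parent's assistant message, placeholder tool_results, and the current directive into the fork's input messages.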
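The difference between the ordinary path and the fork path comes down to where the system prompt and the initial messages come from. The sketch below is a simplification under stated assumptions: `planSubagent`, `ParentState`, and the string-only message content are hypothetical, the teammate path is left out because the source does not describe its trigger, and rejecting a missing subagent_type when fork is disabled is this sketch's choice rather than confirmed behavior.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

interface AgentDefinition {
  agentType: string;
  getSystemPrompt(): string;
}

interface ParentState {
  systemPrompt: string; // already rendered for the parent agent
  lastAssistantMessage: Message; // the message that carried the tool_use blocks
  pendingToolUseIds: string[]; // tool_use ids that still lack a result
}

interface SubagentStart {
  kind: "normal" | "fork";
  systemPrompt: string;
  messages: Message[];
}

function planSubagent(
  input: { prompt: string; subagent_type?: string },
  definitions: Map<string, AgentDefinition>,
  parent: ParentState,
  forkEnabled: boolean,
): SubagentStart {
  if (input.subagent_type === undefined) {
    if (!forkEnabled) throw new Error("subagent_type is required when fork is disabled");
    // Fork path: reuse the parent's rendered system prompt and replay its last
    // assistant turn, with placeholder results standing in for unfinished tools.
    const placeholders = parent.pendingToolUseIds
      .map((id) => `[tool_result ${id}: placeholder]`)
      .join("\n");
    const directive = placeholders ? `${placeholders}\n${input.prompt}` : input.prompt;
    return {
      kind: "fork",
      systemPrompt: parent.systemPrompt,
      messages: [parent.lastAssistantMessage, { role: "user", content: directive }],
    };
  }
  // Normal path: the definition supplies the system prompt and the prompt
  // string becomes the only initial user message.
  const definition = definitions.get(input.subagent_type);
  if (!definition) throw new Error(`unknown subagent_type: ${input.subagent_type}`);
  return {
    kind: "normal",
    systemPrompt: definition.getSystemPrompt(),
    messages: [{ role: "user", content: input.prompt }],
  };
}
```

An ordinary subagent starts from a clean slate with a purpose-built prompt, while a fork starts from the parent's state, which trades isolation for shared context.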
Fallback strategy for failed tool calls
toolExecution.ts
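Each tool is written with its own input validation logic. If validation fails, the error is logged, an error event is reported, and the error is wrapped in a tool_result and returned to the model.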
// Validate input values. Each tool has its own validation logic
const isValidCall = await tool.validateInput?.(
  parsedInput.data,
  toolUseContext,
)
if (isValidCall?.result === false) {
  logForDebugging(
    `${tool.name} tool validation error: ${isValidCall.message?.slice(0, 200)}`,
  )
  logEvent('tengu_tool_use_error', {
    messageID:
      messageId as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
    toolName: sanitizeToolNameForAnalytics(tool.name),
    error:
      isValidCall.message as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
    errorCode: isValidCall.errorCode,
    isMcp: tool.isMcp ?? false,
    queryChainId: toolUseContext.queryTracking
      ?.chainId as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
    queryDepth: toolUseContext.queryTracking?.depth,
    ...(mcpServerType && {
      mcpServerType:
        mcpServerType as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
    }),
    ...(mcpServerBaseUrl && {
      mcpServerBaseUrl:
        mcpServerBaseUrl as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
    }),
    ...(requestId && {
      requestId:
        requestId as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
    }),
    ...mcpToolDetailsForAnalytics(tool.name, mcpServerType, mcpServerBaseUrl),
  })
  return [
    {
      message: createUserMessage({
        content: [
          {
            type: 'tool_result',
            content: `<tool_use_error>${isValidCall.message}</tool_use_error>`,
            is_error: true,
            tool_use_id: toolUseID,
          },
        ],
        toolUseResult: `Error: ${isValidCall.message}`,
        sourceToolAssistantUUID: assistantMessage.uuid,
      }),
    },
  ]
}
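When tool.call() itself fails at execution time, the error is formatted uniformly and the failure hooks are run. A runtime exception does not kill the main loop outright. Instead, the error content is formatted first, MCP auth errors get special handling (the client state is set to needs-auth), the PostToolUseFailure hooks run, and a tool_result is still returned to the model in the end, so the model can see why the call failed and decide what to do next.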
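Here is a compact sketch of that "never throw past the loop" shape: validation failures and runtime failures both end up as an error tool_result. `runToolUse`, `SimpleTool`, and `FailureHook` are hypothetical names, the sketch is synchronous, the MCP auth branch is omitted, and reusing the `<tool_use_error>` wrapper for runtime errors is an assumption (the source only shows it for validation errors).

```typescript
interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error: boolean;
}

interface SimpleTool {
  name: string;
  validate?(input: unknown): { ok: true } | { ok: false; message: string };
  call(input: unknown): string;
}

// Hypothetical stand-in for hooks that run after a tool fails.
type FailureHook = (toolName: string, error: string) => void;

function runToolUse(
  tool: SimpleTool,
  input: unknown,
  toolUseId: string,
  failureHooks: FailureHook[] = [],
): ToolResultBlock {
  const fail = (message: string): ToolResultBlock => ({
    type: "tool_result",
    tool_use_id: toolUseId,
    content: `<tool_use_error>${message}</tool_use_error>`,
    is_error: true,
  });

  // Validation failure: report it to the model without calling the tool.
  const validation = tool.validate?.(input);
  if (validation !== undefined && validation.ok === false) {
    return fail(validation.message);
  }

  try {
    return { type: "tool_result", tool_use_id: toolUseId, content: tool.call(input), is_error: false };
  } catch (err) {
    // Runtime failure: notify the hooks, then still answer with a tool_result.
    const message = err instanceof Error ? err.message : String(err);
    for (const hook of failureHooks) hook(tool.name, message);
    return fail(message);
  }
}
```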
query.ts
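Lines 995-1185 of the code handle errors: they log the error and emit analytics events, route image errors through dedicated handling, fill in any missing tool_result, and surface the real error to the user.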
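The "fill in any missing tool_result" step is worth a sketch, because a conversation in which a tool_use has no matching tool_result cannot be sent back to the model. `backfillMissingResults` is a hypothetical helper written for illustration, not the code in query.ts.

```typescript
interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
}

interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error: boolean;
}

// For every tool_use that never received a result (for example because the
// run errored midway), synthesize an error result so the pairs stay complete.
function backfillMissingResults(
  toolUses: ToolUseBlock[],
  results: ToolResultBlock[],
  reason: string,
): ToolResultBlock[] {
  const answered = new Set(results.map((r) => r.tool_use_id));
  const missing = toolUses
    .filter((use) => !answered.has(use.id))
    .map((use): ToolResultBlock => ({
      type: "tool_result",
      tool_use_id: use.id,
      content: `<tool_use_error>${reason}</tool_use_error>`,
      is_error: true,
    }));
  return results.concat(missing);
}
```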
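Summary
As we can see, neither OpenClaw nor Claude Code adopts a centralized multi-agent architecture; both treat the subagent as a tool, an idea well worth learning from. Many open-source projects today still use a Router / Planner / Specialist Agent architecture. Anyone who has developed with LangGraph will have felt this: the strength of that architecture is strong controllability, but the weakness is just as obvious. Context is very hard to manage, and the contexts of each subagent and of the orchestrator are hard to coordinate. My takeaway from dissecting the two projects is that a subagent can exist as a skill or a prompt and be loaded on demand. Of course, controllability remains the most critical issue in putting agents into production, so to get there, the prompt can specify where a subagent should be started.
If you see things differently, feel free to discuss and add to this. Let's learn together!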
Author: zcry
Link: https://www.cnblogs.com/zcry/p/19831663
License: this work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.5 China Mainland license.
Posted 2026-04-08 15:27 by ZCry