docs(agent): add Phase 4 Agent Runtime design document and design decision record

2026-06-09 22:32:59 +08:00
parent 63c50e1fc7
commit 336920554a
2 changed files with 982 additions and 0 deletions
@@ -0,0 +1,647 @@
+# Agent Runtime Design
+
+> Design date: 2026-06-09
+> Status: pending implementation
+> Related documents:
+> - `docs/note-agent-runtime-design.md`: design decision record (interface signatures, file list, decision rationale)
+> - `docs/note-agent-harness-references.md`: survey of reference projects (OpenClaw / Hermes / OpenHuman / OpenHarness)
+> - `docs/6-memory-system.md`: Phase 3 design
+> - `docs/5-tool-system.md`: Phase 2 design
+> - `docs/roadmap.md`: overall project roadmap
+
+---
+
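+## 1. Background and Goals
+
+### 1.1 Background
+
+AG Core has delivered four phases: Phase 0 (LLM call cycle), Phase 1 (prompt engineering), Phase 2 (tool system), and Phase 3 (memory system). At the end of Phase 2, `LlmCycle::submit_with_tools()` implemented the single-pass loop "LLM decision → tool execution → result fed back"; in Phase 3, `ConversationMemory` / `KnowledgeStore` / `MemoryRetriever` provided a complete memory abstraction.
+
+What is still missing is an **integration layer** that assembles the Phase 0-3 capabilities and exposes the concept of an "agent" to upper-level applications.
+
+### 1.2 Goals
+
+Phase 4 aims to provide a **thin glue layer plus a set of trait abstractions**, so that upper-level applications can build agent behaviors such as multi-turn dialogue and task planning on top of AG Core. Specifically:
+
+- **`Agent` trait**: the "role" abstraction of an agent (not bound to a session)
+- **`AgentSession` struct**: the "session" instance of an agent (bound to a session_id plus state)
+- **`TaskAgent` trait**: the "plan/execute" abstraction for task-oriented agents
+- **`RuntimeBundle`**: an explicit dependency-injection container that centralizes provider/registry/hook/memory and other dependencies
+- **`AgentBuilder`**: the chained construction entry point
+- **`AgentError`**: a unified error type that aggregates LlmError / ToolError / MemoryError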
+
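+### 1.3 Design Principles
+
+Phase 4 strictly follows the principles below; every scoping decision is derived from them:
+
+| Principle | Meaning | Consequence |
+|------|------|------|
+| **Minimal scope** | AG Core is a lib crate, not a product; it does not implement business loops | Expose only traits plus a minimal reference impl |
+| **Thin glue layer** | Do not rewrite in L1 what already works | Reuse existing APIs such as `LlmCycle::submit_with_tools` |
+| **Dependency injection** | All runtime dependencies are packaged and passed explicitly | Adopt the OpenHarness `RuntimeBundle` pattern |
+| **Entity/session separation** | One role can be reused by many sessions | Two-layer model: `Agent` + `AgentSession` |
+| **Weakly coupled memory** | Memory is a "passive capability" and is not embedded in the loop | `memory_store: Option<Arc<dyn MemoryStore>>` as an optional reference (an `Option`, not `std::sync::Weak`) |
+| **Injectable business logic** | Plan decomposition is a business capability and is not implemented in the core library | Expose the `PlanParser` trait; the upper layer injects an implementation |
+| **Borrow, don't copy** | None of the 4 reference projects is a Rust library (three are Python/TypeScript; OpenHuman is Rust + Tauri but is a desktop application) | Take only the architectural patterns, not the implementation details |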
+
+### 1.4 Relationship to the Completed Phases
+
+```
+Phase 0  (L0/L1)  ──  LlmProvider / LlmCycle / Hook / Stream / Compact
+Phase 1  (L2)     ──  PromptTemplate / PromptComposer
+Phase 2  (L1)     ──  ToolRegistry / BaseTool / PermissionChecker / McpClient
+Phase 3  (L2)     ──  MemoryStore / ConversationMemory / KnowledgeStore / MemoryRetriever
+                        ↑
+                        │ reuses
+                        │
+Phase 4  (L1→L2)  ──  Agent trait + AgentSession + TaskAgent + RuntimeBundle (glue layer)
+                        ↓
+App layer (L4)    ──  upper-level crates / binaries / Gateway (out of Phase 4 scope)
+```
+
+For a detailed architecture comparison, see `docs/note-agent-harness-references.md` §3-5.
+
+## 2. Requirements Analysis
+
+### 2.1 Functional Requirements
+
+| ID | Requirement | Priority | Notes |
+|----|------|--------|------|
+| F1 | `Agent` trait abstraction | P0 | Role definition: name / system_prompt / tool set |
+| F2 | `AgentSession` session instance | P0 | Binds session_id, bundle, turn_index, cost_so_far |
+| F3 | Minimal reference impl of `submit_turn()` | P0 | Assemble LlmCycle → submit → accumulate cost; about 30 lines |
+| F4 | `TaskAgent::run(goal)` autonomous entry point | P0 | Uses the LLM internally to derive a Plan, then calls `execute_plan` |
+| F5 | `TaskAgent::execute_plan(plan)` externally driven entry point | P0 | The user supplies a predefined Plan, executed step by step |
+| F6 | `Plan` / `Step` / `StepStatus` data structures | P0 | State machine with Pending / Running / Completed / Failed / Skipped |
+| F7 | `PlanParser` trait + `JsonPlanParser` reference implementation | P0 | Injected; replaceable by the upper layer |
+| F8 | `RuntimeBundle` dependency-injection container | P0 | Aggregates provider/registry/hook/memory/retriever/config |
+| F9 | `AgentBuilder` chained construction | P0 | Builds a `RuntimeBundle`; when a retriever is present it is automatically registered as a tool |
+| F10 | `AgentError` unified error type | P0 | Aggregates LlmError / ToolError / MemoryError, with `is_recoverable()` |
+| F11 | Hook event extension: OnTurnStart / OnTurnEnd / OnPlanStepComplete | P0 | Append 3 events and 2 context fields in `llm/hooks.rs` |
+| F12 | 2-3 smoke tests | P0 | Traits can be assembled / RuntimeBundle can be constructed / `submit_turn` runs against a mock |
+| F13 | `lib.rs` exports `pub mod agent;` | P0 | One line |
+| F14 | Design document (this file) + decision record | P0 | Done |
+| F15 | Flip the Roadmap status | P0 | After implementation is complete |
+
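+### 2.2 Non-Functional Requirements
+
+| ID | Requirement | Notes |
+|----|------|------|
+| NF1 | No new external dependencies | Use only what Phase 0-3 already depend on: `async-trait` / `serde` / `thiserror` / `tokio`, etc. |
+| NF2 | Complete error model | `AgentError` aggregates lower-layer errors and classifies them via `is_recoverable()` |
+| NF3 | Thread safety | All public types satisfy `Send + Sync` |
+| NF4 | Async first | Every API that performs IO is `async` |
+| NF5 | Modularity | Each component is independently replaceable, following the "trait abstraction + lightweight default implementation" convention |
+| NF6 | Doc comments | Every public API must carry `///` doc comments |
+| NF7 | Builder pattern | Complex configuration goes through a chained builder |
+| NF8 | Explicit dependencies | No module-level global state; all dependencies are injected via parameters or the bundle |
+| NF9 | No breakage of existing APIs | The public API of Phase 0-3 is left untouched; the `hooks.rs` extension only appends variants and fields (compatible) |
+| NF10 | Minimal test coverage | At least 1 smoke test for the core traits; at least 1 mock test for `submit_turn`; integration tests are not required |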
+
+## 3. Design
+
+### 3.1 Overall Architecture
+
+```
+┌──────────────────────────────────────────────────────────────────────┐
+│               Application layer (out of Phase 4 scope)               │
+│   ┌────────────┐   ┌────────────┐   ┌────────────┐   ┌────────────┐  │
+│   │ CLI Agent  │   │ Feishu Bot │   │ Web Service│   │  TUI App   │  │
+│   └─────┬──────┘   └─────┬──────┘   └─────┬──────┘   └─────┬──────┘  │
+└─────────┼────────────────┼────────────────┼────────────────┼─────────┘
+          │                │                │                │
+          └────────────────┴────────────────┴────────────────┘
+                                    │
+                                    ▼
+┌──────────────────────────────────────────────────────────────────────┐
+│                       Agent Runtime (Phase 4)                        │
+│                                                                      │
+│   ┌────────────────┐         ┌──────────────────┐                    │
+│   │  Agent trait   │ 1 ─── * │  AgentSession    │                    │
+│   │  (role)        │         │  (session)       │                    │
+│   └────────────────┘         └──────┬───────────┘                    │
+│                                     │ Arc<...>                       │
+│                                     ▼                                │
+│                              ┌──────────────────┐                    │
+│                              │  RuntimeBundle   │                    │
+│                              │  - provider      │                    │
+│                              │  - tool_registry │                    │
+│                              │  - hook_executor │                    │
+│                              │  - memory_store? │ ◄─ optional        │
+│                              │  - retriever?    │ ◄─ optional        │
+│                              │  - config        │                    │
+│                              └──────┬───────────┘                    │
+│                                     │ on new(), if retriever is set  │
+│                                     ▼                                │
+│                              ┌──────────────────┐                    │
+│                              │ "retrieve" tool  │ ◄─ auto-registered │
+│                              └──────────────────┘                    │
+│                                                                      │
+│   ┌────────────────┐   ┌──────────────────┐   ┌──────────────────┐   │
+│   │ TaskAgent trait│   │ Plan/Step/Status │   │ PlanParser trait │   │
+│   │  run()         │   │  state machine   │   │ JsonPlanParser   │   │
+│   │  execute_plan()│   │                  │   │ (ref impl, ~20 L)│   │
+│   └────────────────┘   └──────────────────┘   └──────────────────┘   │
+│                                                                      │
+│   ┌────────────────┐   ┌──────────────────┐                          │
+│   │  AgentError    │   │  AgentBuilder    │                          │
+│   │  (aggregate)   │   │  (chained build) │                          │
+│   └────────────────┘   └──────────────────┘                          │
+└──────────────────────────────────────────────────────────────────────┘
+                                    │
+                                    ▼ reuses
+┌──────────────────────────────────────────────────────────────────────┐
+│               LLM / Tool / Prompt / Memory (Phase 0-3)               │
+│   LlmCycle / ProviderRegistry / ToolRegistry / PermissionChecker /   │
+│   HookExecutor / StreamEvents / CompactConfig /                      │
+│   PromptTemplate / PromptComposer /                                  │
+│   MemoryStore / ConversationMemory / KnowledgeStore / MemoryRetriever│
+└──────────────────────────────────────────────────────────────────────┘
+```
+
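+### 3.2 Interface Design
+
+Detailed interface signatures are in `docs/note-agent-runtime-design.md` §4; this section explains the design intent.
+
+#### 3.2.1 `Agent` trait
+
+```rust
+pub trait Agent: Send + Sync {
+    fn name(&self) -> &str;
+    fn system_prompt(&self) -> Option<&str>;
+    /// Lists the tool definitions this Agent wants to expose to the LLM.
+    /// Intended default implementation: take everything from
+    /// RuntimeBundle.tool_registry (the most common case).
+    /// Implementors can override it to apply a whitelist or filter.
+    fn tool_definitions(&self, bundle: &RuntimeBundle) -> Vec<ToolDefinition>;
+}
+```
+
+**Design intent**:
+- `name` / `system_prompt` are the metadata an LLM call requires
+- `tool_definitions` takes the full set from the bundle by default, so **an Agent can whitelist tools without modifying the bundle**; this aligns with Hermes's "Skill exposure" mechanism
+- The trait does not force `submit_turn`: `submit_turn` is a method of `AgentSession` and should not be tied to the role definition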
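The whitelist idea can be illustrated with a minimal, std-only sketch. `ToolDefinition` and `RuntimeBundle` below are deliberately simplified stand-ins (a bare name and a flat tool list), and `FullAgent`, `ReadOnlyAgent`, and `sample_bundle` are hypothetical names introduced only for this illustration; the real types live in Phase 2 and in the design record.

```rust
/// Simplified stand-in for the real `ToolDefinition` (assumption: only a name).
#[derive(Clone, Debug, PartialEq)]
pub struct ToolDefinition {
    pub name: String,
}

/// Simplified stand-in for `RuntimeBundle`: a flat list replaces `tool_registry`.
pub struct RuntimeBundle {
    pub tools: Vec<ToolDefinition>,
}

pub trait Agent: Send + Sync {
    fn name(&self) -> &str;
    fn system_prompt(&self) -> Option<&str>;
    /// Default behaviour: expose every tool found in the bundle.
    fn tool_definitions(&self, bundle: &RuntimeBundle) -> Vec<ToolDefinition> {
        bundle.tools.clone()
    }
}

/// Hypothetical agent that keeps the default (all tools).
pub struct FullAgent;

impl Agent for FullAgent {
    fn name(&self) -> &str {
        "full"
    }
    fn system_prompt(&self) -> Option<&str> {
        None
    }
}

/// Hypothetical agent that only exposes whitelisted tools.
pub struct ReadOnlyAgent {
    allowed: Vec<&'static str>,
}

impl ReadOnlyAgent {
    pub fn new(allowed: Vec<&'static str>) -> Self {
        Self { allowed }
    }
}

impl Agent for ReadOnlyAgent {
    fn name(&self) -> &str {
        "read-only"
    }
    fn system_prompt(&self) -> Option<&str> {
        Some("You may only read data.")
    }
    /// Override: filter the bundle's tools without mutating the bundle.
    fn tool_definitions(&self, bundle: &RuntimeBundle) -> Vec<ToolDefinition> {
        bundle
            .tools
            .iter()
            .filter(|t| self.allowed.iter().any(|a| *a == t.name))
            .cloned()
            .collect()
    }
}

/// Builds a bundle with three made-up tool names for the example.
pub fn sample_bundle() -> RuntimeBundle {
    let tools = ["retrieve", "write_file", "shell"]
        .iter()
        .map(|n| ToolDefinition { name: n.to_string() })
        .collect();
    RuntimeBundle { tools }
}
```

Both agents read the same bundle; only `ReadOnlyAgent` narrows what the LLM would see, which is the point of keeping the filter on the role rather than on the bundle.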
+
+#### 3.2.2 `RuntimeBundle`
+
+```rust
+pub struct RuntimeBundle {
+    pub provider: Arc<dyn LlmProvider>,
+    pub tool_registry: Arc<ToolRegistry>,
+    pub hook_executor: Arc<HookExecutor>,
+    pub memory_store: Option<Arc<dyn MemoryStore>>,   // optional, weakly coupled (not std::sync::Weak)
+    pub retriever: Option<Arc<MemoryRetriever>>,       // optional, weakly coupled (not std::sync::Weak)
+    pub config: AgentConfig,
+}
+```
+
+**Design intent**:
+- All runtime dependencies are **packaged explicitly** (OpenHarness style)
+- `memory_store` / `retriever` are both `Option`: the upper-level application **can run without them** (memory-less mode)
+- When `retriever` is present, `RuntimeBundle::new()` automatically registers a tool named `"retrieve"` (concretely: a `RetrieveTool` wrapper added to the `ToolRegistry`), so the LLM can **actively** invoke retrieval during a conversation
+- `config` centralizes all tunable parameters (max_turns, max_tool_turns, session_ttl, compact_config)
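The conditional registration described above might look roughly like this. It is a sketch under stated assumptions: `ToolRegistry` is reduced to a list of names, `MemoryRetriever` is an empty stand-in, and `new` takes the registry by value so it can be mutated before being wrapped in `Arc` (the real constructor signature is defined in the design record, not here).

```rust
use std::sync::Arc;

/// Stand-in registry that only tracks tool names (assumption).
#[derive(Default)]
pub struct ToolRegistry {
    names: Vec<String>,
}

impl ToolRegistry {
    /// Registers a tool name once; repeated registration is a no-op.
    pub fn register(&mut self, name: &str) {
        if !self.contains(name) {
            self.names.push(name.to_string());
        }
    }

    pub fn contains(&self, name: &str) -> bool {
        self.names.iter().any(|n| n == name)
    }
}

/// Empty stand-in for the Phase 3 retriever (assumption).
pub struct MemoryRetriever;

/// Reduced bundle: only the two fields relevant to this example.
pub struct RuntimeBundle {
    pub tool_registry: Arc<ToolRegistry>,
    pub retriever: Option<Arc<MemoryRetriever>>,
}

impl RuntimeBundle {
    /// Registers the "retrieve" tool only when a retriever was supplied,
    /// then freezes the registry behind an `Arc`.
    pub fn new(mut registry: ToolRegistry, retriever: Option<Arc<MemoryRetriever>>) -> Self {
        if retriever.is_some() {
            registry.register("retrieve");
        }
        Self {
            tool_registry: Arc::new(registry),
            retriever,
        }
    }
}
```

Passing `None` yields the memory-less mode: the bundle is still valid and the LLM simply never sees a `"retrieve"` tool.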
+
+#### 3.2.3 `AgentSession` and the Minimal Reference Impl
+
+```rust
+pub struct AgentSession {
+    pub session_id: String,
+    pub agent_name: String,
+    bundle: Arc<RuntimeBundle>,
+    turn_index: u32,
+    cost_so_far: CostTracker,
+}
+
+impl AgentSession {
+    /// Minimal reference impl (about 30 lines):
+    /// 1. Fire the OnTurnStart hook
+    /// 2. Assemble an LlmCycle (inject system_prompt + message history + tool definitions)
+    /// 3. Run one conversation turn via submit_with_tools()
+    /// 4. Accumulate cost
+    /// 5. Fire the OnTurnEnd hook
+    /// 6. turn_index += 1
+    /// 7. Return the ChatResponse
+    /// Does not write back to memory (handled by an independent task in the upper layer)
+    pub async fn submit_turn(
+        &mut self,
+        user_input: impl Into<String>,
+    ) -> Result<ChatResponse, AgentError>;
+}
+```
+
+**Design intent**:
+- The "minimal reference impl" only demonstrates the **most common** conversation scenario
+- Business loops (multi-turn strategy, error retry, timing of memory write-back) are decided by the upper-level application or by a concrete `TaskAgent` implementation
+- `submit_turn` does not hold a `ConversationMemory`: the upper-level application can create its own `ConversationMemory` and call `add_message` at a suitable moment (for example in an OnTurnEnd hook)
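The ordering of the seven steps can be traced with a small synchronous sketch. Everything here is a stand-in: the real `submit_turn` is `async` and goes through `LlmCycle`, whereas this version uses a hypothetical `Provider` trait, a `u64` in place of `CostTracker`, `String` in place of `AgentError`, and records fired hooks in a `Vec` instead of calling `HookExecutor`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HookEvent {
    OnTurnStart,
    OnTurnEnd,
}

/// Stand-in response carrying a text and a made-up cost figure.
pub struct ChatResponse {
    pub text: String,
    pub cost: u64,
}

/// Hypothetical synchronous provider used instead of `LlmCycle`.
pub trait Provider {
    fn complete(&self, input: &str) -> Result<ChatResponse, String>;
}

/// Mock provider: echoes the input and charges one unit per byte.
pub struct EchoProvider;

impl Provider for EchoProvider {
    fn complete(&self, input: &str) -> Result<ChatResponse, String> {
        Ok(ChatResponse {
            text: format!("echo: {}", input),
            cost: input.len() as u64,
        })
    }
}

pub struct AgentSession<P: Provider> {
    provider: P,
    pub turn_index: u32,
    pub cost_so_far: u64,
    /// Hook events recorded together with the turn they belong to.
    pub fired: Vec<(HookEvent, u32)>,
}

impl<P: Provider> AgentSession<P> {
    pub fn new(provider: P) -> Self {
        Self { provider, turn_index: 0, cost_so_far: 0, fired: Vec::new() }
    }

    pub fn submit_turn(&mut self, user_input: impl Into<String>) -> Result<ChatResponse, String> {
        let input = user_input.into();
        // Step 1: OnTurnStart.
        self.fired.push((HookEvent::OnTurnStart, self.turn_index));
        // Steps 2-3: one provider call; an error returns early, and in this
        // sketch OnTurnEnd is then not fired (a choice the source leaves open).
        let response = self.provider.complete(&input)?;
        // Step 4: accumulate cost.
        self.cost_so_far += response.cost;
        // Step 5: OnTurnEnd, still tagged with the current turn.
        self.fired.push((HookEvent::OnTurnEnd, self.turn_index));
        // Step 6: advance the turn counter.
        self.turn_index += 1;
        // Step 7: hand the response back; no memory write-back happens here.
        Ok(response)
    }
}
```

Note that the session never touches memory: anything that needs the conversation persisted would observe the `OnTurnEnd` event instead.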
+
+#### 3.2.4 `TaskAgent` + `Plan` / `Step`
+
+```rust
+pub struct Plan {
+    pub id: String,
+    pub goal: String,
+    pub steps: Vec<Step>,
+}
+
+pub struct Step {
+    pub index: usize,
+    pub description: String,
+    pub status: StepStatus,
+}
+
+pub enum StepStatus {
+    Pending,
+    Running,
+    Completed(ChatResponse),
+    Failed(AgentError),
+    Skipped,
+}
+```
+
+**Design intent**:
+- `StepStatus` is an enum rather than a plain bool, which makes it easier for upper-level UIs to display and aggregate
+- State transitions: `Pending → Running → (Completed | Failed | Skipped)`, one-way with no rollback (retry means the upper layer creates a new Plan)
+- `Plan` / `Step` are intentionally kept simple: no advanced fields such as `dependencies` / `parallel_group` (to be reconsidered in v0.3+)
+
+#### 3.2.5 `PlanParser` trait + `JsonPlanParser` Reference Implementation
+
+```rust
+#[async_trait]
+pub trait PlanParser: Send + Sync {
+    async fn parse(&self, raw: &str, goal: &str) -> Result<Plan, AgentError>;
+}
+
+pub struct JsonPlanParser;
+#[async_trait]
+impl PlanParser for JsonPlanParser {
+    /// Expects the LLM to output JSON text of the form:
+    /// {"steps": [{"description": "..."}, ...]}
+    /// Returns AgentError::PlanParse when parsing fails.
+    async fn parse(&self, raw: &str, goal: &str) -> Result<Plan, AgentError> { /* ... */ }
+}
+```
+
+**Design intent**:
+- **Injected**: the upper-level application can inject its own `PlanParser` (for example one based on XML / YAML / a custom DSL)
+- `JsonPlanParser` is a **reference implementation**, not a default implementation; the upper layer must choose it explicitly
+- `JsonPlanParser` is about 20 lines: `serde_json::from_str` parsing plus field mapping
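To show what "injected" means in practice, here is a std-only sketch of an alternative parser an upper layer might supply. `LinePlanParser` and `plan_with` are hypothetical names, the trait is written synchronously (the real one is `async` via `async_trait`), and `Plan`, `Step`, and `AgentError` are reduced to the fields this example needs.

```rust
#[derive(Debug, PartialEq)]
pub enum AgentError {
    PlanParse(String),
}

#[derive(Debug, PartialEq)]
pub struct Step {
    pub index: usize,
    pub description: String,
}

#[derive(Debug, PartialEq)]
pub struct Plan {
    pub goal: String,
    pub steps: Vec<Step>,
}

/// Synchronous stand-in for the async `PlanParser` trait.
pub trait PlanParser: Send + Sync {
    fn parse(&self, raw: &str, goal: &str) -> Result<Plan, AgentError>;
}

/// Hypothetical parser: one step per non-empty line, optional "-" or "*" bullet.
pub struct LinePlanParser;

impl PlanParser for LinePlanParser {
    fn parse(&self, raw: &str, goal: &str) -> Result<Plan, AgentError> {
        let steps: Vec<Step> = raw
            .lines()
            // Strip surrounding whitespace and a leading bullet marker.
            .map(|l| l.trim().trim_start_matches(|c: char| c == '-' || c == '*').trim())
            .filter(|l| !l.is_empty())
            .enumerate()
            .map(|(index, d)| Step { index, description: d.to_string() })
            .collect();
        if steps.is_empty() {
            return Err(AgentError::PlanParse("no steps found".to_string()));
        }
        Ok(Plan { goal: goal.to_string(), steps })
    }
}

/// The caller depends only on the trait object, so any parser can be injected.
pub fn plan_with(parser: &dyn PlanParser, raw: &str, goal: &str) -> Result<Plan, AgentError> {
    parser.parse(raw, goal)
}
```

Because `plan_with` only sees `&dyn PlanParser`, swapping this parser for `JsonPlanParser` or a YAML-based one would not change the calling code.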
+
+#### 3.2.6 `AgentError`
+
+```rust
+pub enum AgentError {
+    Llm(LlmError),
+    Tool(ToolError),
+    Memory(MemoryError),
+    PlanParse(String),
+    HookBlocked(String),
+    LimitExceeded(String),
+    Config(String),
+    Other(String),
+}
+```
+
+**Design intent**:
+- Aggregate lower-layer errors as typed variants rather than boxing them (avoids losing the type behind `Box<dyn Error>`)
+- `PlanParse` / `HookBlocked` / `LimitExceeded` / `Config` are error kinds specific to the Agent layer
+- `is_recoverable()` decides by variant (for example `Memory(_)` is recoverable, `PlanParse(_)` is not)
+
+#### 3.2.7 `AgentConfig` + `AgentBuilder`
+
+```rust
+pub struct AgentConfig {
+    pub max_turns: u32,
+    pub max_tool_turns: u32,
+    pub session_ttl: Option<Duration>,
+    pub compact_config: Option<CompactConfig>,
+}
+
+pub struct AgentBuilder { /* ... */ }
+impl AgentBuilder {
+    pub fn new() -> Self;
+    pub fn provider(self, p: Arc<dyn LlmProvider>) -> Self;
+    pub fn tool_registry(self, r: Arc<ToolRegistry>) -> Self;
+    pub fn hook_executor(self, h: Arc<HookExecutor>) -> Self;
+    pub fn memory_store(self, m: Arc<dyn MemoryStore>) -> Self;     // optional
+    pub fn retriever(self, r: Arc<MemoryRetriever>) -> Self;        // optional
+    pub fn config(self, c: AgentConfig) -> Self;
+    pub fn build(self) -> Result<RuntimeBundle, AgentError>;
+}
+```
+
+**Design intent**:
+- `AgentBuilder` is the **only** public entry point for constructing a `RuntimeBundle`
+- Required fields are validated in `build()` (`provider` / `tool_registry` / `hook_executor` must not be missing)
+- `memory_store` / `retriever` are optional, matching the "memory-less mode" of §3.2.2
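A reduced sketch of the validation in `build()` follows. The dependency types are empty stand-ins (assumptions), `AgentError` is cut down to `Config`, the `retriever` and `config` setters are omitted for brevity, and the error messages are invented for the example.

```rust
use std::sync::Arc;

// Empty stand-ins for the Phase 0-3 dependency types (assumptions).
pub struct Provider;
pub struct ToolRegistry;
pub struct HookExecutor;
pub struct MemoryStore;

#[derive(Debug, PartialEq)]
pub enum AgentError {
    Config(String),
}

pub struct RuntimeBundle {
    pub provider: Arc<Provider>,
    pub tool_registry: Arc<ToolRegistry>,
    pub hook_executor: Arc<HookExecutor>,
    pub memory_store: Option<Arc<MemoryStore>>,
}

/// Every field starts as `None`; setters consume and return the builder.
#[derive(Default)]
pub struct AgentBuilder {
    provider: Option<Arc<Provider>>,
    tool_registry: Option<Arc<ToolRegistry>>,
    hook_executor: Option<Arc<HookExecutor>>,
    memory_store: Option<Arc<MemoryStore>>,
}

impl AgentBuilder {
    pub fn new() -> Self {
        Self::default()
    }
    pub fn provider(mut self, p: Arc<Provider>) -> Self {
        self.provider = Some(p);
        self
    }
    pub fn tool_registry(mut self, r: Arc<ToolRegistry>) -> Self {
        self.tool_registry = Some(r);
        self
    }
    pub fn hook_executor(mut self, h: Arc<HookExecutor>) -> Self {
        self.hook_executor = Some(h);
        self
    }
    /// Optional: leaving this unset gives the memory-less mode.
    pub fn memory_store(mut self, m: Arc<MemoryStore>) -> Self {
        self.memory_store = Some(m);
        self
    }

    /// Missing required fields become `AgentError::Config` instead of a panic.
    pub fn build(self) -> Result<RuntimeBundle, AgentError> {
        let provider = self
            .provider
            .ok_or_else(|| AgentError::Config("provider is required".to_string()))?;
        let tool_registry = self
            .tool_registry
            .ok_or_else(|| AgentError::Config("tool_registry is required".to_string()))?;
        let hook_executor = self
            .hook_executor
            .ok_or_else(|| AgentError::Config("hook_executor is required".to_string()))?;
        Ok(RuntimeBundle { provider, tool_registry, hook_executor, memory_store: self.memory_store })
    }
}
```

Validating at `build()` keeps each setter infallible, so call chains stay short and the single `Result` at the end reports the first missing dependency.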
+
+### 3.3 State Machines
+
+#### 3.3.1 `StepStatus` Transition Diagram
+
+```
+                  ┌─────────────┐
+                  │   Pending   │  ◄── initial state
+                  └──────┬──────┘
+                         │ execute_plan() picks up the step
+                         ▼
+                  ┌─────────────┐
+                  │   Running   │  ◄── fires OnPlanStepComplete (status=Running)
+                  └──────┬──────┘
+                         │
+        ┌────────────────┼────────────────┐
+        │                │                │
+        ▼                ▼                ▼
+   ┌─────────┐      ┌──────────┐    ┌──────────┐
+   │Completed│      │  Failed  │    │ Skipped  │
+   └─────────┘      └──────────┘    └──────────┘
+   fires OnPlanStepComplete (status=Completed)
+                    fires OnPlanStepComplete (status=Failed)
+                                  fires OnPlanStepComplete (status=Skipped)
+```
+
+**Design constraints**:
+- Transitions are **one-way** (Pending → Running → terminal state); there is no rollback
+- Terminal states (Completed / Failed / Skipped) fire the `OnPlanStepComplete` hook
+- Retry is done by the upper-level application creating a new `Plan` (no automatic retry inside `TaskAgent`)
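The one-way constraint can be enforced with a small transition check. This is a sketch, not the planned implementation: the payloads of `Completed` and `Failed` are simplified to `String`, and `is_terminal`, `can_transition_to`, and `advance` are hypothetical helper names.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum StepStatus {
    Pending,
    Running,
    Completed(String), // simplified payload (the design uses ChatResponse)
    Failed(String),    // simplified payload (the design uses AgentError)
    Skipped,
}

impl StepStatus {
    /// Completed, Failed, and Skipped are terminal.
    pub fn is_terminal(&self) -> bool {
        matches!(
            self,
            StepStatus::Completed(_) | StepStatus::Failed(_) | StepStatus::Skipped
        )
    }

    /// Only Pending -> Running and Running -> terminal are legal.
    pub fn can_transition_to(&self, next: &StepStatus) -> bool {
        match (self, next) {
            (StepStatus::Pending, StepStatus::Running) => true,
            (StepStatus::Running, n) => n.is_terminal(),
            _ => false,
        }
    }
}

pub struct Step {
    pub index: usize,
    pub description: String,
    pub status: StepStatus,
}

impl Step {
    pub fn new(index: usize, description: &str) -> Self {
        Self { index, description: description.to_string(), status: StepStatus::Pending }
    }

    /// Applies a transition, rejecting anything that would move backwards.
    pub fn advance(&mut self, next: StepStatus) -> Result<(), String> {
        if self.status.can_transition_to(&next) {
            self.status = next;
            Ok(())
        } else {
            Err(format!("illegal transition {:?} -> {:?}", self.status, next))
        }
    }
}
```

Once a step reaches a terminal state every further `advance` fails, which is why a retry has to start from a fresh `Plan`.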
+
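+#### 3.3.2 Session State
+
+The state machine of `AgentSession` is simpler than that of `Step`:
+
+```
+create (new) ──► turn_index=0 ──► submit_turn() ──► turn_index+=1 ──► ... ──► drop
+```
+
+`turn_index` and `cost_so_far` only accumulate; there is no explicit state enum (to avoid over-engineering).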
+
+### 3.4 Hook Extension Design
+
+Append 3 events and 2 context fields in `src/llm/hooks.rs`:
+
+```rust
+pub enum HookEvent {
+    // ... the existing 4: PreRequest / PostRequest / OnRetry / OnError ...
+
+    // 3 new events (Phase 4):
+    OnTurnStart,
+    OnTurnEnd,
+    OnPlanStepComplete,
+}
+
+pub struct HookContext {
+    // ... existing fields ...
+
+    // 2 new fields (Phase 4):
+    pub turn_index: Option<u32>,        // used by OnTurnStart / OnTurnEnd
+    pub plan_step_index: Option<usize>, // used by OnPlanStepComplete
+}
+```
+
+**Design intent**:
+- **Existing hooks stay compatible**: the 3 new events are appended enum variants, and the 2 new fields are `Option<T>` defaulting to `None`
+- An upper-level application can listen for `OnTurnEnd` to implement "write back to ConversationMemory in an independent task", in line with the principle that memory is handled by an independent task
+- `OnPlanStepComplete` provides step-level observability, aligned with Hermes's "task progress callback"
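One way an upper layer might use `OnTurnEnd` for out-of-band write-back is to forward the event over a channel to a worker. The sketch below uses only `std` threads and channels; `forward_turn_end` and `run_writeback_demo` are hypothetical names, `HookEvent` is abbreviated, and the worker merely collects turn indices where a real one would call something like `ConversationMemory::add_message`.

```rust
use std::sync::mpsc;
use std::thread;

/// Abbreviated event enum: one pre-existing variant plus the three new ones.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HookEvent {
    PreRequest,
    OnTurnStart,
    OnTurnEnd,
    OnPlanStepComplete,
}

/// Only the two new context fields; both default to `None`.
#[derive(Debug, Default, Clone)]
pub struct HookContext {
    pub turn_index: Option<u32>,
    pub plan_step_index: Option<usize>,
}

/// Hook handler: forwards finished turns to the write-back worker and
/// ignores every other event.
pub fn forward_turn_end(event: HookEvent, ctx: &HookContext, tx: &mpsc::Sender<u32>) {
    if event == HookEvent::OnTurnEnd {
        if let Some(turn) = ctx.turn_index {
            // A closed channel just means nobody is listening; ignore it.
            let _ = tx.send(turn);
        }
    }
}

/// Simulates three turns and returns the turn indices the worker received.
pub fn run_writeback_demo() -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    // The worker stands in for the independent memory write-back task.
    let worker = thread::spawn(move || rx.iter().collect::<Vec<u32>>());
    for turn in 0..3u32 {
        let ctx = HookContext { turn_index: Some(turn), ..HookContext::default() };
        forward_turn_end(HookEvent::OnTurnStart, &ctx, &tx);
        forward_turn_end(HookEvent::OnTurnEnd, &ctx, &tx);
    }
    // Dropping the sender ends the worker's receive loop.
    drop(tx);
    worker.join().expect("worker thread panicked")
}
```

The session code path only sends a small message, so a slow memory backend would not delay the next turn.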
+
+### 3.5 Error Model
+
+How `AgentError` relates to lower-layer errors:
+
+```
+┌────────────────────────┐
+│       AgentError       │
+├────────────────────────┤
+│  Llm(LlmError)         │──► passes through Phase 0 errors, incl. is_recoverable()
+│  Tool(ToolError)       │──► passes through Phase 2 errors, incl. is_recoverable()
+│  Memory(MemoryError)   │──► passes through Phase 3 errors
+│  PlanParse(String)     │──► Agent-layer specific
+│  HookBlocked(String)   │──► Agent-layer specific
+│  LimitExceeded(String) │──► Agent-layer specific
+│  Config(String)        │──► Agent-layer specific
+│  Other(String)         │──► fallback
+└────────────────────────┘
+        │
+        ▼
+  is_recoverable(): aggregate decision
+  - Llm / Memory: recoverable (retry)
+  - PlanParse / Config: not recoverable (needs manual intervention)
+  - Tool / HookBlocked / LimitExceeded: decided by the inner error
+```
+
+**Automatic From conversions**: `From<LlmError>` / `From<ToolError>` / `From<MemoryError>` are generated by the `#[from]` attribute of `thiserror`, so `submit_turn` can propagate lower-layer errors directly with the `?` operator.
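For readers unfamiliar with what `#[from]` expands to, the sketch below writes the `From` impls by hand using only `std`. The lower-layer error types are stand-ins (assumptions), `call_llm` and `run_turn` are hypothetical helpers, and the enum is reduced to five variants; `HookBlocked`, `LimitExceeded`, and `Other` are left out because their classification is a matter for the design record.

```rust
// Stand-ins for the lower-layer error types (assumptions).
#[derive(Debug)]
pub struct LlmError(pub String);

#[derive(Debug)]
pub struct MemoryError(pub String);

#[derive(Debug)]
pub struct ToolError {
    pub message: String,
    pub transient: bool,
}

impl ToolError {
    pub fn is_recoverable(&self) -> bool {
        self.transient
    }
}

/// Reduced variant set for the example.
#[derive(Debug)]
pub enum AgentError {
    Llm(LlmError),
    Tool(ToolError),
    Memory(MemoryError),
    PlanParse(String),
    Config(String),
}

// Hand-written equivalents of what `#[from]` would generate.
impl From<LlmError> for AgentError {
    fn from(e: LlmError) -> Self {
        AgentError::Llm(e)
    }
}

impl From<ToolError> for AgentError {
    fn from(e: ToolError) -> Self {
        AgentError::Tool(e)
    }
}

impl From<MemoryError> for AgentError {
    fn from(e: MemoryError) -> Self {
        AgentError::Memory(e)
    }
}

impl AgentError {
    /// Classification following the rules listed for this section.
    pub fn is_recoverable(&self) -> bool {
        match self {
            AgentError::Llm(_) | AgentError::Memory(_) => true,
            AgentError::Tool(inner) => inner.is_recoverable(),
            AgentError::PlanParse(_) | AgentError::Config(_) => false,
        }
    }
}

/// Hypothetical lower-layer call that may fail with `LlmError`.
fn call_llm(fail: bool) -> Result<String, LlmError> {
    if fail {
        Err(LlmError("rate limited".to_string()))
    } else {
        Ok("ok".to_string())
    }
}

/// `?` converts `LlmError` into `AgentError` through the `From` impl.
pub fn run_turn(fail: bool) -> Result<String, AgentError> {
    let reply = call_llm(fail)?;
    Ok(reply)
}
```

Because each lower-layer error keeps its own variant, a caller can still match on `AgentError::Tool(inner)` and inspect the original `ToolError`.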
+
+### 3.6 Integration with Phase 0-3 Modules
+
+| Phase 4 component | Lower-layer API called | Call site |
+|-------------|--------------|---------|
+| `AgentSession::submit_turn` | `LlmCycle::new` + `with_system_prompt` + `with_hook_executor` + `with_compact_config` + `with_messages` + `submit_with_tools` | session.rs |
+| `AgentSession::submit_turn` | `CostTracker::add` (accumulate cost) | session.rs |
+| `RuntimeBundle::new` | `ToolRegistry::register` (register the retrieve tool) | runtime.rs |
+| `TaskAgent::execute_plan` | `AgentSession::submit_turn` (once per step) | task.rs |
+| `JsonPlanParser::parse` | `serde_json::from_str` | task.rs |
+| `AgentError::from` | `LlmError` / `ToolError` / `MemoryError` | error.rs |
+| `HookContext` extension | `HookEvent::OnTurnStart/End/OnPlanStepComplete` | llm/hooks.rs |
+
+**Lower-layer APIs that are NOT called** (explicit boundary):
+- ❌ `ConversationMemory` (managed by an independent task in the upper layer)
+- ❌ `KnowledgeStore` (managed by an independent task in the upper layer)
+- ❌ `McpClient` (already wrapped by `ToolRegistry`)
+- ❌ `StreamEvents::submit_stream` (v0.1 does not expose a streaming `submit_turn`; revisit in v0.2)
+
+## 4. Implementation Plan
+
+### 4.1 File List
+
+#### New files (7)
+
+```
+src/agent.rs                       # module root + pub use re-exports
+src/agent/agent.rs                 # Agent trait
+src/agent/runtime.rs               # RuntimeBundle + AgentConfig
+src/agent/session.rs               # AgentSession (incl. submit_turn reference impl)
+src/agent/task.rs                  # TaskAgent trait + Plan/Step + PlanParser + JsonPlanParser
+src/agent/builder.rs               # AgentBuilder
+src/agent/error.rs                 # AgentError
+```
+
+#### Modified files (3)
+
+```
+src/lib.rs                         # + pub mod agent;
+src/llm/hooks.rs                   # + 3 event variants + 2 HookContext fields
+docs/roadmap.md                    # Phase 4 status ❌ missing → ✅
+```
+
+#### Related documents (done / to be written)
+
+```
+docs/note-agent-harness-references.md  # ✅ exists
+docs/note-agent-runtime-design.md      # ✅ exists (companion to this file)
+docs/7-agent-runtime.md                # ✅ this file
+```
+
+### 4.2 Task Breakdown (in Dependency Order)
+
+| Order | Task | Files | Verification |
+|------|------|---------|------|
+| 1 | Extend `llm/hooks.rs` with 3 events + 2 fields | `src/llm/hooks.rs` | `cargo build` passes; existing tests still pass |
+| 2 | Create `agent/error.rs` defining `AgentError` | `src/agent/error.rs` | `cargo build` passes |
+| 3 | Create `agent/agent.rs` defining the `Agent` trait | `src/agent/agent.rs` | `cargo build` passes |
+| 4 | Create `agent/runtime.rs` defining `RuntimeBundle` + `AgentConfig` | `src/agent/runtime.rs` | `cargo build` passes |
+| 5 | Create `agent/builder.rs` defining `AgentBuilder` | `src/agent/builder.rs` | `cargo build` passes |
+| 6 | Create `agent/session.rs` defining `AgentSession` + `submit_turn` | `src/agent/session.rs` | `cargo build` passes |
+| 7 | Create `agent/task.rs` defining `TaskAgent` + `Plan` / `Step` / `PlanParser` / `JsonPlanParser` | `src/agent/task.rs` | `cargo build` passes |
+| 8 | Create the `src/agent.rs` module root + `pub use` re-exports | `src/agent.rs` | `cargo build` passes |
+| 9 | Update `lib.rs` to export `pub mod agent;` | `src/lib.rs` | `cargo build` passes |
+| 10 | Write 2-3 smoke tests | inline in `src/agent/*.rs` | `cargo test` passes |
+| 11 | Flip the status in `roadmap.md` | `docs/roadmap.md` | document review |
+| 12 | Run the full `cargo test` regression | n/a | all existing tests still pass |
+
+### 4.3 Dependencies
+
+```
+hooks.rs (1) ──┐
+               ├──► agent/error.rs (2) ──► agent/agent.rs (3)
+               │                              │
+               │                              ▼
+               │                    agent/runtime.rs (4)
+               │                              │
+               │                              ▼
+               │                    agent/builder.rs (5)
+               │                              │
+               │                              ▼
+               │                    agent/session.rs (6)
+               │                              │
+               │                              ▼
+               └─────────────────► agent/task.rs (7)
+                                              │
+                                              ▼
+                                       src/agent.rs (8)
+                                              │
+                                              ▼
+                                       src/lib.rs (9)
+                                              │
+                                              ▼
+                                       cargo test (10)
+                                              │
+                                              ▼
+                                       roadmap.md (11)
+                                              │
+                                              ▼
+                                       regression (12)
+```
+
+### 4.4 Estimated Effort
+
+| Stage | Lines | Notes |
+|------|------|------|
+| 1 (hooks extension) | ~15 | 3 variants + 2 fields + docs |
+| 2-7 (6 agent files) | ~600 | incl. imports + traits + structs + impls + docs |
+| 8-9 (lib.rs + agent.rs module root) | ~20 | mostly pub use re-exports |
+| 10 (smoke tests) | ~100 | 2-3 tests |
+| 11 (roadmap sync) | ~5 | one-line status flip |
+| **Total** | **~740** | consistent with the estimate in `note-agent-runtime-design.md` §6 |
+
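+## 5. Risk Assessment
+
+### 5.1 Abstraction Boundary (Core Risk)
+
+**Risk**: Phase 4 is prone to over-abstraction. After studying OpenHarness / Hermes, there is a temptation to move all of their core capabilities into the Rust core library.
+
+**Mitigation**:
+- Strictly follow the 7 design principles in §1.3
+- Before adding any new trait / struct, ask "does this belong to the core library?"
+- Business capabilities (plan decomposition, multi-agent coordination, skill loading) always go through trait injection or are deferred to v0.2+
+
+### 5.2 Risk of Intruding on Phase 0-3
+
+**Risk**: Implementing Phase 4 requires modifying `src/llm/hooks.rs`, which may break existing Phase 0 tests.
+
+**Mitigation**:
+- Only append enum variants and `Option<T>` fields (NF9)
+- Order: first run `cargo test` to confirm the Phase 0 tests pass, then start Phase 4
+- Full regression check: run the complete `cargo test` once implementation is finished
+
+### 5.3 Language Differences in the Reference Projects
+
+**Risk**: OpenClaw / Hermes / OpenHarness are all Python/TypeScript; OpenHuman is Rust + Tauri but is positioned as a desktop application. Copying their interface shapes directly may cause borrow-checker problems and added async complexity in Rust.
+
+**Mitigation**:
+- §1.3 states "borrow, don't copy" explicitly
+- The anti-pattern list (see `docs/note-agent-harness-references.md` §6) serves as an exclusion list
+- Interface design favors Rust conventions (`Arc<dyn Trait>` / `async fn` / `Result<T, E>`)
+
+### 5.4 Stability Risk of the Trait Design
+
+**Risk**: Phase 4 is the first "complex set of traits" in v0.1. If the trait shapes are unstable, adding capabilities in v0.2+ will be breaking.
+
+**Mitigation**:
+- All traits / structs in §3.2 already have frozen drafts in `docs/note-agent-runtime-design.md` §4
+- If an adjustment is needed during implementation, update the decision record first and then change the code
+- Reserved extension point: the default implementation of `Agent::tool_definitions` can be overridden by implementors
+
+### 5.5 Schedule Risk
+
+**Risk**: The 12 tasks are not many, but the `submit_turn` reference impl has to assemble things correctly on top of `LlmCycle`, and it is easy to get stuck on details.
+
+**Mitigation**:
+- The task breakdown (§4.2) is already ordered by dependency
+- Smoke tests only verify that things run end to end, not business correctness, which avoids getting lost in business-loop details
+- If necessary, build a `MockProvider` first (a pattern Phase 0 already uses) instead of relying on a real LLM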
+
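+## 6. Acceptance Criteria
+
+### 6.1 Code Acceptance
+
+- [ ] `cargo build --release` finishes with 0 errors, and `cargo clippy` reports 0 warnings
+- [ ] `cargo test`: all existing Phase 0-3 tests plus the new Phase 4 tests pass
+- [ ] `cargo doc --no-deps`: every public API has `///` doc comments
+- [ ] 700-750 new lines of code (incl. tests + doc comments), consistent with the §4.4 estimate
+- [ ] `src/lib.rs` gains one line: `pub mod agent;`
+- [ ] `src/llm/hooks.rs` is append-only (no existing variant or field is modified)
+
+### 6.2 Interface Acceptance
+
+- [ ] All 7 new files exist (§4.1)
+- [ ] The `Agent` trait contains the three methods `name` / `system_prompt` / `tool_definitions`
+- [ ] `RuntimeBundle` contains 6 fields (provider / tool_registry / hook_executor / memory_store? / retriever? / config)
+- [ ] `AgentSession::submit_turn` is about 30 lines and fires the OnTurnStart/End hooks
+- [ ] `TaskAgent` offers the two entry points `run` + `execute_plan`
+- [ ] `JsonPlanParser` is about 20 lines and is based on `serde_json`
+- [ ] `AgentError` aggregates 8 variants and provides `is_recoverable()`
+- [ ] `AgentBuilder` offers 6 setters + validation in `build()`
+- [ ] `HookEvent` gains 3 variants: `OnTurnStart` / `OnTurnEnd` / `OnPlanStepComplete`
+- [ ] `HookContext` gains 2 `Option` fields: `turn_index` / `plan_step_index`
+
+### 6.3 Test Acceptance
+
+At least 2-3 smoke tests pass:
+
+- [ ] **Test 1**: the `Agent` trait can be implemented + `RuntimeBundle` can be constructed (chained builder calls)
+- [ ] **Test 2**: `AgentSession::submit_turn` runs against a mock provider (the Phase 0 `MockProvider` pattern)
+- [ ] **Test 3 (optional)**: `JsonPlanParser::parse` parses valid JSON and returns `AgentError::PlanParse` on failure
+
+### 6.4 Documentation Acceptance
+
+- [ ] `docs/7-agent-runtime.md` (this file) is complete, with all six main sections in place
+- [ ] `docs/note-agent-runtime-design.md` and this file reference each other consistently
+- [ ] `docs/note-agent-harness-references.md` and this file reference each other consistently
+- [ ] `docs/roadmap.md`: Phase 4 status changes from ❌ missing to ✅, and the deliverables list is updated
+
+### 6.5 Behavior Acceptance (Manual Review)
+
+- [ ] `AgentSession::submit_turn` does not hold a `ConversationMemory` (verify with grep that there is no `use crate::memory::ConversationMemory`)
+- [ ] `RuntimeBundle::new` automatically registers the `"retrieve"` tool when `retriever` is `Some`
+- [ ] `AgentBuilder::build` returns `AgentError::Config` when a required field is missing (instead of panicking)
+- [ ] `AgentError::is_recoverable()` returns the correct classification for each variant
+
+### 6.6 Risk Acceptance
+
+- [ ] 5.1 Abstraction boundary: the trait list does **not** contain application-layer capabilities such as Multi-Agent / Skills / TUI / Gateway
+- [ ] 5.2 Phase 0-3 intrusion: `git diff` shows that `src/llm/hooks.rs` is append-only
+- [ ] 5.3 Language differences: trait shapes follow Rust conventions (no Python-style complex inheritance)
+- [ ] 5.4 Trait stability: the decision record matches the final code
+- [ ] 5.5 Schedule: actual effort deviates from the §4.4 estimate by < 30%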
+
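+## 7. One-Sentence Summary
+
+> **Phase 4 = 1 trait (Agent) + 1 container (RuntimeBundle) + 1 session (AgentSession) + 1 task abstraction (TaskAgent) + 4 auxiliary components (Builder / Error / PlanParser / Hook extension), about 740 lines of code that assemble the existing Phase 0-3 capabilities into the concept of an "agent".**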