diff --git a/docs/5-tool-system.md b/docs/5-tool-system.md
index fbe058f..26e170a 100644
--- a/docs/5-tool-system.md
+++ b/docs/5-tool-system.md
@@ -33,6 +33,8 @@ AG Core Phase 0 (Foundation) has completed the LLM call-cycle infrastructure; Phase 1
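 - The MCP client timeout defaults to 30 seconds and is configurable
 - Custom tools and MCP tools are managed through the same `ToolRegistry`, transparently to LlmCycle
 - Permission checks run before tool execution; a blocked call returns an error rather than being silently skipped
+- The `BaseTool::execute()` signature must reserve an extension point (`ToolContext` injection) so that the future Skill/Agent layer can inject context such as session_id and cancellation_token without changing the trait signature
+- The automatic tool loop should account for token consumption: tool definitions are resent with every request and tool results are appended directly to the conversation history, so a result-size limit and a truncation strategy are needed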
 
 ---
 
@@ -61,11 +63,11 @@ pub mod mcp;
 pub mod permission;
 pub mod registry;
 
-pub use base::BaseTool;
+pub use base::{BaseTool, ToolContext};
 pub use error::ToolError;
 pub use mcp::McpClient;
 pub use permission::{Permission, PermissionChecker, PermissionConfig};
-pub use registry::{ToolInvocation, ToolRegistry};
+pub use registry::{ToolEntry, ToolInvocation, ToolRegistry};
 ```
 
 Add to `lib.rs`:
@@ -182,6 +184,7 @@ impl ToolRegistry {
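 **Core logic**:
 - `invoke()`: look up the tool → check permissions → execute → return a `ToolInvocation`
 - `invoke_all()`: executes multiple tool calls in parallel (using `tokio::join!` or `futures::future::join_all`), for cases where the LLM issues several tool_calls at once
+- `invoke_all()` should apply a timeout to each tool execution (via `tokio::time::timeout`); the timeout is configured by `CycleConfig::tool_timeout_secs` and defaults to 60 seconds, so that a single slow tool cannot block the whole loop
 - `definitions()`: converts the registered tools in bulk into a `Vec<ToolDefinition>` for LlmCycle to pass to the LLM
 - `ToolRegistry` does not own the `PermissionChecker` (it holds an `Arc`), so multiple registries can share one checker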
 
@@ -269,7 +272,9 @@ impl PermissionChecker {
 
 ### 4. McpClient: MCP protocol client
 
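-MCP (Model Context Protocol) is a JSON-RPC-based protocol for communication between an LLM and external tool systems. Phase 2 implements its **minimum viable subset**, focusing on the stdio transport.
+MCP (Model Context Protocol) is a JSON-RPC-based protocol for communication between an LLM and external tool systems. Phase 2 implements its **minimum viable subset**, starting with the stdio transport.
+
+> **Transport note**: MCP protocol version 2025-03-26 defines two standard transports: `stdio` and `Streamable HTTP`. The earlier `HTTP+SSE` transport (2024-11-05) has been officially deprecated and should not be used in new implementations. `Streamable HTTP` supports both plain JSON responses and an upgrade to SSE streaming over a single HTTP endpoint, and is the recommended option for HTTP scenarios.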
 
 ```rust
 // tools/mcp.rs
@@ -295,8 +300,12 @@ pub enum McpTransport {
         command: String,
         args: Vec<String>,
     },
-    // /// SSE (Server-Sent Events) transport (future support).
-    // Sse { url: String },
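+    /// Streamable HTTP transport (introduced in MCP 2025-03-26, replacing the deprecated HTTP+SSE).
+    /// The client talks to the MCP server through a single HTTP endpoint and supports JSON and SSE streaming responses.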
+    StreamableHttp {
+        url: String,
+        headers: Option<Vec<(String, String)>>,
+    },
 }
 
 /// MCP client: communicates with an MCP server.
@@ -379,6 +388,11 @@ impl McpClient {
 - The MCP server is provided externally (e.g. `npx @modelcontextprotocol/server-filesystem`)
 - The user must supply the MCP server's launch command and arguments
 
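+**Tool caching notes**:
+- `McpClient` caches the tool list on `list_tools()` to avoid re-requesting it on every call
+- Cache assumption: the MCP server's tool list does not change frequently at runtime (plugin-style loading being the exception)
+- To refresh, add a `refresh_tools()` method or expire the cache automatically on a TTL (e.g. 60 seconds)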
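The TTL-based expiry mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ToolCache`, `is_stale`, and `get_or_refresh` are hypothetical names that do not appear in the design, tools are reduced to plain name strings, and the `fetch` closure stands in for the real `tools/list` request.

```rust
use std::time::{Duration, Instant};

/// Hypothetical cache for the MCP tool list (names only, for brevity).
struct ToolCache {
    tools: Vec<String>,
    fetched_at: Option<Instant>,
    ttl: Duration,
}

impl ToolCache {
    fn new(ttl: Duration) -> Self {
        ToolCache { tools: Vec::new(), fetched_at: None, ttl }
    }

    /// The cache is stale if it was never filled or the TTL has elapsed.
    fn is_stale(&self, now: Instant) -> bool {
        match self.fetched_at {
            None => true,
            Some(t) => now.duration_since(t) >= self.ttl,
        }
    }

    /// Returns the cached list, calling `fetch` only when the cache is stale.
    /// `fetch` is a placeholder for the real `tools/list` round trip.
    fn get_or_refresh<F>(&mut self, now: Instant, fetch: F) -> &[String]
    where
        F: FnOnce() -> Vec<String>,
    {
        if self.is_stale(now) {
            self.tools = fetch();
            self.fetched_at = Some(now);
        }
        &self.tools
    }
}

fn main() {
    let mut cache = ToolCache::new(Duration::from_secs(60));
    let start = Instant::now();
    let first = cache.get_or_refresh(start, || vec!["read_file".to_string()]).len();
    println!("tools after first fetch: {}", first);
}
```

Passing `now` in explicitly keeps the expiry logic testable without sleeping; a manual `refresh_tools()` would simply reset `fetched_at` to `None`.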
+
 ### 5. ToolError: error type
 
 ```rust
@@ -441,7 +455,7 @@ impl LlmCycle {
         registry: &ToolRegistry,
     ) -> Result<ChatResponse, LlmError> {
         let tools = registry.definitions();
-        let max_turns = self.config.max_turns.unwrap_or(10);
+        let max_turns = self.config.max_turns.unwrap_or(10); // Note: CycleConfig.max_turns currently defaults to None; the implementation must change Default to Some(10)
         let mut turn = 0;
 
         self.messages.push(OpenaiChatMessage::user_text(prompt));
@@ -502,11 +516,66 @@ impl LlmCycle {
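 | Decision | Choice | Rationale |
 |------|------|------|
 | Loop style | Synchronous loop (single-threaded, serial) | Each round of tool execution depends on the previous round's result, so serial is safer |
-| Max turns | `CycleConfig.max_turns`, default 10 | Prevents infinite loops (the LLM calling tools repeatedly) |
+| Max turns | `CycleConfig.max_turns`, default `Some(10)` | Prevents infinite loops (the LLM calling tools repeatedly). **Note**: `CycleConfig` currently defaults to `None`; the implementation must change `Default` to `Some(10)` |
 | Tool parallelism | `invoke_all()` runs mutually independent tools in parallel | The LLM may issue several tool_calls at once (parallel_tool_calls) |
+| Tool timeout | `CycleConfig::tool_timeout_secs`, default 60 | Prevents a single tool from blocking the loop for a long time. `invoke_all()` wraps each call in `tokio::time::timeout` |
 | Error handling | Tool execution errors are returned to the LLM as text instead of terminating the loop | The LLM can recover from errors on its own |
 | Message tracking | All tool interactions are persisted through `self.messages` | Callers can inspect the full trace via `cycle.messages()` |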
 
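+**Token consumption analysis**:
+
+Token consumption in the automatic tool loop comes from three main sources:
+
+| Source | Description | Impact |
+|------|------|---------|
+| Tool definitions resent | `definitions()` carries the JSON Schema of every tool in each request | Number of registered tools × average definition size × number of turns. 20 tools × 500 B × 5 turns ≈ 50 KB of input payload |
+| Tool results appended to history | Each tool result is appended in full to `messages`, and every later request resends the whole history | The most significant source of token waste. A large result (e.g. a vector-search Top-50) can be ~15 KB on its own and accumulates across turns |
+| Value→String serialization | After `serde_json::to_string()`, the tool result's JSON string is ~20-30% larger | A linear, constant-factor overhead |
+
+**Impact estimates**:
+
+| Scenario | Tool-related share of tokens | Notes |
+|------|-------------------|------|
+| Single simple query | <5% | Negligible |
+| File read + analysis (3-4 turns) | ~30% | Tool results accumulate gradually |
+| Web search + summary (3-5 turns) | ~40% | Tool results include page content |
+| Multi-tool data pipeline (5-10 turns) | ~60%+ | Compression and limiting strategies need attention |
+
+**Mitigations** (not required in Phase 2, but the design must remain extensible):
+- **Result size limit**: automatically truncate tool results that exceed a threshold (e.g. `CycleConfig::max_tool_result_bytes`)
+- **Auto-compaction**: the existing auto-compaction must be aware of tool messages so that it does not compact away data the LLM still depends on
+- **Tool definition caching**: base tool definitions rarely change; client-side caching could be considered in the future (pending provider support)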
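As a rough illustration of the result-size limit, the sketch below truncates a serialized tool result to a byte budget. `truncate_tool_result` is a hypothetical helper, not part of the design; it assumes the result has already been serialized to a string and that the budget would come from something like `CycleConfig::max_tool_result_bytes`.

```rust
/// Hypothetical helper: keeps at most `max_bytes` bytes of a serialized tool
/// result, cutting only on a UTF-8 character boundary, and appends a marker
/// so the LLM can tell that data is missing.
fn truncate_tool_result(result: &str, max_bytes: usize) -> String {
    if result.len() <= max_bytes {
        return result.to_string();
    }
    // Walk back from the limit until we land on a character boundary,
    // otherwise slicing would panic in the middle of a multi-byte character.
    let mut end = max_bytes;
    while end > 0 && !result.is_char_boundary(end) {
        end -= 1;
    }
    let omitted = result.len() - end;
    format!("{}\n[truncated: {} bytes omitted]", &result[..end], omitted)
}

fn main() {
    let big = "x".repeat(20_000);
    let out = truncate_tool_result(&big, 4_096);
    println!("returned {} bytes to the LLM", out.len());
}
```

Note that the marker is appended after the cut, so the returned string is slightly longer than `max_bytes`. Cutting JSON at a byte offset also leaves it syntactically invalid, which is acceptable only because the result is sent back to the LLM as text.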
+
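+**Error classification and handling strategy**:
+
+Tool execution errors must be split into "recoverable" and "unrecoverable"; an unrecoverable error should terminate the loop rather than be returned to the LLM:
+
+| Error type | Handling | Rationale |
+|---------|---------|------|
+| `ToolError::ExecutionFailed` | Return to LLM (text) | The LLM may retry with different arguments or a different approach |
+| `ToolError::InvalidArguments` | Return to LLM (text) | The LLM can correct the arguments itself |
+| `ToolError::NotFound` | Terminate loop, return `LlmError` | The LLM cannot register tools, so retrying is pointless |
+| `ToolError::PermissionDenied` | Terminate loop, return `LlmError` | Security-sensitive; retries should not be allowed |
+| `ToolError::McpError` | Terminate loop, return `LlmError` | The MCP link has failed; a retry will most likely fail too |
+| `ToolError::McpTimeout` | Terminate loop, return `LlmError` | Alternatively, retry once before terminating |
+| `ToolError::Io` | Terminate loop, return `LlmError` | IO errors usually indicate an environment problem |
+| `ToolError::Other` | Return to LLM (text) | Fallback; conservatively returned to the LLM |
+
+In practice, either add an `is_recoverable()` method to `ToolError` or decide with a `match` branch inside `submit_with_tools()`.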
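The `is_recoverable()` option might look like the sketch below. The enum here is a simplified stand-in: the variant names follow the table above, but the payload types are assumptions, since the real `ToolError` definition is not shown in this section.

```rust
/// Simplified stand-in for `ToolError`; payload types are assumed.
#[derive(Debug)]
enum ToolError {
    ExecutionFailed(String),
    InvalidArguments(String),
    NotFound(String),
    PermissionDenied(String),
    McpError(String),
    McpTimeout,
    Io(std::io::Error),
    Other(String),
}

impl ToolError {
    /// True when the error should be returned to the LLM as text,
    /// false when the tool loop should terminate with an `LlmError`.
    fn is_recoverable(&self) -> bool {
        // An exhaustive match (no `_` arm) forces an explicit decision
        // whenever a new variant is added.
        match self {
            ToolError::ExecutionFailed(_) => true,
            ToolError::InvalidArguments(_) => true,
            ToolError::Other(_) => true,
            ToolError::NotFound(_) => false,
            ToolError::PermissionDenied(_) => false,
            ToolError::McpError(_) => false,
            ToolError::McpTimeout => false,
            ToolError::Io(_) => false,
        }
    }
}

fn main() {
    let err = ToolError::InvalidArguments("missing field `city`".to_string());
    if err.is_recoverable() {
        println!("send back to the LLM: {:?}", err);
    } else {
        println!("terminate the loop: {:?}", err);
    }
}
```

Keeping the classification on the error type, rather than in `submit_with_tools()`, would let the streaming variant reuse the same policy.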
+
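+**Notes on the submit_request() refactor**:
+
+When extracting `submit_request()` as an internal method for `submit_with_tools()`, make sure the behavior of the existing methods is unaffected. Responsibility matrix after the refactor: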
+
+| 方法 | Push user msg | Compaction | Retry | Call provider | Handle response |
+|------|:---:|:---:|:---:|:---:|:---:|
+| `submit()` | ✅ | ✅ | ✅ | → `submit_request()` | ✅ |
+| `submit_messages()` | ❌ | ✅ | ✅ | → `submit_request()` | ✅ |
+| `submit_with_tools()` | ✅ | ✅ | ✅ | → `submit_request()` | ✅* |
+| `submit_request()` | ❌ | ❌ | ✅ | ✅ | ✅ |
+
+*After `submit_request()` returns, `submit_with_tools()` additionally checks for `ToolCalls` and, after executing the tools, calls itself recursively.
+
 **Streaming mode support**:
 
 Enhancement for `submit_stream()`: add `submit_stream_with_tools()`, which supports the automatic tool loop at the streaming-event level.
@@ -521,7 +590,7 @@ impl LlmCycle {
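         // 1. Use submit_stream() to obtain the initial event stream
         // 2. Watch for TurnComplete { reason: ToolCalls }
         // 3. When it fires: execute the tools through ToolRegistry
-        // 4. Emit the ToolExecutionCompleted event
+        // 4. Emit the ToolExecutionCompleted event (done by submit_stream_with_tools, not the underlying stream parser)
         // 5. Inject the tool results into messages
         // 6. Automatically issue the next request (recursion)
         // 7. Repeat until finish_reason is Stop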
@@ -533,24 +602,27 @@ impl LlmCycle {
 ```
 submit_stream_with_tools("check the weather")
   │
-  ├─ AssistantTextDelta "Let me check the weather in Beijing..."
-  ├─ ToolExecutionStarted { tool_name: "get_weather", input: {city:"Beijing"}, id:"call_1" }
-  ├─ TurnComplete { reason: ToolCalls }
+  ├─ AssistantTextDelta "Let me check the weather in Beijing..."   ← emitted by the underlying stream parser
+  ├─ ToolExecutionStarted { tool_name, input, id }                  ← emitted by submit_stream_with_tools
+  ├─ TurnComplete { reason: ToolCalls }                             ← emitted by the underlying stream parser
   │
   ├── [auto] execute tool get_weather({city:"Beijing"})
   │
-  ├─ ToolExecutionCompleted { tool_name: "get_weather", output: {temp:22}, is_error:false }
+  ├─ ToolExecutionCompleted { tool_name, output, ... }              ← emitted by submit_stream_with_tools
   │
-  ├─ AssistantTextDelta "Beijing is 22°C today"
-  ├─ TurnComplete { reason: Stop }
+  ├─ AssistantTextDelta "Beijing is 22°C today"                     ← emitted by the underlying stream parser
+  ├─ TurnComplete { reason: Stop }                                  ← emitted by the underlying stream parser
   │
   └─ (end of stream)
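 ```
+
+**Event emission responsibilities**: the underlying `parse_chunk_stream()` emits the LLM-native events (`AssistantTextDelta`, `TurnComplete`); `submit_stream_with_tools()` emits the tool-layer events (`ToolExecutionStarted`, `ToolExecutionCompleted`) by manually `yield`ing them before and after tool execution.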
 
 ### 7. Custom tool example
 
 ```rust
-use agcore::tools::prelude::*;
+use agcore::tools::{BaseTool, ToolError};
+use async_trait::async_trait;
 use serde_json::Value;
 
 struct WeatherTool;
@@ -598,6 +670,131 @@ tools/ module internal dependencies:
 
 ---
 
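+### 9. Extensibility analysis for the future tool-based roadmap
+
+> This section answers the question: "Is the current design sufficient for regular tools, MCP, Skills, memory, and so on to all go through the tool-call path in the future?"
+
+#### Design goals
+
+In the future, every capability an Agent can call (regular tools, MCP tools, Skills, memory operations) should be exposed to the LLM uniformly through the `BaseTool` trait, with `ToolRegistry` as the single entry point for tool discovery and invocation, transparent to `LlmCycle`.
+
+#### Support assessment by scenario
+
+| Scenario | Current support | Key bottleneck |
+|------|-----------|---------|
+| Regular tools (weather/calculator) | ✅ Works directly | None |
+| MCP tools (McpClient→BaseTool adapter) | ✅ Feasible | The adapter pattern fits cleanly, but MCP streaming/progress capabilities are constrained by `Value→Value` |
+| Memory CRUD (store/recall/forget/update) | ⚠️ Mostly feasible | Retrieval pagination and large result sets need extra handling |
+| Long-running tools (dataset queries, file uploads) | ❌ Not feasible | No progress reporting, no cancellation mechanism |
+| Multi-turn confirmation tools (approval flows such as "Freeze the account?") | ❌ Not feasible | One call → one return; the "ask back → confirm" pattern cannot be expressed |
+| Skill orchestration (multi-step composition, nested execution) | ❌ Not feasible | No context propagation (passing intermediate results across steps), no tool-composition primitives |
+| Agent selects a tool subset per scenario | ⚠️ Partially feasible | No tag/category filtering mechanism |
+
+#### Key extension points
+
+**A. `BaseTool::execute()` signature: reserve `ToolContext` injection**
+
+`BaseTool` is a public trait; once users implement it and publish crates, any later breaking change is very costly. Current signature: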
+
+```rust
+async fn execute(&self, args: Value) -> Result<Value, ToolError>;
+```
+
+Future extension path: add a `ToolContext` parameter carrying the execution context:
+
+```rust
+async fn execute(&self, args: Value, ctx: &ToolContext<'_>) -> Result<Value, ToolError>;
+```
+
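+Fields `ToolContext` should contain initially (Phase 2 need not implement all of them, but the signature must reserve the parameter slot):
+
+| Field | Purpose | Introduced in |
+|------|------|---------|
+| `session_id: &str` | Correlates all tool calls within one conversation | Phase 2 |
+| `trace_id: &str` | Distributed tracing; latency distribution across tool calls | Phase 2 |
+| `cancellation_token: CancellationToken` | Gracefully cancels a tool that is executing | Phase 2 |
+| `progress: Option<UnboundedSender<ProgressEvent>>` | Progress reporting (e.g. data processing at 50%) | Phase 3 |
+| `shared_state: Option<&HashMap<String, Value>>` | Passes intermediate results across Skill steps | Phase 4 |
+
+This way, when the Skill/Agent layer arrives in Phase 4, the `execute` signature does not need to change; only fields are added to `ToolContext`.
+
+**B. `ToolRegistry` internals: introduce `ToolEntry` metadata**
+
+Currently the internal structure is `HashMap<String, Arc<dyn BaseTool>>`; in the future it is extended to: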
+
+```rust
+pub struct ToolEntry {
+    pub tool: Arc<dyn BaseTool>,
+    pub tags: Vec<String>,
+    pub category: String,        // "memory", "data", "communication", etc.
+    pub version: Option<String>,
+    pub stats: ToolStats,        // call count, average latency
+}
+```
+
+Corresponding filtering API:
+
+```rust
+pub fn find_by_tag(&self, tag: &str) -> Vec<&ToolEntry>;
+pub fn find_by_category(&self, category: &str) -> Vec<&ToolEntry>;
+pub fn groups(&self) -> HashMap<&str, Vec<&ToolEntry>>;
+```
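A minimal sketch of how the filtering methods above could be implemented over a `HashMap<String, ToolEntry>`. The `BaseTool` trait is reduced to a single `name()` method, `ToolEntry` omits `version` and `stats`, and the `register` helper and `NamedTool` type are hypothetical names added only for this illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Reduced stand-in for the real `BaseTool` trait.
trait BaseTool {
    fn name(&self) -> &str;
}

struct NamedTool(&'static str);

impl BaseTool for NamedTool {
    fn name(&self) -> &str {
        self.0
    }
}

struct ToolEntry {
    tool: Arc<dyn BaseTool>,
    tags: Vec<String>,
    category: String,
}

#[derive(Default)]
struct ToolRegistry {
    entries: HashMap<String, ToolEntry>,
}

impl ToolRegistry {
    /// Hypothetical registration helper keyed by tool name.
    fn register(&mut self, tool: Arc<dyn BaseTool>, tags: &[&str], category: &str) {
        let entry = ToolEntry {
            tags: tags.iter().map(|t| t.to_string()).collect(),
            category: category.to_string(),
            tool,
        };
        self.entries.insert(entry.tool.name().to_string(), entry);
    }

    /// Linear scan; the result order is unspecified because HashMap is unordered.
    fn find_by_tag(&self, tag: &str) -> Vec<&ToolEntry> {
        self.entries
            .values()
            .filter(|e| e.tags.iter().any(|t| t == tag))
            .collect()
    }

    fn find_by_category(&self, category: &str) -> Vec<&ToolEntry> {
        self.entries
            .values()
            .filter(|e| e.category == category)
            .collect()
    }
}

fn main() {
    let mut registry = ToolRegistry::default();
    registry.register(Arc::new(NamedTool("memory_store")), &["write"], "memory");
    registry.register(Arc::new(NamedTool("memory_recall")), &["read"], "memory");
    registry.register(Arc::new(NamedTool("get_weather")), &["read"], "data");
    for entry in registry.find_by_tag("read") {
        println!("read-only tool: {}", entry.tool.name());
    }
}
```

A linear scan is probably sufficient for a few dozen tools; a secondary index from tag to tool names would only pay off for much larger registries.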
+
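+**C. Tool return modes: from a single `Value` to a `ToolOutput` enum**
+
+The current return type `Result<Value, ToolError>` can only express "a single complete return". Multi-mode output can be introduced later as needed: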
+
+```rust
+pub enum ToolOutput {
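+    /// Return the complete result in one shot
+    Final(Value),
+    /// Stream results incrementally through a channel
+    Streamed { initial: Value, rx: Receiver<Value> },
+    /// Requires further confirmation from the LLM before continuing
+    AwaitingInput { context: Value, prompt: String },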
+}
+```
+
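+| Return mode | Example scenario |
+|---------|---------|
+| `Final(Value)` | Weather lookup, file read |
+| `Streamed { initial, rx }` | Vector-search Top-100 returned in batches |
+| `AwaitingInput { context, prompt }` | "Suspicious transaction detected. Freeze it?" |
+
+#### Introduction timeline for each capability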
+
+```
+Phase 2 (current implementation)
+  ├─ BaseTool trait (Value→Value, but the signature reserves a Context parameter slot)
+  ├─ ToolRegistry (HashMap<String, ToolEntry> + tag/category filtering)
+  ├─ PermissionChecker / McpClient / ToolError
+  ├─ submit_with_tools() / submit_stream_with_tools()
+  └─ ToolContext { session_id, trace_id, cancellation_token }
+
+Phase 3 (memory as tools)
+  ├─ MemoryStore trait (extends BaseTool)
+  ├─ memory_store / memory_recall / memory_search etc. registered as tools
+  └─ ToolContext.progress support (retrieval results returned in batches)
+
+Phase 4 (Agent + Skill + orchestration)
+  ├─ ToolContext.shared_state support (passing intermediate results across steps)
+  ├─ ToolOutput enum support (if streaming/confirmation modes are needed)
+  ├─ ToolChain / ToolSelector tool-composition primitives
+  └─ Skill mechanism (multi-step orchestration + internal state)
+```
+
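+#### Design decisions identified but deferred
+
+| Decision | Why deferred | When needed |
+|------|---------|---------|
+| `ToolOutput` enum | `Value` is sufficient for every Phase 2 scenario (regular tools/MCP) | Phase 4 Agent orchestration or long-running tools |
+| Tool DAG scheduling | Complex orchestration is only needed once Agent scenarios exist | Phase 4 |
+| Skill mechanism | Requires practical experience with Agents using tools first | Phase 4 |
+| Persistent tool-call auditing | Simple logging via hook points can cover this at first | Phase 4 |
+| User authorization (runtime confirmation prompt) | `PermissionChecker` only evaluates static policy and does not handle runtime interaction. User authorization is an interaction flow and should be implemented by the upper UI/Agent layer as `ToolOutput::AwaitingInput` | Phase 4 |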
+
+---
+
 ## Implementation plan
 
 ### Step 1: Create the design document
@@ -622,6 +819,7 @@ tools/ module internal dependencies:
 
 - Create `src/tools/base.rs`
 - Define the `BaseTool` trait (name / description / parameters / required_permissions / execute)
+- Define the `ToolContext` struct (session_id / trace_id / cancellation_token), injected into `execute()` as the second parameter
 - Create the `src/tools.rs` module root, declare the submodules, and re-export the public API
 - Add `pub mod tools;` to `lib.rs`
 - Write one MockTool test tool and verify the trait implementation
@@ -630,8 +828,8 @@ tools/ module internal dependencies:
 ### Step 5: ToolRegistry
 
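 - Create `src/tools/registry.rs`
-- Define the `ToolInvocation` struct + `ToolRegistry`
-- Implement the core methods: register / get / list / definitions / invoke / invoke_all
+- Define the `ToolInvocation` struct + the `ToolEntry` metadata wrapper (tool + tags + category + stats) + `ToolRegistry`
+- Implement the core methods: register / get / list / definitions / invoke / invoke_all / find_by_tag / find_by_category
 - `invoke_all()` uses `futures::future::join_all` to execute mutually independent tools in parallel
 - `definitions()` converts the tools in the `HashMap` into a `Vec<ToolDefinition>`
 - Write 8+ tests covering: registration conflicts, lookup in an empty registry, single invocation, batched parallel invocation, tool execution failure
@@ -648,7 +846,8 @@ tools/ module internal dependencies:
 - Implement the `submit_stream_with_tools()` method:
   - Combine the streaming event stream with the automatic tool loop
   - Emit ToolExecutionCompleted after TurnComplete(ToolCalls)
-- Update the `CycleConfig` doc comments
+- Update the `CycleConfig` doc comments and add a `tool_timeout_secs` field with a default of 60
+- Change the default of `CycleConfig::max_turns` from `None` to `Some(10)`
 - Write 3+ integration tests: single-turn tool call, multi-turn tool calls, termination on reaching max_turns
 - Run `cargo test` to verify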
 
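@@ -661,6 +860,7 @@ tools/ module internal dependencies:
   - `list_tools()`: calls tools/list and caches the result
   - `call_tool()`: calls tools/call and parses the response
   - `close()`: sends a shutdown request and terminates the child process
+- The `StreamableHttp` transport is reserved as an enum variant that currently returns a "not implemented" error; it is not implemented in Phase 2
 - Implement `into_tools()`: converts MCP tools into `Vec<Arc<dyn BaseTool>>` adapters
 - Set a 30-second default timeout
 - Write MCP protocol message serialization/deserialization tests + integration tests with a mock child process
@@ -697,8 +897,8 @@ tools/ module internal dependencies: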
 |------|------|---------|
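 | MCP protocol specification changes | Medium | Implement only the minimal subset (initialize/list_tools/call_tool), encapsulated in `mcp.rs` for easy adaptation |
 | MCP child process exits unexpectedly | Medium | Implement a timeout mechanism + error recovery; automatically mark the client unavailable when the process exits |
-| Tool execution infinite loop (LLM calls tools repeatedly) | Medium | `max_turns` hard limit + detection of repeated-call patterns |
-| JSON-RPC message races (stdio duplex) | Low | Requests and responses are matched by the `id` field; a `Mutex` protects writes |
+| Tool execution infinite loop (LLM calls tools repeatedly) | Medium | `max_turns` hard limit; the loop terminates once the limit is reached |
+| JSON-RPC message races (stdio duplex) | Medium | Requests and responses are matched by the `id` field; a `Mutex` protects writes and a `HashMap<u64, OneshotSender>` tracks pending responses. Implementation complexity is higher than the interface sketch suggests |
 | Overly complex permission configuration | Low | PermissionConfig provides sensible defaults (allow Read/Network, deny Delete/Shell); simple scenarios need no customization |
 | Tool call argument type mismatch | Low | `execute()` takes a `Value` and each implementation validates it; structured errors are returned via `ToolError::InvalidArguments` |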
 
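@@ -725,3 +925,4 @@ tools/ module internal dependencies:
 17. `McpClient::into_tools()` can produce adapters that `ToolRegistry` can register
 18. All new public APIs have doc comments
 19. Test coverage: `cargo test` passes in full
+20. The `BaseTool::execute()` signature reserves an extension point through the `ToolContext` parameter (session_id, cancellation_token), so the future Skill/Agent layer can inject context without changing the trait signature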