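docs(llm): add implementation plan for LLM call lifecycle control

Adds `specs/llm-call-lifecycle.md`, which describes the goals, scope, module design, dependencies, and test plan for LLM call lifecycle control.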
2026-05-07 10:41:54 +08:00
parent 2b523a3e72
commit 9145b4d24f
1 changed files with 233 additions and 0 deletions
@@ -0,0 +1,233 @@
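+# LLM Call Lifecycle Control: Implementation Plan
+
+> Reference implementation: [HKUDS/OpenHarness](https://github.com/HKUDS/OpenHarness)
+
+## Goal
+
+Implement basic lifecycle control for large-model calls as a core low-level component of agcore.
+
+## Scope
+
+- Only OpenAI-compatible APIs are supported (`POST /v1/chat/completions`)
+- Non-streaming calls only (streaming can be added later)
+- Tool definitions can be passed in and tool_use responses are parsed, but there is **no automatic tool execution loop**
+- Control covers a single request-response cycle
+
+## Domain module layout
+
+All code related to the LLM call lifecycle lives under the `llm` domain directory; future features (tools, memory, prompts, and so on) will be organized the same way.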
+
+```
+src/
+  lib.rs                # crate root
+  llm.rs                # mod llm: domain root (declarations + re-exports)
+  llm/
+    types.rs            # llm::types: Message, ContentBlock, ChatRequest/Response, ToolDefinition
+    error.rs            # llm::error: LlmError
+    provider.rs         # llm::provider: LlmProvider trait (interface only)
+    provider/
+      openai.rs         # llm::provider::openai: OpenaiProvider implementation
+    cycle.rs            # llm::cycle: lifecycle engine (submodule root)
+    cycle/
+      retry.rs          # llm::cycle::retry: retry policy
+      usage.rs          # llm::cycle::usage: token usage
+
+# Examples of future domains (placeholders):
+# tools.rs + tools/        # tool calling, MCP
+# memory.rs + memory/      # memory system
+# prompt.rs + prompt/      # prompt engineering
+# agent.rs + agent/        # agent runtime
+```
+
+Root module declarations in `llm.rs`:
+
+```rust
+// llm.rs
+pub mod types;
+pub mod error;
+pub mod provider;
+pub mod cycle;
+```
+
+## Module design
+
+### 1. llm/types.rs: core data types
+
+```rust
+pub enum Role { User, Assistant, System, Tool }
+
+pub enum ContentBlock {
+    Text { text: String },
+    ToolUse { id: String, name: String, input: serde_json::Value },
+    ToolResult { tool_use_id: String, content: String, is_error: bool },
+}
+
+pub struct Message { pub role: Role, pub content: Vec<ContentBlock> }
+pub struct ToolDefinition { pub name: String, pub description: String, pub input_schema: serde_json::Value }
+
+pub struct ChatRequest {
+    pub model: String,
+    pub messages: Vec<Message>,
+    pub system_prompt: Option<String>,
+    pub tools: Vec<ToolDefinition>,
+    pub max_tokens: Option<u32>,
+    pub temperature: Option<f32>,
+}
+
+pub struct ChatResponse {
+    pub message: Message,
+    pub usage: Usage,
+    pub stop_reason: Option<StopReason>,
+}
+
+pub enum StopReason { Stop, ToolUse, MaxTokens, ContentFilter, Other(String) }
+```
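
An OpenAI-compatible response reports why generation stopped as a `finish_reason` string, so the provider needs some mapping onto `StopReason`. The sketch below is one possible mapping. The helper name `parse_finish_reason` is hypothetical and not part of the plan, and the derives on `StopReason` are added only so the example can be compared in tests.

```rust
/// Mirrors the plan's `StopReason`; the derives are added for this sketch only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StopReason {
    Stop,
    ToolUse,
    MaxTokens,
    ContentFilter,
    Other(String),
}

/// Hypothetical helper: maps an OpenAI-style `finish_reason` onto `StopReason`.
/// `None` models a response that omitted the field.
pub fn parse_finish_reason(raw: Option<&str>) -> Option<StopReason> {
    raw.map(|reason| match reason {
        "stop" => StopReason::Stop,
        // `function_call` is the legacy spelling of a tool call.
        "tool_calls" | "function_call" => StopReason::ToolUse,
        "length" => StopReason::MaxTokens,
        "content_filter" => StopReason::ContentFilter,
        // Unknown values are preserved instead of being dropped.
        other => StopReason::Other(other.to_string()),
    })
}
```

Keeping unknown values in `Other` lets compatible servers that emit non-standard reasons pass through without a parse failure.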
+
+### 2. llm/error.rs: error types
+
+```rust
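+use std::time::Duration;
+
+#[derive(Debug, thiserror::Error)]
+pub enum LlmError {
+    #[error("authentication failed: {0}")]
+    Authentication(String),
+    #[error("rate limited (retry after {retry_after:?})")]
+    RateLimit { retry_after: Option<Duration> },
+    #[error("request failed with status {status}: {body}")]
+    Request { status: u16, body: String },
+    #[error("timed out after {duration:?}")]
+    Timeout { duration: Duration },
+    #[error("stream error: {0}")]
+    Stream(String),
+    #[error("context length {actual} exceeds limit {limit}")]
+    ContextLength { actual: u32, limit: u32 },
+}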
+```
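
The plan does not say how HTTP failures map onto these variants. As a sketch under assumptions, the block below treats 401/403 as `Authentication`, 429 as `RateLimit`, and everything else as `Request`. The function `error_from_status` and its `retry_after_secs` parameter are hypothetical, and only the three variants involved are reproduced so the block compiles on its own.

```rust
use std::time::Duration;

/// Subset of the plan's `LlmError`, reproduced so the sketch is self-contained.
#[derive(Debug, PartialEq)]
pub enum LlmError {
    Authentication(String),
    RateLimit { retry_after: Option<Duration> },
    Request { status: u16, body: String },
}

/// Hypothetical helper: classifies a non-2xx HTTP response.
/// `retry_after_secs` is an already-parsed `Retry-After` header, if present.
pub fn error_from_status(status: u16, body: String, retry_after_secs: Option<u64>) -> LlmError {
    match status {
        // Assumed mapping: credential problems are never worth retrying.
        401 | 403 => LlmError::Authentication(body),
        429 => LlmError::RateLimit {
            retry_after: retry_after_secs.map(Duration::from_secs),
        },
        // Everything else keeps the raw status so callers can inspect it.
        _ => LlmError::Request { status, body },
    }
}
```

Keeping the status code in `Request` lets the retry layer decide about 5xx responses without another variant.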
+
+### 3. llm/provider.rs: provider interface
+
+The trait lives on its own; concrete implementations go in the `provider/` submodule.
+
+```rust
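+// llm/provider.rs
+use async_trait::async_trait;
+
+use super::error::LlmError;
+use super::types::{ChatRequest, ChatResponse};
+
+pub mod openai;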
+
+#[async_trait]
+pub trait LlmProvider: Send + Sync {
+    async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, LlmError>;
+}
+```
+
+#### 3.1 llm/provider/openai.rs: OpenAI-compatible implementation
+
+```rust
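+// llm/provider/openai.rs
+use async_trait::async_trait;
+
+use super::LlmProvider;
+use crate::llm::error::LlmError;
+use crate::llm::types::{ChatRequest, ChatResponse};
+
+pub struct OpenaiProvider {
+    http_client: reqwest::Client,
+    base_url: String,
+    api_key: String,
+    model: String,
+}
+
+// The impl needs the same #[async_trait] attribute as the trait definition.
+#[async_trait]
+impl LlmProvider for OpenaiProvider {
+    async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, LlmError> {
+        // POST {base_url}/chat/completions (assumes base_url already ends with /v1)
+        // Parse the response into a ChatResponse
+        todo!()
+    }
+}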
+```
+
+Implementations to add later: `provider/anthropic.rs`, `provider/azure.rs`, and so on.
+
+### 4. llm/cycle.rs: lifecycle engine
+
+```rust
+use super::provider::LlmProvider;
+use super::types::Message;
+
+mod retry;
+mod usage;
+
+pub use retry::RetryConfig;
+pub use usage::{CostTracker, Usage};
+
+pub struct CycleConfig {
+    pub model: String,
+    pub max_tokens: Option<u32>,
+    pub temperature: Option<f32>,
+    pub max_turns: Option<u32>,
+    pub retry: RetryConfig,
+}
+
+pub struct LlmCycle {
+    provider: Box<dyn LlmProvider>,
+    config: CycleConfig,
+    usage: CostTracker,
+    messages: Vec<Message>,
+    system_prompt: Option<String>,
+}
+```
+
+Full flow of `submit()`:
+
+```
+submit(prompt, tools)
+  │
+  ├─ ① push Message(user, [Text(prompt)])
+  ├─ ② build ChatRequest { model, messages, system_prompt, tools, max_tokens, temperature }
+  ├─ ③ [retry loop] provider.chat(request)
+  │     ├─ Ok  → parse ChatResponse
+  │     ├─ Err(retryable) → compute_delay → sleep → retry
+  │     └─ Err(non-retryable, or retries exhausted) → return Err
+  ├─ ④ push Message(assistant, [Text(...) | ToolUse(...)])
+  ├─ ⑤ usage.add(&response.usage)
+  └─ ⑥ return ChatResponse
+```
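
The flow above can be sketched with simplified, synchronous stand-ins. Everything in this block is illustrative and not the planned API: `Provider`, `Cycle`, and `FlakyProvider` are hypothetical names, messages carry plain text instead of `ContentBlock`s, and the sleep is a placeholder for the real backoff.

```rust
use std::time::Duration;

#[derive(Debug, Clone, PartialEq)]
pub enum Role { User, Assistant }

#[derive(Debug, Clone, PartialEq)]
pub struct Message { pub role: Role, pub text: String }

#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Usage { pub input_tokens: u32, pub output_tokens: u32 }

pub struct ChatResponse { pub message: Message, pub usage: Usage }

#[derive(Debug)]
pub enum LlmError { Timeout, Request { status: u16 } }

/// Synchronous stand-in for the async `LlmProvider` trait.
pub trait Provider {
    fn chat(&mut self, messages: &[Message]) -> Result<ChatResponse, LlmError>;
}

pub struct Cycle<P: Provider> {
    pub provider: P,
    pub max_retries: u32,
    pub messages: Vec<Message>,
    pub total: Usage,
}

fn is_retryable(err: &LlmError) -> bool {
    match err {
        LlmError::Timeout => true,
        LlmError::Request { status } => *status >= 500,
    }
}

impl<P: Provider> Cycle<P> {
    pub fn submit(&mut self, prompt: &str) -> Result<ChatResponse, LlmError> {
        // Step 1: record the user turn.
        self.messages.push(Message { role: Role::User, text: prompt.to_string() });
        // Steps 2-3: send the history, retrying retryable failures.
        let mut attempt: u32 = 0;
        let response = loop {
            match self.provider.chat(&self.messages) {
                Ok(resp) => break resp,
                Err(err) if is_retryable(&err) && attempt < self.max_retries => {
                    // Placeholder delay; the real engine would use RetryConfig.
                    std::thread::sleep(Duration::from_millis(1u64 << attempt.min(10)));
                    attempt += 1;
                }
                Err(err) => return Err(err),
            }
        };
        // Step 4: record the assistant turn.
        self.messages.push(response.message.clone());
        // Step 5: accumulate token usage.
        self.total.input_tokens += response.usage.input_tokens;
        self.total.output_tokens += response.usage.output_tokens;
        // Step 6: hand the response back to the caller.
        Ok(response)
    }
}

/// Test double that times out a fixed number of times, then echoes the prompt.
pub struct FlakyProvider { pub failures_left: u32 }

impl Provider for FlakyProvider {
    fn chat(&mut self, messages: &[Message]) -> Result<ChatResponse, LlmError> {
        if self.failures_left > 0 {
            self.failures_left -= 1;
            return Err(LlmError::Timeout);
        }
        let last = messages.last().map(|m| m.text.as_str()).unwrap_or("");
        Ok(ChatResponse {
            message: Message { role: Role::Assistant, text: format!("echo: {last}") },
            usage: Usage { input_tokens: 5, output_tokens: 7 },
        })
    }
}
```

In this sketch the user message stays in the history when `submit` returns an error. The plan leaves open whether a failed call should roll that message back.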
+
+#### 4.1 llm/cycle/retry.rs: retry policy
+
+```rust
+pub struct RetryConfig {
+    pub max_retries: u32,        // default: 3
+    pub base_delay: Duration,    // default: 1s
+    pub max_delay: Duration,     // default: 30s
+    pub jitter_factor: f64,      // default: 0.25
+}
+```
+
+Exponential backoff with jitter: `backoff = min(base_delay * 2^attempt, max_delay)`, then `delay = backoff + random(0, backoff * jitter_factor)`.
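
The backoff rule can be sketched with the standard library alone. This assumes `attempt` is zero-based. The `unit_random` parameter is a hypothetical stand-in for a random sample in `[0, 1)`, since the dependency list has no RNG crate; passing it in also makes the function deterministic in tests.

```rust
use std::time::Duration;

pub struct RetryConfig {
    pub max_retries: u32,
    pub base_delay: Duration,
    pub max_delay: Duration,
    pub jitter_factor: f64,
}

impl Default for RetryConfig {
    /// Defaults taken from the comments in the plan.
    fn default() -> Self {
        Self {
            max_retries: 3,
            base_delay: Duration::from_secs(1),
            max_delay: Duration::from_secs(30),
            jitter_factor: 0.25,
        }
    }
}

/// Computes the wait before retry number `attempt` (zero-based).
/// `unit_random` is a caller-supplied sample in [0, 1).
pub fn compute_delay(cfg: &RetryConfig, attempt: u32, unit_random: f64) -> Duration {
    // 2^attempt, saturating so that large attempt counts cannot overflow.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // Cap the exponential term before jitter is applied.
    let backoff = cfg.base_delay.saturating_mul(factor).min(cfg.max_delay);
    // Jitter only ever adds time, up to backoff * jitter_factor.
    let jitter = backoff.mul_f64(cfg.jitter_factor * unit_random.clamp(0.0, 1.0));
    backoff + jitter
}
```

With the defaults, attempts 0, 1, and 2 wait at least 1s, 2s, and 4s, and the pre-jitter delay never exceeds 30s.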
+
+Retryable errors: `RateLimit`, `Timeout`, and `Request` with a 5xx status.
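
One way to express this rule is a predicate on the error type. The method name `is_retryable` is an assumption, not part of the plan, and the enum is reproduced without `thiserror` so the sketch compiles with the standard library alone.

```rust
use std::time::Duration;

/// Same variants as the plan's `LlmError`, without the thiserror derive.
#[derive(Debug)]
pub enum LlmError {
    Authentication(String),
    RateLimit { retry_after: Option<Duration> },
    Request { status: u16, body: String },
    Timeout { duration: Duration },
    Stream(String),
    ContextLength { actual: u32, limit: u32 },
}

impl LlmError {
    /// Hypothetical predicate: true for the error classes the plan lists as retryable.
    pub fn is_retryable(&self) -> bool {
        match self {
            LlmError::RateLimit { .. } | LlmError::Timeout { .. } => true,
            // Only server-side failures are retried; 4xx responses are not.
            LlmError::Request { status, .. } => (500..=599).contains(status),
            _ => false,
        }
    }
}
```

Errors such as `Authentication` and `ContextLength` fail the same way on every attempt, so they fall through to `false`.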
+
+#### 4.2 llm/cycle/usage.rs: token usage
+
+```rust
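+#[derive(Debug, Clone, Default)]
+pub struct Usage { pub input_tokens: u32, pub output_tokens: u32 }
+
+#[derive(Debug, Default)]
+pub struct CostTracker { accumulated: Usage }
+
+impl CostTracker {
+    /// Adds one response's usage to the running total.
+    pub fn add(&mut self, usage: &Usage) {
+        self.accumulated.input_tokens += usage.input_tokens;
+        self.accumulated.output_tokens += usage.output_tokens;
+    }
+    /// Returns the usage accumulated since creation or the last reset.
+    pub fn total(&self) -> &Usage { &self.accumulated }
+    /// Clears the running total.
+    pub fn reset(&mut self) { self.accumulated = Usage::default(); }
+}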
+```
+
+## Dependencies
+
+```toml
+[dependencies]
+tokio = { version = "1", features = ["full"] }
+reqwest = { version = "0.12", features = ["json"] }
+serde = { version = "1", features = ["derive"] }
+serde_json = "1"
+thiserror = "2"
+async-trait = "0.1"
+tracing = "0.1"
+```
+
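+## Testing
+
+- Unit: serialization of types, retry backoff calculation, usage accumulation
+- Mock: an HTTP mock server to test provider requests, responses, and error handling
+- Integration (optional): verification against a real local Ollama instance
+
+## Future extensions
+
+- Streaming interface (`Stream<CycleEvent>`)
+- Automatic tool execution loop (see OpenHarness `run_query()`)
+- Multi-provider registration and discovery (see OpenHarness `ProviderRegistry`)
+- Context compression (auto-compaction)
+- Lifecycle hooks (pre/post tool use hooks)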