徐涛 7f5513adf3 docs(prompt): add the prompt engineering design document

2026-06-02 22:19:15 +08:00


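Phase 1: Prompt Engineering (Design Proposal)

Finalized: 2026-06-02

Background and Goals

AG Core Phase 0 (Foundation) completed all the infrastructure for the LLM call cycle. Phase 1 fills in the prompt engineering layer: composing, templating, and optimizing prompts so that they can directly serve Phase 2 (tool system) and Phase 4 (Agent runtime).

Goal: implement two core components, PromptTemplate (template engine) and PromptComposer (prompt composer), covering variable interpolation, conditional rendering, and assembly of multi-message sequences.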

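Requirements Analysis

Functional Requirements

Module	Requirement	Acceptance criterion
`PromptTemplate`	Supports variable interpolation `{{ var }}`	All variables are replaced correctly after rendering
`PromptTemplate`	Supports conditional rendering `{{#if var}}...{{/if}}`	The block renders when the variable is non-empty and not false
`PromptTemplate`	Supports list loops `{{#each list}}...{{/each}}`	Iterates over and renders the collection's elements
`PromptTemplate`	Supports nested template references `{{> template_name }}`	References a registered template fragment
`PromptTemplate`	Supports escaping (emit `{{ literal }}` verbatim)	Provides raw block syntax
`PromptComposer`	Assembles system/user/assistant message sequences by role	Outputs `Vec<OpenaiChatMessage>`
`PromptComposer`	Supports inserting a precompiled PromptTemplate	Static text and templates can be mixed in one composer
`PromptComposer`	Supports extending an existing message list	Accepts `Vec<OpenaiChatMessage>` as initial state
`PromptComposer`	Supports building multimodal ContentPart values (image/audio/file)	`user_content()`/`system_content()` and similar accept `OpenaiContentPart`
`PromptComposer`	Supports the Developer role (o1-series models)	`developer()` / `developer_template()` / `developer_content()`
`PromptComposer`	Supports the Tool role (returning tool execution results)	`tool()` / `tool_content()` accept a `tool_call_id`
`PromptComposer`	Supports setting the `name` field (to distinguish multiple actors)	The chained `with_name()` method sets the name on the previous message
`PromptComposer`	Supports validating the message sequence	`validate_messages()` checks ordering constraints and role alternation
`PromptTemplateRegistry`	Template registry: register by name, look up, load from file	Registers templates from strings/files and renders by name
`PromptTemplateRegistry`	Supports lazy compilation	`register_lazy()` stores the raw string and compiles on first render
`PromptTemplate`	Implements the `Display` trait	Outputs the original template string (for logging and debugging)
`TemplateContext`	Can be constructed from JSON	`from_json()` recursively converts `serde_json::Value` into `TemplateValue`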

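Non-functional Requirements

Every public API must carry a /// doc comment
No new unwrap() calls
No third-party template engine dependency (tera, askama, and similar crates are not used)
The template engine returns a structured error (PromptError) on failure
Integrates naturally with the existing OpenaiChatMessage / ChatRequest types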

Design

Module Layout

src/
  prompt.rs                # prompt module root: declares submodules + re-exports the public API
  prompt/
    template.rs            # PromptTemplate: template engine
    template/
      registry.rs          # PromptTemplateRegistry: template registry
    composer.rs            # PromptComposer: prompt composer
    error.rs               # PromptError: error type

Root module declarations in prompt.rs:

// prompt.rs
pub mod composer;
pub mod error;
pub mod template;

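pub use composer::PromptComposer;
pub use error::PromptError;
// TemplateContext is re-exported too, because the usage example imports it from `agcore::prompt`.
pub use template::{PromptTemplate, PromptTemplateRegistry, TemplateContext};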

Add to lib.rs:

 pub mod llm;
+pub mod prompt;

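1. Template Engine Choice: a Lightweight In-house Engine

Decision: do not use third-party template crates such as tera / askama / maud / minijinja.

Rationale:

Phase 1 template needs are very simple (variable interpolation + conditionals + list loops) and do not require the full power of Jinja2/Handlebars
No dependency means faster builds, no added attack surface, and zero version conflicts
About 50-80 lines of in-house core logic cover all requirements
The roadmap estimates 400 lines of code in total, so a template engine of roughly 60 lines is sufficient

Syntax design (a subset modeled on Handlebars):

{{ variable_name }}           → variable interpolation
{{#if var}}...{{/if}}         → conditional rendering (var exists and is non-empty)
{{#if var}}...{{else}}...{{/if}} → conditional with else branch
{{#each list}} {{item}} {{/each}} → list loop
{{#raw}} {{literal}} {{/raw}}  → raw block (template syntax inside is not parsed)
{{> template_name}}           → reference to a registered nested template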

2. PromptTemplate: Template Engine

// prompt/template.rs

use std::collections::HashMap;
use std::fmt;
use crate::prompt::error::PromptError;

/// Value type used in the rendering context.
pub enum TemplateValue {
    String(String),
    Bool(bool),
    Array(Vec<TemplateValue>),
    Object(HashMap<String, TemplateValue>),
}

/// Automatic conversions into `TemplateValue` (makes `ctx.insert("name", "Alice")` ergonomic).
impl From<String> for TemplateValue { ... }
impl From<&str> for TemplateValue { ... }
impl From<bool> for TemplateValue { ... }

/// Template variable context.
pub struct TemplateContext {
    vars: HashMap<String, TemplateValue>,
}

impl TemplateContext {
    pub fn new() -> Self;

    /// Inserts a variable (`&str` / `String` / `bool` are converted automatically).
    pub fn insert(&mut self, key: impl Into<String>, value: impl Into<TemplateValue>);

    pub fn get(&self, key: &str) -> Option<&TemplateValue>;

    /// Builds a context recursively from a `serde_json::Value` (nested Object/Array supported).
    pub fn from_json(value: &serde_json::Value) -> Result<Self, PromptError>;

    /// Builds a context from a `HashMap` (useful when loading from configuration).
    pub fn from_map(map: HashMap<String, TemplateValue>) -> Self;
}

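/// A precompiled template.
pub struct PromptTemplate {
    /// Original template string (for debugging).
    raw: String,
    /// Compiled AST fragments.
    fragments: Vec<Fragment>,
    /// Partials added via `register_partial()` and resolved by `{{> name }}`.
    partials: HashMap<String, PromptTemplate>,
}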

/// Compiled AST node (internal enum).
enum Fragment {
    Literal(String),
    Variable { name: String },
    If { condition: String, body: Vec<Fragment>, else_body: Vec<Fragment> },
    Each { variable: String, body: Vec<Fragment> },
    Raw(String),
    Include(String),
}

impl PromptTemplate {
    /// Compiles a template string.
    pub fn compile(template: &str) -> Result<Self, PromptError>;

    /// Renders the template with the given context.
    pub fn render(&self, ctx: &TemplateContext) -> Result<String, PromptError>;

    /// Registers a sub-template that this template can reference.
    pub fn register_partial(&mut self, name: &str, template: PromptTemplate);
}

/// `Display` outputs the original template string, for logging and debugging.
impl fmt::Display for PromptTemplate {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.raw)
    }
}
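The `From` conversions and the `impl Into<...>` parameters on `insert()` are what let callers write `ctx.insert("name", "Alice")`. A minimal sketch of that mechanism follows; it keeps only the `String` and `Bool` variants, and the `Default` derive stands in for `new()` as an assumption of this sketch rather than part of the design.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum TemplateValue {
    String(String),
    Bool(bool),
}

impl From<String> for TemplateValue {
    fn from(s: String) -> Self {
        TemplateValue::String(s)
    }
}

impl From<&str> for TemplateValue {
    fn from(s: &str) -> Self {
        TemplateValue::String(s.to_string())
    }
}

impl From<bool> for TemplateValue {
    fn from(b: bool) -> Self {
        TemplateValue::Bool(b)
    }
}

#[derive(Default)]
struct TemplateContext {
    vars: HashMap<String, TemplateValue>,
}

impl TemplateContext {
    // Both parameters are generic, so the caller never writes a conversion by hand.
    fn insert(&mut self, key: impl Into<String>, value: impl Into<TemplateValue>) {
        self.vars.insert(key.into(), value.into());
    }

    fn get(&self, key: &str) -> Option<&TemplateValue> {
        self.vars.get(key)
    }
}
```

With this in place, `ctx.insert("verbose", true)` and `ctx.insert(String::from("q"), String::from("why"))` both compile without explicit `.into()` calls.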

Compilation flow:

Scan the template string character by character
On {{, parse the directive type (variable/if/each/raw/include)
Build the list of Fragment AST nodes
Return a PromptTemplate holding raw and fragments
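The first stage of the compilation flow above can be sketched as a tokenizer that separates literal text from `{{ ... }}` tags. This is an illustrative sketch only: `Token` and `tokenize` are hypothetical names, it uses `str::find` instead of a literal character-by-character loop, and it does not yet treat `{{#raw}}` blocks specially or build the nested Fragment tree.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    /// Text outside any `{{ ... }}` pair.
    Literal(String),
    /// Trimmed content of a `{{ ... }}` pair, e.g. "name", "#if var", "/if".
    Tag(String),
}

fn tokenize(template: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut rest = template;

    while let Some(start) = rest.find("{{") {
        // Everything before the opening braces is literal text.
        if start > 0 {
            tokens.push(Token::Literal(rest[..start].to_string()));
        }
        let after_open = &rest[start + 2..];
        // An opening `{{` without a closing `}}` is a parse error.
        let end = after_open
            .find("}}")
            .ok_or_else(|| format!("unclosed tag near: {}", after_open))?;
        tokens.push(Token::Tag(after_open[..end].trim().to_string()));
        rest = &after_open[end + 2..];
    }

    // Trailing literal text after the last tag.
    if !rest.is_empty() {
        tokens.push(Token::Literal(rest.to_string()));
    }
    Ok(tokens)
}
```

A second pass would then fold `#if` / `/if` and `#each` / `/each` tag pairs into nested nodes using a stack, which is where unmatched closing tags can be detected.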

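Rendering flow:

Iterate over fragments
Literal → append as-is
Variable { name } → ctx.get(name) → append the string value (a missing variable yields PromptError::VariableNotFound)
If { condition, body, else_body } → check whether ctx.get(condition) exists and is truthy → render body or else_body recursively
Each { variable, body } → read ctx.get(variable) as an array (otherwise PromptError::NotAnArray) → bind each element to the item variable → render body recursively
Raw(text) → append verbatim (the {{ }} inside is not parsed)
Include(name) → look up the registered partials → render recursively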
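The rendering flow above can be sketched as a recursive walk over the fragment list. The sketch below is a simplified stand-in, not the planned implementation: `Value` replaces `TemplateValue`, a plain `HashMap` replaces `TemplateContext`, errors are plain strings, and `Raw`/`Include` are left out. It uses the truthiness rule stated in the risk table (missing, false, empty string, and empty array are all false). Where the plan formats arrays as JSON during interpolation, this sketch simply rejects them.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug)]
enum Value {
    Str(String),
    Bool(bool),
    List(Vec<Value>),
}

enum Fragment {
    Literal(String),
    Variable(String),
    If { condition: String, body: Vec<Fragment>, else_body: Vec<Fragment> },
    Each { variable: String, body: Vec<Fragment> },
}

/// Missing, false, empty string, and empty list all count as false.
fn is_truthy(value: Option<&Value>) -> bool {
    match value {
        None => false,
        Some(Value::Bool(b)) => *b,
        Some(Value::Str(s)) => !s.is_empty(),
        Some(Value::List(items)) => !items.is_empty(),
    }
}

fn render(fragments: &[Fragment], ctx: &HashMap<String, Value>) -> Result<String, String> {
    let mut out = String::new();
    for fragment in fragments {
        match fragment {
            Fragment::Literal(text) => out.push_str(text),
            Fragment::Variable(name) => match ctx.get(name) {
                Some(Value::Str(s)) => out.push_str(s),
                Some(Value::Bool(b)) => out.push_str(if *b { "true" } else { "false" }),
                // Simplification of this sketch: lists cannot be interpolated directly.
                Some(Value::List(_)) => return Err(format!("'{}' is a list", name)),
                None => return Err(format!("variable '{}' not found", name)),
            },
            Fragment::If { condition, body, else_body } => {
                let branch = if is_truthy(ctx.get(condition)) { body } else { else_body };
                out.push_str(&render(branch, ctx)?);
            }
            Fragment::Each { variable, body } => match ctx.get(variable) {
                Some(Value::List(items)) => {
                    for item in items {
                        // Each iteration sees the outer variables plus `item`.
                        let mut scoped = ctx.clone();
                        scoped.insert("item".to_string(), item.clone());
                        out.push_str(&render(body, &scoped)?);
                    }
                }
                _ => return Err(format!("'{}' is not an array", variable)),
            },
        }
    }
    Ok(out)
}
```

Cloning the context per iteration keeps the sketch short; a real implementation would more likely layer a small scope on top of the parent context to avoid the copy.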

3. PromptTemplateRegistry: Template Registry

Provides lightweight template management: register by name, look up, and load from file. It is intended for managing the prompt templates of multiple Agents.

// prompt/template/registry.rs

use std::collections::HashMap;
use std::path::Path;
use crate::prompt::error::PromptError;
use crate::prompt::template::{PromptTemplate, TemplateContext};

/// Internally stored template (supports lazy compilation).
enum StoredTemplate {
    Compiled(PromptTemplate),
    Raw(String),
}

/// Template registry: manages multiple template instances.
pub struct PromptTemplateRegistry {
    templates: HashMap<String, StoredTemplate>,
}

impl PromptTemplateRegistry {
    pub fn new() -> Self;

    /// Compiles a template string and registers it (eager compilation).
    pub fn register(&mut self, name: &str, template: &str) -> Result<(), PromptError>;

    /// Lazy registration: stores only the raw string and compiles on first render.
    /// Suited to cases with many templates that are not all used right away.
    pub fn register_lazy(&mut self, name: &str, template: &str);

    /// Reads a file, compiles it, and registers the result.
    pub fn register_file(&mut self, name: &str, path: &Path) -> Result<(), PromptError>;

    /// Returns a registered template (a lazily registered template is compiled here on first access).
    pub fn get(&mut self, name: &str) -> Result<&PromptTemplate, PromptError>;

    /// Renders by name (`{{> name }}` references are resolved automatically).
    pub fn render(&mut self, name: &str, ctx: &TemplateContext) -> Result<String, PromptError>;
}

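Design constraints:

No global singleton; users create and own the registry themselves
No filesystem watching, hot reloading, serialization, or other complex capabilities
Templates inside the registry can be used to resolve {{> partial_name }} sub-template references
Users can still hold standalone PromptTemplate instances; the registry is not mandatory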
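The lazy-compilation behavior of `register_lazy()` and `get()` can be sketched as an in-place promotion from `Raw` to `Compiled` on first access, which is also why `get()` takes `&mut self`. In this sketch `Compiled` and `compile` are hypothetical stand-ins for `PromptTemplate` and `PromptTemplate::compile()`, and the "parser" only checks that braces balance.

```rust
use std::collections::HashMap;

/// Stand-in for the real compiled template (hypothetical; only keeps the source).
#[derive(Debug)]
struct Compiled {
    source: String,
}

/// Toy check standing in for real parsing: `{{` and `}}` counts must match.
fn compile(src: &str) -> Result<Compiled, String> {
    if src.matches("{{").count() != src.matches("}}").count() {
        return Err(format!("unbalanced braces in: {}", src));
    }
    Ok(Compiled { source: src.to_string() })
}

enum Stored {
    Compiled(Compiled),
    Raw(String),
}

struct Registry {
    templates: HashMap<String, Stored>,
}

impl Registry {
    fn new() -> Self {
        Registry { templates: HashMap::new() }
    }

    /// Stores the raw string only; nothing is parsed yet.
    fn register_lazy(&mut self, name: &str, template: &str) {
        self.templates
            .insert(name.to_string(), Stored::Raw(template.to_string()));
    }

    /// Compiles on first access and replaces the stored entry in place.
    fn get(&mut self, name: &str) -> Result<&Compiled, String> {
        let slot = self
            .templates
            .get_mut(name)
            .ok_or_else(|| format!("template '{}' not registered", name))?;
        if let Stored::Raw(src) = &*slot {
            let compiled = compile(src)?;
            *slot = Stored::Compiled(compiled);
        }
        match &*slot {
            Stored::Compiled(compiled) => Ok(compiled),
            Stored::Raw(_) => Err("template should have been compiled".to_string()),
        }
    }
}
```

One consequence of this shape is that a syntax error in a lazily registered template surfaces at the first `get()` or `render()` call rather than at registration time.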

4. PromptComposer: Prompt Composer

// prompt/composer.rs

use crate::llm::types::message::{OpenaiChatMessage, OpenaiContentPart, ContentField};
use crate::prompt::error::PromptError;
use crate::prompt::template::{PromptTemplate, TemplateContext};

/// Prompt composer: builds a multi-role message sequence.
pub struct PromptComposer {
    messages: Vec<OpenaiChatMessage>,
    pending_name: Option<String>,
}

impl PromptComposer {
    /// Creates an empty composer.
    pub fn new() -> Self;

    /// Initializes the composer from an existing message list.
    pub fn from_messages(messages: Vec<OpenaiChatMessage>) -> Self;

    // ===== Plain-text messages =====

    /// Adds a plain-text system message.
    pub fn system(mut self, text: impl Into<String>) -> Self;

    /// Adds a plain-text user message.
    pub fn user(mut self, text: impl Into<String>) -> Self;

    /// Adds a plain-text assistant message.
    pub fn assistant(mut self, text: impl Into<String>) -> Self;

    /// Adds a plain-text developer message (used by o1-series models).
    pub fn developer(mut self, text: impl Into<String>) -> Self;

    /// Adds a Tool message (returns a tool execution result).
    pub fn tool(mut self, tool_call_id: impl Into<String>, content: impl Into<String>) -> Self;

    // ===== Template messages =====

    /// Renders the template with the context and adds the result as a user message.
    pub fn user_template(
        mut self,
        template: &PromptTemplate,
        ctx: &TemplateContext,
    ) -> Result<Self, PromptError>;

    /// Renders the template with the context and adds the result as a system message.
    pub fn system_template(
        mut self,
        template: &PromptTemplate,
        ctx: &TemplateContext,
    ) -> Result<Self, PromptError>;

    /// Renders the template with the context and adds the result as an assistant message.
    pub fn assistant_template(
        mut self,
        template: &PromptTemplate,
        ctx: &TemplateContext,
    ) -> Result<Self, PromptError>;

    /// Renders the template with the context and adds the result as a developer message.
    pub fn developer_template(
        mut self,
        template: &PromptTemplate,
        ctx: &TemplateContext,
    ) -> Result<Self, PromptError>;

    // ===== Multimodal ContentPart =====

    /// Adds a system message containing the given ContentPart.
    pub fn system_content(mut self, part: OpenaiContentPart) -> Self;

    /// Adds a user message containing the given ContentPart.
    pub fn user_content(mut self, part: OpenaiContentPart) -> Self;

    /// Adds an assistant message containing the given ContentPart.
    pub fn assistant_content(mut self, part: OpenaiContentPart) -> Self;

    /// Adds a developer message containing the given ContentPart.
    pub fn developer_content(mut self, part: OpenaiContentPart) -> Self;

    /// Adds a Tool message containing the given ContentPart.
    pub fn tool_content(mut self, tool_call_id: impl Into<String>, part: OpenaiContentPart) -> Self;

    /// Adds several ContentParts as one user message.
    pub fn user_contents(mut self, parts: Vec<OpenaiContentPart>) -> Self;

    /// Adds several ContentParts as one system message.
    pub fn system_contents(mut self, parts: Vec<OpenaiContentPart>) -> Self;

    /// Adds several ContentParts as one assistant message.
    pub fn assistant_contents(mut self, parts: Vec<OpenaiContentPart>) -> Self;

    /// Adds several ContentParts as one developer message.
    pub fn developer_contents(mut self, parts: Vec<OpenaiContentPart>) -> Self;

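    // ===== Role identification =====

    /// Sets the `name` field on the most recently added message
    /// (distinguishes entities that share a role in multi-agent systems).
    pub fn with_name(mut self, name: impl Into<String>) -> Self;

    // ===== Build =====

    /// Builds the final message list.
    pub fn build(self) -> Vec<OpenaiChatMessage>;

    /// Builds the messages and creates a ChatRequest directly (requires the model parameter).
    /// In the returned `OpenaiChatRequest`, fields such as `tools`, `temperature`, and `max_tokens` are all `None`;
    /// fill them in with struct update syntax: `OpenaiChatRequest { tools: Some(...), ..req }`.
    pub fn build_request(
        self,
        model: impl Into<String>,
    ) -> crate::llm::types::request::OpenaiChatRequest;
}

/// Validates that a message sequence meets the OpenAI API requirements.
/// Checks: a Tool message must directly follow its matching Assistant message, role alternation rules, and so on.
pub fn validate_messages(messages: &[OpenaiChatMessage]) -> Result<(), PromptError>;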

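Builder pattern design:

PromptComposer uses chained calls (builder pattern), consistent with the mainstream style of the Rust ecosystem
Each of system() / user() / assistant() / developer() / tool() returns Self, so calls can be chained
with_name() applies to the previous message; internally the name is held in pending_name: Option<String> and consumed when a message is pushed
build() returns Vec<OpenaiChatMessage>; build_request() creates a complete OpenaiChatRequest
ContentField convention: all plain-text messages (system() / user() / assistant() / developer()) uniformly use ContentField::Array(vec![OpenaiContentPart::Text{...}]), matching the standard format of OpenAI API non-streaming responses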

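validate_messages() rules:

A Tool message must follow an Assistant message that contains tool_calls
Two consecutive non-Tool messages with the same role are forbidden (system is exempt)
The message list must not be empty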
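The three rules above can be sketched as a single pass over the sequence. This is a simplified illustration under stated assumptions: `Msg` is a hypothetical stand-in that keeps only the role and a `has_tool_calls` flag, errors are plain strings instead of `PromptError::InvalidSequence`, several Tool messages in a row after one assistant tool call are accepted (to allow parallel tool calls), and matching of individual `tool_call_id` values is not checked.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
    Developer,
    Tool,
}

/// Hypothetical stand-in for OpenaiChatMessage.
struct Msg {
    role: Role,
    has_tool_calls: bool,
}

fn validate(messages: &[Msg]) -> Result<(), String> {
    // Rule 3: the list must not be empty.
    if messages.is_empty() {
        return Err("message list is empty".to_string());
    }

    let mut prev_role: Option<Role> = None;
    // True right after an assistant message with tool calls, and while its results follow.
    let mut tool_results_allowed = false;

    for (index, msg) in messages.iter().enumerate() {
        if msg.role == Role::Tool {
            // Rule 1: a tool result needs a preceding assistant tool call.
            if !tool_results_allowed {
                return Err(format!(
                    "message {}: tool result without a preceding assistant tool call",
                    index
                ));
            }
        } else {
            // Rule 2: no two consecutive same-role messages, system exempt.
            if prev_role == Some(msg.role) && msg.role != Role::System {
                return Err(format!("message {}: consecutive {:?} messages", index, msg.role));
            }
            tool_results_allowed = msg.role == Role::Assistant && msg.has_tool_calls;
        }
        prev_role = Some(msg.role);
    }
    Ok(())
}
```

Keeping the check a free function over a slice, as the design does, means it can validate sequences that were not produced by the composer, for example history loaded from storage.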

5. PromptError: Error Type

// prompt/error.rs

use thiserror::Error;

#[derive(Error, Debug)]
pub enum PromptError {
    #[error("template parse error: {0}")]
    Parse(String),

    #[error("render error: variable '{0}' not found")]
    VariableNotFound(String),

    #[error("render error: referenced partial '{0}' is not registered")]
    PartialNotFound(String),

    #[error("render error: '{0}' is not an array and cannot be iterated")]
    NotAnArray(String),

    #[error("render recursion exceeded the maximum depth ({0})")]
    MaxDepthReached(u8),

    #[error("render error: {0}")]
    Render(String),

    #[error("message sequence validation failed: {0}")]
    InvalidSequence(String),

    #[error("file read error: {0}")]
    Io(#[from] std::io::Error),
}
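For readers less familiar with thiserror, the derive above roughly corresponds to hand-written `Display`, `Error`, and `From` implementations. The std-only sketch below shows that correspondence for two of the variants; it is an approximation of what the derive provides, not its exact expansion, and `read_template` is a hypothetical helper showing how `#[from]` lets `?` convert an `io::Error`.

```rust
use std::fmt;
use std::io;

#[derive(Debug)]
enum PromptError {
    VariableNotFound(String),
    Io(io::Error),
}

// What `#[error("...")]` provides: a Display impl per variant.
impl fmt::Display for PromptError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PromptError::VariableNotFound(name) => {
                write!(f, "render error: variable '{}' not found", name)
            }
            PromptError::Io(err) => write!(f, "file read error: {}", err),
        }
    }
}

// What `#[from]` provides, part 1: the wrapped error is exposed as the source.
impl std::error::Error for PromptError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            PromptError::Io(err) => Some(err),
            _ => None,
        }
    }
}

// What `#[from]` provides, part 2: a From impl so `?` converts automatically.
impl From<io::Error> for PromptError {
    fn from(err: io::Error) -> Self {
        PromptError::Io(err)
    }
}

/// Hypothetical helper: the `?` below turns an io::Error into PromptError::Io.
fn read_template(path: &str) -> Result<String, PromptError> {
    Ok(std::fs::read_to_string(path)?)
}
```

This conversion is what would let a method like `register_file()` use `?` on file reads without any explicit error mapping.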

6. Usage Example

use agcore::prompt::{PromptComposer, PromptTemplate, TemplateContext};

// Compile the template
let tpl = PromptTemplate::compile(
    "You are a {{role}}. Please answer the following question:\n{{#if context}}Background: {{context}}\n{{/if}}Question: {{question}}"
)?;

// Build the context
let mut ctx = TemplateContext::new();
ctx.insert("role", "senior engineer");
ctx.insert("question", "What are Rust's ownership rules?");
ctx.insert("context", "The user has a Java background");

// Build the message sequence with the composer
let messages = PromptComposer::new()
    .system("You are a professional Rust assistant")
    .user_template(&tpl, &ctx)?
    .build();

// Optional: build a ChatRequest directly
let request = PromptComposer::new()
    .system("You are a translation assistant")
    .user("Hello, world!")
    .build_request("gpt-4o");

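7. Integration with LlmCycle

PromptComposer::build() outputs Vec<OpenaiChatMessage>, but LlmCycle.messages is a private field and cannot be assigned directly. LlmCycle therefore needs to be extended with a message injection entry point, so that the Composer can work together with LlmCycle's multi-turn conversation loop.

Approach: add 3 methods to LlmCycle:

// llm/cycle.rs

impl LlmCycle {
    /// Sets the message history directly (overwriting existing messages); supports builder chaining.
    pub fn with_messages(mut self, messages: Vec<Message>) -> Self;

    /// Appends messages to the end of the history.
    pub fn extend_messages(&mut self, messages: Vec<Message>);

    /// Submits prebuilt messages (skips the automatic push of a user prompt).
    /// Unlike submit(), it does not add user_text(prompt) automatically.
    pub async fn submit_messages(
        &mut self,
        messages: Vec<Message>,
        tools: Vec<ToolDefinition>,
    ) -> Result<ChatResponse, LlmError>;
}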

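Differences between submit_messages() and submit():

Aspect	`submit()`	`submit_messages()`
Input	`prompt: String` + `tools`	`messages: Vec<Message>` + `tools`
Internal behavior	Automatically pushes `user_text(prompt)`	Adds no messages automatically
Use case	Simple single-turn conversation	Multi-turn / prebuilt message sequences
system_prompt handling	Inserted automatically (if there is no System message)	Fully controlled by the caller

system_prompt conflict handling: submit_messages() disables LlmCycle's automatic system prompt insertion, so it does not duplicate System/Developer messages already built by PromptComposer. The caller has full control over the contents of the message sequence.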
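The difference between the two entry points can be illustrated with a toy model. Everything in the sketch below is hypothetical: `ToyCycle` and `Msg` are simplified stand-ins, the methods are synchronous, no provider is called, and each submit method just returns the sequence that would be sent. It only demonstrates which messages each path adds on its own.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Role {
    System,
    User,
}

#[derive(Clone, Debug, PartialEq)]
struct Msg {
    role: Role,
    text: String,
}

struct ToyCycle {
    system_prompt: Option<String>,
    messages: Vec<Msg>,
}

impl ToyCycle {
    fn new(system_prompt: Option<&str>) -> Self {
        ToyCycle {
            system_prompt: system_prompt.map(|s| s.to_string()),
            messages: Vec::new(),
        }
    }

    /// Replaces the history (builder style).
    fn with_messages(mut self, messages: Vec<Msg>) -> Self {
        self.messages = messages;
        self
    }

    /// submit()-like path: inserts the system prompt if none exists, then pushes the user prompt.
    fn submit(&mut self, prompt: &str) -> &[Msg] {
        let has_system = self.messages.iter().any(|m| m.role == Role::System);
        if !has_system {
            if let Some(system_prompt) = &self.system_prompt {
                self.messages.insert(
                    0,
                    Msg { role: Role::System, text: system_prompt.clone() },
                );
            }
        }
        self.messages.push(Msg { role: Role::User, text: prompt.to_string() });
        &self.messages
    }

    /// submit_messages()-like path: appends exactly what the caller passed, nothing more.
    fn submit_messages(&mut self, messages: Vec<Msg>) -> &[Msg] {
        self.messages.extend(messages);
        &self.messages
    }
}
```

In this model, a cycle configured with a system prompt sends two messages for `submit("hi")` but only the caller's own messages for `submit_messages(...)`, which is the behavior the table describes.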

Usage example:

let messages = PromptComposer::new()
    .system("You are a professional Rust assistant")
    .user_template(&query_tpl, &ctx)?
    .build();

let mut cycle = LlmCycle::new(provider, config)
    .with_messages(messages);

let resp = cycle.submit_messages(vec![], tools).await?;

PromptComposer::build_request() creates an OpenaiChatRequest directly, which suits cases that bypass LlmCycle and call LlmProvider directly.

Note: the PromptComposer module does not depend on LlmCycle directly (this avoids tight prompt → cycle coupling). All integration methods are implemented on the LlmCycle side, which keeps responsibilities separate.

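Implementation Plan

Step 1: Create the design document

Create docs/4-prompt-engineering.md (this document).

Step 2: PromptError

Create src/prompt/error.rs
Define the PromptError enum (Parse / VariableNotFound / PartialNotFound / NotAnArray / MaxDepthReached / Render / InvalidSequence / Io)

Step 3: PromptTemplate

Create src/prompt.rs (module root)
Create src/prompt/template.rs
Implement compilation (compile()): character-by-character scan → produce Vec<Fragment>
- Cases to handle: stack matching for nested #if, escaping of literal {{, detection of unclosed tags
Implement rendering (render()): recursive walk over Fragment → concatenate strings
- Define how non-string values are rendered: Bool→"true"/"false", Array→JSON, Object→JSON
- Add a recursion depth limit (16 levels) to guard against circular references
Supported features: variable interpolation, conditional rendering, list loops, raw blocks, sub-template references
Implement Display for PromptTemplate (outputs the original template string)
Write 20+ edge-case tests covering: nested if/each, unclosed tags, empty variables, each over an empty array, recursion depth exceeded
Run cargo test + cargo check to verify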

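Step 4: PromptTemplateRegistry

Recommended option: keep template as a single file and place PromptTemplateRegistry in the same file (about 40 lines do not justify a separate directory)
Internal storage uses the StoredTemplate enum (with the two states Compiled and Raw)
Implement: register() for eager compilation, register_lazy() for lazy compilation, register_file() for loading from a file
Implement get() / render() (a lazily registered template is compiled on first render)
Run cargo check to verify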

Step 5: PromptComposer

Create src/prompt/composer.rs
Implement the chained builder API, internally maintaining messages: Vec<OpenaiChatMessage> + pending_name: Option<String>
Role methods: system() / user() / assistant() / developer() / tool()
- Plain-text messages uniformly use ContentField::Array([Text])
- tool() takes a tool_call_id and content
Template methods: system_template() / user_template() / assistant_template() / developer_template()
Multimodal methods: *_content() (a single ContentPart) and *_contents() (a batch)
with_name(): sets the name field of the previous message
build() / build_request()
validate_messages(): a standalone pure function that checks Tool→Assistant ordering, role alternation, and non-emptiness
Run cargo check to verify

Step 6: LlmCycle extension

Add 3 methods to cycle.rs:
- with_messages(self, messages: Vec<Message>) -> Self: sets the message history in a chain
- extend_messages(&mut self, messages: Vec<Message>): appends messages
- submit_messages(&mut self, messages: Vec<Message>, tools: Vec<ToolDefinition>) -> Result<...>: submits prebuilt messages
submit_messages() disables automatic system_prompt insertion (to avoid duplicating the Composer's System/Developer messages)
Run cargo check to verify

Step 7: Register in lib.rs

Add pub mod prompt; to lib.rs
Run cargo check to verify

Step 8: Wrap-up

cargo clippy: no warnings
cargo build: full build
Check that every new public API has a /// doc comment

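Glossary

Term	Description
`TemplateValue`	Enum of value types used in the template rendering context
`TemplateContext`	Template variable context; holds all variables
`PromptTemplate`	Precompiled template; holds the list of AST fragments
`Fragment`	Compiled AST node (internal enum)
`PromptComposer`	Prompt composer; builds multi-role message sequences
`PromptError`	Error type specific to prompt engineering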

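Risk Assessment

Risk	Probability	Mitigation
Template syntax does not support complex scenarios (such as nested each)	Low	Not required at present; tera can be introduced later as a replacement
Parsing bugs in the custom template engine	Medium	Write unit tests covering every syntax branch
Conflict with a future Prompt Optimizer	Low	PromptOptimizer only modifies templates/contexts and does not change the template engine interface
Unclear condition semantics (what counts as a "false" value)	Low	Defined explicitly: None / false / empty string / empty array are all false
Circular sub-template references cause a stack overflow	Low	Add a recursion depth limit during rendering (16 levels), or detect repeats against the set of partials already entered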
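The recursion-depth mitigation in the last row of the table can be sketched by threading a depth counter through the include resolution. This is a minimal illustration: `Piece` and `render_partial` are hypothetical names, partials are reduced to text and include pieces, and errors are plain strings rather than `PromptError::MaxDepthReached`.

```rust
use std::collections::HashMap;

/// Maximum nesting of partial includes, matching the 16 levels named in the plan.
const MAX_DEPTH: u8 = 16;

enum Piece {
    Text(String),
    Include(String),
}

fn render_partial(
    name: &str,
    partials: &HashMap<String, Vec<Piece>>,
    depth: u8,
) -> Result<String, String> {
    // The top-level call uses depth 0, so depths 0..=15 are allowed (16 levels).
    if depth >= MAX_DEPTH {
        return Err(format!("max render depth ({}) exceeded", MAX_DEPTH));
    }
    let pieces = partials
        .get(name)
        .ok_or_else(|| format!("partial '{}' not registered", name))?;

    let mut out = String::new();
    for piece in pieces {
        match piece {
            Piece::Text(text) => out.push_str(text),
            // Each include goes one level deeper; a cycle eventually hits the limit.
            Piece::Include(child) => {
                out.push_str(&render_partial(child, partials, depth + 1)?)
            }
        }
    }
    Ok(out)
}
```

A depth counter turns a would-be stack overflow into an ordinary error, but it cannot tell a genuine cycle from legitimately deep nesting; tracking the set of partial names on the current path would report cycles precisely, at the cost of a little more bookkeeping.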

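Acceptance Criteria

cargo check compiles successfully
cargo clippy reports no warnings
Module file paths are correct: src/prompt.rs + src/prompt/{template,composer,error}.rs
PromptTemplate::compile() can parse templates containing variables/conditionals/loops
PromptTemplate::render() renders all syntax correctly
PromptTemplate implements the Display trait and outputs the original template string
TemplateContext provides the from_json() / from_map() constructors and supports automatic From<&str> conversion
PromptTemplateRegistry supports eager compilation (register()), lazy compilation (register_lazy()), and file loading (register_file())
PromptComposer supports chained calls and covers the five roles System / User / Assistant / Developer / Tool
PromptComposer supports the multimodal methods user_content() / system_content() / assistant_content() / developer_content() / tool_content()
PromptComposer supports with_name() for setting the name that identifies a message's sender
PromptComposer::build_request() can create an OpenaiChatRequest
validate_messages() can validate a message sequence
LlmCycle gains with_messages() / extend_messages() / submit_messages() to support Composer integration
lib.rs contains pub mod prompt;
Every new public API has a doc comment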
