(Deprecated) Query Engines + Pydantic Outputs

Tip

This guide references a deprecated method of extracting structured outputs in a RAG workflow. See our structured output starter guide for details.

Using index.as_query_engine() and its underlying RetrieverQueryEngine, we can support structured Pydantic outputs without an additional LLM call (in contrast to a typical output parser).

Every query engine supports integrated structured responses through the following response_mode values in RetrieverQueryEngine:

refine
compact
tree_summarize
accumulate (beta, requires extra parsing to convert to objects)
compact_accumulate (beta, requires extra parsing to convert to objects)
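The "extra parsing" noted for the accumulate modes could look roughly like the sketch below. It rests on assumptions that should be checked against the installed version: that the accumulated output is a single string, that the per-chunk answers are joined by the separator shown, and that each chunk carries a "Response N: " prefix followed by a JSON object. The helper parse_accumulated is hypothetical, and a stdlib dataclass stands in for the Pydantic model so the snippet runs without extra dependencies.

```python
import json
from dataclasses import dataclass
from typing import List

# Assumption: the separator placed between accumulated answers.
SEPARATOR = "\n---------------------\n"


@dataclass
class Biography:
    """Stand-in for the Pydantic model; same fields, stdlib only."""

    name: str
    best_known_for: List[str]
    extra_info: str


def parse_accumulated(text: str, separator: str = SEPARATOR) -> List[Biography]:
    """Hypothetical helper: split accumulated text and parse each JSON chunk."""
    objects = []
    for chunk in text.split(separator):
        # Skip any label before the JSON object (assumed "Response N: ").
        start = chunk.find("{")
        if start == -1:
            continue  # no JSON object in this chunk
        data = json.loads(chunk[start:].strip())
        objects.append(Biography(**data))
    return objects
```

With the real Pydantic model, the `Biography(**data)` step would typically be replaced by the model's own JSON validation method, which also checks field types.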

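Under the hood, this uses OpenAIPydanticProgram or LLMTextCompletionProgram, depending on which LLM you have set up. If there are intermediate LLM responses (for example during refine, or during tree_summarize with multiple LLM calls), the Pydantic object is injected into the next LLM prompt as a JSON object.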
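To make the JSON injection step concrete, here is a minimal sketch of how an intermediate structured answer might be serialized and placed into the next prompt. This is not the library's actual code: the template text, the build_refine_prompt helper, and the dataclass standing in for the Pydantic model are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Biography:
    """Stand-in for the Pydantic model; same fields, stdlib only."""

    name: str
    best_known_for: List[str]
    extra_info: str


# Hypothetical refine-style template; the real prompt wording differs.
REFINE_TEMPLATE = (
    "Existing answer (JSON):\n{existing_answer}\n\n"
    "New context:\n{context}\n\n"
    "Refine the existing answer using the new context and return JSON."
)


def build_refine_prompt(previous: Biography, context: str) -> str:
    """Serialize the intermediate object to JSON and inject it into the prompt."""
    existing_json = json.dumps(asdict(previous))
    return REFINE_TEMPLATE.format(existing_answer=existing_json, context=context)


intermediate = Biography(
    name="Paul Graham",
    best_known_for=["co-founding Viaweb"],
    extra_info="",
)
prompt = build_refine_prompt(intermediate, "He also created the language Arc.")
```

Because the previous answer travels as JSON, the next LLM call sees every field of the intermediate object and can update them individually instead of re-deriving the whole answer from free text.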

Usage Pattern

First, define the object you want to extract:

from typing import List
from pydantic import BaseModel


class Biography(BaseModel):
    """Data model for a biography."""

    name: str
    best_known_for: List[str]
    extra_info: str

Then create your query engine:

# `index` is assumed to be an index you have already built over your documents.
query_engine = index.as_query_engine(
    response_mode="tree_summarize", output_cls=Biography
)

Lastly, get a response and inspect the output:

response = query_engine.query("Who is Paul Graham?")

print(response.name)
# > 'Paul Graham'
print(response.best_known_for)
# > ['working on Bel', 'co-founding Viaweb', 'creating the programming language Arc']
print(response.extra_info)
# > "Paul Graham is a computer scientist, entrepreneur, and writer. He is best known for ..."

Modules

For detailed usage, see the following notebooks: