Refine with Structured Answer Filtering
When synthesizing a response with the Refine response synthesizer, it's crucial to filter out non-answers. A common problem is that an unhelpful response like "I don't know the answer" keeps propagating: once produced, it can persist through the synthesis process and end up as an equally useless final answer, even when a real answer exists in other, more relevant chunks.
You can filter out these unhelpful responses by setting the structured_answer_filtering parameter to True. It is set to False by default because the feature currently works best only with OpenAI models that support function calling.
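Conceptually, the filter works by having the LLM return a structured response that pairs each candidate answer with a flag indicating whether it actually satisfies the query; candidates flagged as unsatisfying are never allowed to overwrite a satisfying answer. Here is a minimal, hypothetical sketch of that idea (`StructuredAnswer` and `filter_refine` are illustrative names, not LlamaIndex APIs; the real `StructuredRefineResponse` objects appear in the verbose logs later in this notebook):

```python
from dataclasses import dataclass


@dataclass
class StructuredAnswer:
    """Illustrative stand-in for a structured refine response."""

    answer: str
    query_satisfied: bool


def filter_refine(candidates: list[StructuredAnswer]) -> str:
    """Keep the latest answer whose query_satisfied flag is True.

    Unsatisfying answers ("I don't know") are dropped instead of
    propagating through to the final response.
    """
    result = None
    for candidate in candidates:
        if candidate.query_satisfied:
            result = candidate.answer
    return result if result is not None else "Empty Response"


print(
    filter_refine(
        [
            StructuredAnswer("I don't know.", False),
            StructuredAnswer("Florence Pugh", True),
            StructuredAnswer("I don't know.", False),
        ]
    )
)  # -> Florence Pugh
```

The key design point is that the refine loop consults the flag, not the answer text, so the filter doesn't depend on pattern-matching phrases like "I don't know."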
If you're opening this Notebook on Colab, you will probably need to install LlamaIndex 🦙.
In [ ]:
%pip install llama-index-llms-openai
In [ ]:
!pip install llama-index
Load Data
In [ ]:
texts = [
    "The president in the year 2040 is John Cena.",
    "The president in the year 2050 is Florence Pugh.",
    'The president in the year 2060 is Dwayne "The Rock" Johnson.',
]
Summarize
In [ ]:
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
In [ ]:
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-3.5-turbo-0613")
In [ ]:
from llama_index.core import get_response_synthesizer
summarizer = get_response_synthesizer(
    response_mode="refine", llm=llm, verbose=True
)
In [ ]:
response = summarizer.get_response("who is president in the year 2050?", texts)
> Refine context: The president in the year 2050 is Florence Pugh.
> Refine context: The president in the year 2060 is Dwayne "The R...
Failed Result
As you can see, we failed to get the correct answer from the input texts strings because the initial "I don't know" response propagated all the way through to the end of response synthesis.
In [ ]:
print(response)
I'm sorry, but I don't have access to information about the future.
Now let's try again with structured_answer_filtering=True.
In [ ]:
from llama_index.core import get_response_synthesizer
summarizer = get_response_synthesizer(
    response_mode="refine",
    llm=llm,
    verbose=True,
    structured_answer_filtering=True,
)
In [ ]:
response = summarizer.get_response("who is president in the year 2050?", texts)
Function call: StructuredRefineResponse with args: {
    "answer": "It is not possible to determine who the president is in the year 2050 based on the given context information.",
    "query_satisfied": false
}
> Refine context: The president in the year 2050 is Florence Pugh.
Function call: StructuredRefineResponse with args: {
    "answer": "Florence Pugh",
    "query_satisfied": true
}
> Refine context: The president in the year 2060 is Dwayne "The R...
Function call: StructuredRefineResponse with args: {
    "answer": "Florence Pugh",
    "query_satisfied": false
}
Successful Result
As you can see, we were able to determine the correct answer from the given context by filtering the texts strings down to the ones that actually contained the answer to our query.
In [ ]:
print(response)
Florence Pugh
Non-Function-Calling LLMs
You may want to make use of this filtering with an LLM that doesn't offer a function calling API.
In that case, the Refine module will automatically switch to using a structured output Program that doesn't rely on an external function calling API.
In [ ]:
# we'll stick with OpenAI but use an older model that does not support function calling
instruct_llm = OpenAI(model="gpt-3.5-turbo-instruct")
In [ ]:
from llama_index.core import get_response_synthesizer
summarizer = get_response_synthesizer(
    response_mode="refine",
    llm=instruct_llm,
    verbose=True,
    structured_answer_filtering=True,
)
In [ ]:
response = summarizer.get_response("who is president in the year 2050?", texts)
print(response)
Florence Pugh
CompactAndRefine
Since CompactAndRefine is built on top of Refine, this response mode also supports structured answer filtering.
In [ ]:
from llama_index.core import get_response_synthesizer
summarizer = get_response_synthesizer(
    response_mode="compact",
    llm=instruct_llm,
    verbose=True,
    structured_answer_filtering=True,
)
In [ ]:
response = summarizer.get_response("who is president in the year 2050?", texts)
print(response)
Florence Pugh